NLPatVCU / medaCy

Medical Text Mining and Information Extraction with spaCy
GNU General Public License v3.0

Is it possible to enforce constraint on transition matrix of CRF? #212

Open lkqnaruto opened 2 years ago

lkqnaruto commented 2 years ago

Hi

Is it possible to enforce constraints on the CRF transition matrix? For example, under the BIO scheme an O -> I transition should never occur, so can the transition matrix be constrained to rule it out?

Thanks!
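To make the constraint in the question concrete, here is a minimal sketch of the BIO rule as a boolean mask. The helper names and the example tag set are hypothetical and belong to neither medaCy nor pytorch-crf; the sketch assumes tags are written as `O`, `B-<type>`, and `I-<type>`.

```python
# Hypothetical helpers for illustration only; not part of medaCy or pytorch-crf.

def bio_transition_mask(tags):
    """Return mask[i][j] == True when moving from tags[i] to tags[j] is valid BIO."""
    mask = []
    for prev in tags:
        row = []
        for nxt in tags:
            if nxt.startswith("I-"):
                # I-X may only follow B-X or I-X of the same entity type.
                entity = nxt[2:]
                row.append(prev in ("B-" + entity, "I-" + entity))
            else:
                # O and B-X may follow any tag.
                row.append(True)
        mask.append(row)
    return mask


def bio_start_mask(tags):
    """Return start[j] == True when a sequence may begin with tags[j]."""
    return [not tag.startswith("I-") for tag in tags]


# Example tag set (assumed for illustration).
tags = ["O", "B-Drug", "I-Drug", "B-Dose", "I-Dose"]
mask = bio_transition_mask(tags)
start = bio_start_mask(tags)
```

Rows index the tag being left and columns the tag being entered, so `mask[0][2]` answers whether O -> I-Drug is allowed. A mask like this says nothing about how a particular CRF implementation stores its scores; it only encodes which tag pairs the scheme permits.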

swfarnsworth commented 2 years ago

Hello, are you referring to the CRF learner itself, or to one of the CRF layers in the BiLSTM and BERT models?

If the former, we use sklearn's CRF implementation (see here), so it depends on what functionality that implementation supports.

lkqnaruto commented 2 years ago

I'm referring to the CRF layers in the BiLSTM and BERT models.

swfarnsworth commented 2 years ago

They both use the CRF implemented in pytorch-crf. Does it appear to natively support what you are trying to do?

lkqnaruto commented 2 years ago

I don't think it supports what I want to do. In the pytorch-crf package, the author initializes the transition matrix without any constraints (I think).

lkqnaruto commented 2 years ago

In the pytorch-crf transition matrix, does the row index represent the current state and the column index the next state, or is it the other way around? I'm very confused about that.
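The two conventions can be told apart by how a path score is computed. Below is a minimal sketch that assumes the "row = tag being left, column = tag being entered" convention. The pytorch-crf source appears to index its `transitions` parameter this way, but that is an assumption worth verifying against the installed version. The function `path_score` and the toy numbers are hypothetical.

```python
# Hypothetical illustration of the row = from, column = to convention.

def path_score(emissions, transitions, tag_ids):
    """Score one tag path, reading transitions[prev][nxt] as the prev -> nxt score."""
    score = emissions[0][tag_ids[0]]
    for t in range(1, len(tag_ids)):
        prev, nxt = tag_ids[t - 1], tag_ids[t]
        score += transitions[prev][nxt] + emissions[t][nxt]
    return score


# Two tags: 0 = "O", 1 = "I". Only the O -> I cell carries a penalty.
transitions = [[0.0, -5.0],
               [0.0, 0.0]]
emissions = [[0.0, 0.0],
             [0.0, 0.0]]

o_then_i = path_score(emissions, transitions, [0, 1])
i_then_o = path_score(emissions, transitions, [1, 0])
```

Under this convention only the O -> I path pays the penalty stored in row 0, column 1. If an implementation used the transposed convention, the same matrix would penalize I -> O instead, which is a quick way to check a library empirically.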

swfarnsworth commented 2 years ago

I am not sure. If you create or obtain a torch-compatible CRF implementation that does what you need, I may be able to discuss with you how to use it in medaCy, or at least point out where in the code base things would need to be changed.

lkqnaruto commented 2 years ago

Thank you. But the CRF layer in medaCy places no constraint on the transition matrix, right? If I want to enforce such a constraint, where should I modify the code?

swfarnsworth commented 2 years ago

None of the CRFs used in medaCy are implemented within medaCy, so the only changes one would make to medaCy code would be replacing the CRF implementations imported from its dependencies with one that does what is wanted.

In other words, you would probably have to modify the pytorch-crf code.

If you are able to get that far, please let me know and we can discuss how to swap that alternative CRF in for those used in medaCy.

lkqnaruto commented 2 years ago

Yes, I'm currently using medaCy for an NER task on my own dataset, but the results are not very good, and I see some cases like O -> I in the predictions. So I want to use a CRF with constraints to improve performance further. I think I'm going to modify the pytorch-crf code, but I'm not sure how to do it and I'm confused about the indexing. I hope you can help. Thank you in advance.
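One common way to rule out transitions such as O -> I is to overwrite the forbidden cells of the transition matrix with a large negative score before decoding. The sketch below shows the idea in plain Python with a hand-written Viterbi decoder. It is not pytorch-crf code: `constrain`, `viterbi`, the `NEG_INF` value, and the toy scores are all assumptions made for illustration, and the matrix is read as `transitions[prev][nxt]`.

```python
# Hypothetical sketch of constrained decoding; not medaCy or pytorch-crf code.

NEG_INF = -10000.0  # large negative score standing in for "forbidden"


def constrain(transitions, forbidden):
    """Copy transitions and set each forbidden (prev, nxt) cell to NEG_INF."""
    out = [row[:] for row in transitions]
    for prev, nxt in forbidden:
        out[prev][nxt] = NEG_INF
    return out


def viterbi(emissions, transitions, start):
    """Return the highest-scoring tag-id path for one sequence."""
    n = len(start)
    score = [start[j] + emissions[0][j] for j in range(n)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            # Best previous tag for landing on tag j at step t.
            best = max(range(n), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best)
            new_score.append(score[best] + transitions[best][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    # Walk the backpointers from the best final tag.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Tag ids: 0 = "O", 1 = "B", 2 = "I". Toy scores chosen so that the
# unconstrained decoder prefers the invalid path O -> I.
emissions = [[2.0, 0.0, 0.0],
             [0.0, 0.5, 2.0]]
free = [[0.0] * 3 for _ in range(3)]

unconstrained = viterbi(emissions, free, start=[0.0, 0.0, 0.0])
constrained = viterbi(
    emissions,
    constrain(free, forbidden=[(0, 2)]),  # forbid O -> I
    start=[0.0, 0.0, NEG_INF],            # forbid starting on I
)
```

With these toy scores the unconstrained path is O, I and the constrained path is O, B. Masking only at decoding time leaves the learned scores untouched. Applying the same mask during training would mean changing how the CRF layer computes its loss, which is the part that would have to be done inside pytorch-crf or a replacement for it.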