guillaumegenthial / tf_ner

Simple and Efficient Tensorflow implementations of NER models with tf.estimator and tf.data
Apache License 2.0

Label probabilities for the CRF layer #38

Open ashim95 opened 5 years ago

ashim95 commented 5 years ago

Hi, thanks for sharing this great implementation. I know it is possible to get per-token label probabilities from a CRF using the forward-backward algorithm, but I am having trouble modifying the default CRF implementation in TensorFlow to do this. For the partition function, it only uses the forward (message-passing) recursion. Do you have any experience with, or ideas about, how the forward-backward algorithm could be implemented in TF?
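To make it concrete, this is the kind of call I mean (a rough sketch against the TF 1.x `tf.contrib.crf` API; the tensors and shapes here are just placeholders):

```python
import tensorflow as tf

# Placeholder inputs: [batch, max_len, num_tags] unary (emission) scores
# and [num_tags, num_tags] transition scores.
unary_scores = tf.random_uniform([4, 10, 9])
sequence_lengths = tf.constant([10, 7, 9, 10])
transition_params = tf.random_uniform([9, 9])

# crf_log_norm runs only the forward recursion and returns log Z per
# sequence (shape [batch]); per-token marginals are never exposed.
log_norm = tf.contrib.crf.crf_log_norm(unary_scores, sequence_lengths,
                                       transition_params)
```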

guillaumegenthial commented 5 years ago

Hi @ashim95, this is a very good point. To compute the marginals (per-token tag probabilities) you would indeed need to modify the TensorFlow CRF implementation and use the forward-backward algorithm. I have not done it myself, so I cannot give you an estimate of how complex the task is, but it looks like the forward cell is already implemented in TensorFlow.

For reference, crfsuite appears to use the forward-backward approach to compute the marginals: https://github.com/chokkan/crfsuite/blob/dc5b6c7b726de90ca63cbf269e6476e18f1dd0d9/lib/crf/src/crf1d_context.c
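To give an idea of what this involves, here is a rough, untested sketch of forward-backward marginals for a single sequence using plain TensorFlow ops (the function name, the shapes, and the assumption of a static sequence length are mine, not anything from this repo or from `tf.contrib.crf`):

```python
import tensorflow as tf

def crf_marginals(unary_scores, transition_params):
    """Per-token tag marginals for one sequence via forward-backward.

    unary_scores: [seq_len, num_tags] emission scores (logits).
    transition_params: [num_tags, num_tags] CRF transition scores.
    Returns: [seq_len, num_tags] probabilities p(y_t = k | x).
    """
    seq_len = int(unary_scores.shape[0])  # static length assumed

    # Forward pass: log alpha[t, k] = log-sum over prefixes ending in tag k at t.
    alphas = [unary_scores[0]]
    for t in range(1, seq_len):
        prev = tf.expand_dims(alphas[-1], 1)  # [num_tags, 1]
        alphas.append(unary_scores[t] +
                      tf.reduce_logsumexp(prev + transition_params, axis=0))

    # Backward pass: log beta[t, k] = log-sum over suffixes following tag k at t.
    betas = [tf.zeros_like(unary_scores[0])]
    for t in range(seq_len - 2, -1, -1):
        nxt = tf.expand_dims(unary_scores[t + 1] + betas[0], 0)  # [1, num_tags]
        betas.insert(0, tf.reduce_logsumexp(transition_params + nxt, axis=1))

    log_alpha = tf.stack(alphas)  # [seq_len, num_tags]
    log_beta = tf.stack(betas)    # [seq_len, num_tags]

    # For each position, the per-tag scores sum to the partition function,
    # so a softmax over the tag axis yields the marginals.
    return tf.nn.softmax(log_alpha + log_beta)
```

A real implementation would still need batching, masking by sequence length, and replacing the Python loops with `tf.scan`, but the recursions are the same ones crfsuite runs.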

If you're up to it, we could collaborate and open a PR in Tensorflow adding this support.

Let me know.

agarwalishan commented 4 years ago

I have requested this feature on the TensorFlow issue tracker: https://github.com/tensorflow/tensorflow/issues/42178. Please comment there and upvote it so that they add this functionality.

ashim95 commented 4 years ago

Hi @agarwalishan, I would suggest looking into PyTorch-based CRF implementations; many of them output marginal probabilities. Even if you cannot find such an implementation on GitHub, it should be much easier to implement in PyTorch than in TensorFlow.

agarwalishan commented 4 years ago

Thanks for the reply, but I specifically need a TensorFlow implementation.