wellcometrust / deep_reference_parser

A deep learning model for extracting references from text
MIT License

Future of CRF layer #28

Open ivyleavedtoadflax opened 4 years ago

ivyleavedtoadflax commented 4 years ago

The CRF causes some problems, namely:

Replacing the CRF with some other output layer would ameliorate these problems. Note that it is already possible to remove it by specifying output="softmax" rather than "crf" when building the model with deep_reference_parser.build_model(). A softmax output will almost certainly perform worse than a CRF, though.
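To illustrate why a softmax output would likely do worse, here is a minimal pure-Python sketch (not the code used in deep_reference_parser). A softmax picks each token's tag independently, whereas a linear-chain CRF scores whole tag sequences using transition scores, so it can rule out invalid sequences. The tag names, emission scores, and transition scores below are made up for illustration; in a real CRF layer the transitions are learned.

```python
def argmax_decode(emissions):
    """Pick the best tag for each token independently, as a softmax output would."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for row in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(n_tags):
            # Best previous tag for ending at tag j on this token.
            best_prev = max(range(n_tags), key=lambda i: scores[i] + transitions[i][j])
            step_back.append(best_prev)
            step_scores.append(scores[best_prev] + transitions[best_prev][j] + row[j])
        scores = step_scores
        backpointers.append(step_back)
    # Follow the backpointers from the best final tag.
    tag = max(range(n_tags), key=scores.__getitem__)
    path = [tag]
    for step_back in reversed(backpointers):
        tag = step_back[tag]
        path.append(tag)
    return path[::-1]


# Hypothetical tag set and scores, chosen only for this illustration.
TAGS = ["O", "B-REF", "I-REF"]
FORBIDDEN = -1e4
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O: moving straight to I-REF is not allowed
    [0.0, 0.0, 0.0],        # from B-REF
    [0.0, 0.0, 0.0],        # from I-REF
]
EMISSIONS = [
    [2.0, 1.0, 0.0],  # token 0: clearly O
    [1.0, 0.9, 1.2],  # token 1: I-REF narrowly beats O and B-REF
    [0.0, 0.5, 2.0],  # token 2: clearly I-REF
]

softmax_path = [TAGS[i] for i in argmax_decode(EMISSIONS)]
crf_path = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(softmax_path)
print(crf_path)
```

With these toy scores, the per-token argmax produces O, I-REF, I-REF, which starts a reference with an inside tag, while Viterbi decoding produces O, B-REF, I-REF because the forbidden transition makes the first sequence score very badly overall.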

ivyleavedtoadflax commented 4 years ago

Looks like a CRF is available in tf 2.0: https://www.tensorflow.org/addons/api_docs/python/tfa/text/crf

ivyleavedtoadflax commented 4 years ago

Here's an example, but note that the CRF module is now in tfa.text.crf, not contrib: https://github.com/OpenNMT/OpenNMT-tf/blob/master/opennmt/models/sequence_tagger.py

ivyleavedtoadflax commented 4 years ago

Ahh, it is implemented for tf but not for tf.keras, though it looks like it could be coming: https://github.com/tensorflow/addons/pull/377#pullrequestreview-335963486

ivyleavedtoadflax commented 4 years ago

This has been merged: https://github.com/tensorflow/addons/pull/1999

chaalic commented 3 years ago

Hello, when using a CRF layer with a BiLSTM for an NER task, I get the following error in crf_loss:

crf, idx = y_pred._keras_history[:2]

AttributeError: 'Tensor' object has no attribute '_keras_history'

I get that it's a problem in the loss function, but I don't know how to get past it. Could you please help if you have found a solution?

ivyleavedtoadflax commented 3 years ago

Hi @chaalic is this code you are running outside of the deep reference parser?

chaalic commented 3 years ago

Yes, it's for another named entity recognition task, but the model I'm using is the same: a BiLSTM with a CRF.

ivyleavedtoadflax commented 3 years ago

Ah OK. If you post some more of your code here we may be able to spot something.