ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Conditional Random Field (CRF) Sequence Decoder #494

Open BeWe11 opened 5 years ago

BeWe11 commented 5 years ago

The current example for Named Entity Extraction is given as the following model definition:

input_features:
    -
        name: utterance
        type: text
        level: word
        encoder: rnn
        cell_type: lstm
        reduce_output: null
        preprocessing:
          word_format: space

output_features:
    -
        name: tag
        type: sequence
        decoder: tagger

This works ok, but if Ludwig had a CRF-Decoding layer, one could build state-of-the-art Bi-LSTM+CRF models (like https://arxiv.org/pdf/1508.01991.pdf) in a single simple Ludwig model definition. Any chance CRF decoding is coming to Ludwig any time soon?

w4nderlust commented 5 years ago

The difference between using a CRF and not using it is usually pretty small, and the added computational cost (quadratic in the number of labels) is not always worth it. That said, adding a CRF decoder as an option would definitely be useful and pretty straightforward. An example of how to do it using the CRF package from tf.contrib is here: https://github.com/guillaumegenthial/tf_ner/blob/master/models/lstm_crf/main.py#L94 Would you consider contributing it? I can help by pointing you to the right part of the codebase to modify; it should be pretty easy.
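To make the "quadratic in the number of labels" point concrete, here is a minimal pure-Python sketch of Viterbi decoding, the inference step a CRF layer adds on top of per-token tag scores. It is not Ludwig or TensorFlow code; the function name `viterbi_decode` and the list-of-lists inputs are assumptions chosen for illustration only.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for a single sequence.

    emissions:   T rows of K per-tag scores (e.g. the tagger's logits)
    transitions: K x K matrix, transitions[i][j] = score of tag i -> tag j
    Both names and shapes are illustrative assumptions.
    """
    num_tags = len(transitions)
    # scores[j] = best score of any path that ends in tag j at the current step
    scores = list(emissions[0])
    backpointers = []

    for step_emissions in emissions[1:]:
        new_scores, step_back = [], []
        for j in range(num_tags):              # K target tags ...
            best_i = max(                      # ... times K source tags: the K^2 term
                range(num_tags),
                key=lambda i: scores[i] + transitions[i][j],
            )
            new_scores.append(
                scores[best_i] + transitions[best_i][j] + step_emissions[j]
            )
            step_back.append(best_i)
        scores = new_scores
        backpointers.append(step_back)

    # Walk the back-pointers from the best final tag to recover the path.
    last = max(range(num_tags), key=lambda j: scores[j])
    best_score = scores[last]
    path = [last]
    for step_back in reversed(backpointers):
        last = step_back[last]
        path.append(last)
    path.reverse()
    return path, best_score


emissions = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Strongly penalise switching tags between neighbouring tokens.
transitions = [[0.0, -5.0], [-5.0, 0.0]]
print(viterbi_decode(emissions, transitions))
```

With all-zero transitions the result reduces to a per-token argmax, which is what a plain tagger decoder does; the transition scores are what let the CRF prefer a globally consistent tag sequence, at a cost of roughly T * K^2 operations per sequence.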

kvthr commented 5 years ago

@w4nderlust @BeWe11 I can help in adding the CRF functionality to the decoder layer.

jenishah commented 4 years ago

@w4nderlust I am interested in adding this feature. Can you please guide me towards the right files where we need to make changes?

Thanks!

jenishah commented 4 years ago

Add a class CRFTagger in sequence_decoders?

w4nderlust commented 4 years ago

Add a class CRFTagger in sequence_decoders?

Yes, that would be great. Keep in mind that we are planning to move to TF2 soon: for now you can use the tf.contrib package, but make sure the code can later be ported to the TensorFlow Addons package.

jenishah commented 4 years ago

Working on it

jenishah commented 4 years ago

I think we do not need to calculate the loss using tf.nn.sampled_softmax_loss. In that case we do not have to find a way to get class_weights and class_biases.

Please let me know if I am wrong.

jenishah commented 4 years ago

Also, should I write a different loss function?

w4nderlust commented 4 years ago

You should probably use tf.contrib.crf.crf_log_likelihood. You can use this repository as a reference implementation: https://github.com/guillaumegenthial/tf_ner
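For readers unfamiliar with what a CRF log-likelihood computes, the sketch below spells out the quantity in plain Python for a single sequence: the score of the gold tag path minus the log of the sum over all paths (the forward algorithm). It is an illustration under stated assumptions, not the TensorFlow implementation; `sequence_log_likelihood` and its helpers are hypothetical names, and batching and sequence-length masking are left out.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(emissions, transitions, tags):
    """Unnormalised score of one tag path: emission plus transition scores."""
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score


def log_partition(emissions, transitions):
    """Log of the summed exp-scores of all K^T paths, via the forward algorithm."""
    num_tags = len(transitions)
    alphas = list(emissions[0])
    for step_emissions in emissions[1:]:
        alphas = [
            logsumexp([alphas[i] + transitions[i][j] for i in range(num_tags)])
            + step_emissions[j]
            for j in range(num_tags)
        ]
    return logsumexp(alphas)


def sequence_log_likelihood(emissions, transitions, tags):
    """log p(tags | emissions) under a linear-chain CRF (hypothetical helper)."""
    return path_score(emissions, transitions, tags) - log_partition(
        emissions, transitions
    )


emissions = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
transitions = [[0.5, -1.0], [-1.0, 0.5]]
gold_tags = [0, 0, 0]

# The training loss would be the negative log-likelihood of the gold tags.
loss = -sequence_log_likelihood(emissions, transitions, gold_tags)
print(loss)
```

In the referenced tf_ner code the same idea appears as the negated, batch-averaged output of the contrib function, so this replaces the softmax-based sequence loss rather than being added on top of it.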