BeWe11 opened this issue 5 years ago
The difference between using a CRF and not using one is pretty small, and the added computational cost (quadratic in the number of labels) is not always worth it.
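To make the quadratic-cost claim concrete, here is a minimal plain-Python sketch of Viterbi decoding for a linear-chain CRF (an illustration only, not Ludwig or tf.contrib code): the nested loop over previous and current labels touches all K² transition scores at every timestep, which is where the quadratic factor comes from.

```python
# Minimal Viterbi decoder for a linear-chain CRF (illustration only).
# emissions[t][k]: score of label k at step t; transitions[i][j]: score of i -> j.
# The nested loops over previous/current labels make each step O(K^2) in K labels.

def viterbi_decode(emissions, transitions):
    T, K = len(emissions), len(emissions[0])
    score = list(emissions[0])          # best score of a path ending in each label at step 0
    backptr = []
    for t in range(1, T):
        new_score, ptrs = [], []
        for j in range(K):              # current label
            best_i = max(range(K), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score, backptr = new_score, backptr + [ptrs]
    # follow back-pointers from the best final label
    best = max(range(K), key=lambda k: score[k])
    path = [best]
    for ptrs in reversed(backptr):
        best = ptrs[best]
        path.append(best)
    return path[::-1]
```

With strongly negative self-transition scores the decoder prefers alternating labels even when the emissions alone would not, which is exactly the kind of label-sequence constraint a CRF adds over per-step argmax.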
That said, adding a CRF decoder would be definitely useful as an option and pretty straightforward.
An example of how to do it using the CRF package from tf.contrib is here:
https://github.com/guillaumegenthial/tf_ner/blob/master/models/lstm_crf/main.py#L94
Would you consider contributing it? I can help by pointing you to the right parts of the codebase to modify; it should be pretty easy.
@w4nderlust @BeWe11 I can help in adding the CRF functionality to the decoder layer.
@w4nderlust I am interested in adding this feature. Can you please guide me towards the right files where we need to make changes?
Thanks!
Should I add a CRFTagger class in sequence_decoders?
Yes, that would be great. Consider that we are planning to move to TF2 soon, so at the moment you can use the tf.contrib package, but make sure that you'll be able to port it to the TensorFlow Addons package.
Working on it
I think we do not need to calculate the loss using tf.nn.sampled_softmax_loss. In that case we do not have to find a way to get class_weights and class_biases.
Please let me know if I am wrong.
Also, should I write a different loss function?
You should probably use tf.contrib.crf.crf_log_likelihood. You can refer to this as a reference implementation: https://github.com/guillaumegenthial/tf_ner
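For readers unfamiliar with what crf_log_likelihood computes: it returns the log-probability of the gold tag sequence under a linear-chain CRF, i.e. the gold path score minus the log-partition function computed with the forward algorithm. Here is a minimal single-sequence sketch in plain Python (no batching or masking, unlike the TensorFlow op), just to show the math the op implements:

```python
import math
from itertools import product

# Plain-Python sketch of what crf_log_likelihood computes for one sequence:
# log p(tags | emissions) = score(gold path) - log Z, where log Z is the
# log-partition function obtained with the forward algorithm.

def crf_log_likelihood(emissions, tags, transitions):
    # score of the gold path: emissions along the path plus transitions between tags
    gold = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        gold += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    # forward algorithm: alpha[j] is the log-sum of scores of all paths
    # ending in label j at the current step
    K = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            math.log(sum(math.exp(alpha[i] + transitions[i][j]) for i in range(K)))
            + emissions[t][j]
            for j in range(K)
        ]
    log_z = math.log(sum(math.exp(a) for a in alpha))
    return gold - log_z
```

A useful sanity check is that exponentiating the log-likelihoods of all possible tag sequences sums to 1, since the CRF defines a proper distribution over paths.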
The current example for Named Entity Recognition is given as the following model definition:
This works ok, but if Ludwig had a CRF decoding layer, one could build state-of-the-art Bi-LSTM+CRF models (like https://arxiv.org/pdf/1508.01991.pdf) in a single, simple Ludwig model definition. Any chance CRF decoding is coming to Ludwig any time soon?
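As a sketch of what that could look like: the crf option below is hypothetical, an illustration of the requested feature rather than an existing Ludwig parameter, while the remaining fields follow the usual Ludwig model-definition layout for a bidirectional LSTM encoder with a sequence tagger output.

```yaml
input_features:
  - name: utterance
    type: text
    level: word
    encoder: rnn
    cell_type: lstm
    bidirectional: true

output_features:
  - name: tag
    type: sequence
    decoder: tagger
    crf: true   # hypothetical flag: enable CRF decoding instead of per-step softmax
```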