luozhouyang / keras-crf

A more elegant and convenient CRF built on tensorflow-addons.
Apache License 2.0

Label encoded y_true but one-hot encoded y_pred #4

Closed veddandekar closed 3 years ago

veddandekar commented 3 years ago

First, thank you for this CRF implementation. It has helped a lot with my project! I am a little confused about whether the model expects one-hot encoded labels during training, and whether it outputs class labels directly.

Specifically, each of my inputs is an array of 9 elements. The output labels I pass to the model are label encoded (either 0 or 1), one for each element of the input array.

e.g.:

input = [23, 43, 34, 67, 34, 76, 65, 234, 124]

labels = [1, 0, 0, 0, 1, 0, 1, 0, 1]

Here, the labels are not one-hot encoded, meaning labels[i] corresponds to input[i].

X shape = (x, 9)

y_true shape = (x, 9)

Training completes successfully with the above. However, during prediction, my outputs seem to be one-hot encoded, i.e. y_pred shape = (x, 9, 2).

Is taking label-encoded y_true during training but predicting one-hot encoded labels the expected behaviour, or have I misunderstood something?

luozhouyang commented 3 years ago

The CRF model's outputs are a little complex. When training is True, the CRF outputs potentials, whose shape is [batch_size, seq_len, num_tags]. When training is False, the CRF outputs a decoded_sequence, whose shape is [batch_size, seq_len], but I one-hot encode the decoded_sequence so that the output shape stays consistent between the two modes. The ground truth labels in the training data, however, do not need to be one-hot encoded.

luozhouyang commented 3 years ago

In prediction mode, you need to apply argmax over the last axis of the one-hot encoded decoded sequence to get the predicted label sequence.
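
A minimal sketch of that argmax step, using NumPy only. The `y_pred` array here is a hand-built stand-in for the prediction-mode output described above (it is an assumption for illustration, not actual keras-crf output), built from the example labels in the question:

```python
import numpy as np

# Hypothetical decoded tag ids for one sequence of length 9: shape (1, 9).
decoded = np.array([[1, 0, 0, 0, 1, 0, 1, 0, 1]])
num_tags = 2

# Stand-in for the one-hot encoded prediction output: shape (1, 9, 2).
y_pred = np.eye(num_tags, dtype=np.int32)[decoded]

# argmax over the last (tag) axis recovers label-encoded tags: shape (1, 9).
labels = np.argmax(y_pred, axis=-1)

print(y_pred.shape, labels.shape)
print(labels[0].tolist())
```

With a real model, `y_pred` would presumably come from the model's predict call, and the same `np.argmax(y_pred, axis=-1)` would give an array with the same shape as y_true.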