kmkurn / pytorch-crf

(Linear-chain) Conditional random field in PyTorch.
https://pytorch-crf.readthedocs.io
MIT License

Same token is predicted at each step during decoding #64

Closed xashru closed 4 years ago

xashru commented 4 years ago

Hi, I am using pytorch-crf for a token prediction task with an LSTM network. When I use a fully connected layer after the LSTM, it works fine.

x, _ = self.lstm(...)    # per-token hidden states
x = self.linear(x)       # per-token class logits

This is trained using nn.CrossEntropyLoss in PyTorch.
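
For context, a minimal sketch of what such a baseline might look like. The sizes (input_size=50, hidden_size=64, num_classes=5) and the class name BaselineTagger are made up, since the post does not give them; the key point is that CrossEntropyLoss expects flattened (N, C) logits and (N,) targets:

import torch
import torch.nn as nn

class BaselineTagger(nn.Module):
    """LSTM + linear layer trained with CrossEntropyLoss (no CRF)."""
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, num_classes)

    def forward(self, inputs):
        x, _ = self.lstm(inputs)             # (batch, seq_len, hidden_size)
        return self.linear(x)                # (batch, seq_len, num_classes)

model = BaselineTagger(input_size=50, hidden_size=64, num_classes=5)
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 20, 50)              # dummy batch
targets = torch.randint(0, 5, (8, 20))       # dummy per-token labels
logits = model(inputs)
# flatten batch and sequence dimensions for CrossEntropyLoss
loss = criterion(logits.reshape(-1, 5), targets.reshape(-1))
loss.backward()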

Now, I want to add a CRF layer for a sequence prediction task.

x, _ = self.lstm(...)
x = self.linear(x)                                             # per-token emission scores
crf_out = self.crf(x, y, mask=masks, reduction='token_mean')   # CRF log-likelihood

The negated value, -crf_out, is used as the loss to train the network. Decoding is done with dec_out = self.crf.decode(x, masks). However, this only ever predicts one category (the one with the maximum occurrence in the data). Perhaps I should mention that the dataset is heavily imbalanced: one target token makes up 85% of all tokens. The loss does decrease during training.
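
For reference, a minimal sketch of how an LSTM is typically wired to pytorch-crf, assuming batch-first tensors and made-up sizes (the CrfTagger class and its dimensions are illustrative, not from the post). CRF.forward returns the log-likelihood, so its negation is used as the loss, and decode returns one best tag sequence per example:

import torch
import torch.nn as nn
from torchcrf import CRF

class CrfTagger(nn.Module):
    def __init__(self, input_size, hidden_size, num_tags):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def emissions(self, inputs):
        x, _ = self.lstm(inputs)
        return self.linear(x)                # per-token emission scores

    def loss(self, inputs, tags, mask):
        # CRF.forward returns the log-likelihood; negate it to get a loss
        return -self.crf(self.emissions(inputs), tags, mask=mask,
                         reduction='token_mean')

    def decode(self, inputs, mask):
        # Viterbi decoding; returns a list of tag sequences, one per example
        return self.crf.decode(self.emissions(inputs), mask=mask)

model = CrfTagger(input_size=50, hidden_size=64, num_tags=5)
inputs = torch.randn(8, 20, 50)
tags = torch.randint(0, 5, (8, 20))
mask = torch.ones(8, 20, dtype=torch.uint8)  # all positions valid in this dummy batch
loss = model.loss(inputs, tags, mask)
loss.backward()
pred = model.decode(inputs, mask)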

xashru commented 4 years ago

This was caused by a bug in my code when calculating the result. This issue can be deleted.
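
The post does not say what the bug was, but one common pitfall when scoring CRF output is comparing the variable-length lists returned by decode() against padded label tensors without applying the mask. A hedged sketch of one way to compute token accuracy, reusing the hypothetical pred, tags, and mask from the snippet above:

# pred: list of lists from crf.decode(); tags, mask: padded (batch, seq_len) tensors
correct = 0
total = 0
for seq_pred, seq_tags, seq_mask in zip(pred, tags, mask):
    length = int(seq_mask.sum())             # number of real (unpadded) tokens
    gold = seq_tags[:length].tolist()
    correct += sum(p == g for p, g in zip(seq_pred, gold))
    total += length
accuracy = correct / total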

MoxinC commented 3 years ago

Hi! I also encountered a similar issue to the one you mentioned. Could you please share more details on how you fixed the bug? Thanks a lot.