kmkurn / pytorch-crf

(Linear-chain) Conditional random field in PyTorch.
https://pytorch-crf.readthedocs.io
MIT License

The f1 score only increases when the batch size =1 #45

Closed Youarerare closed 5 years ago

Youarerare commented 5 years ago

Hi, I have run into the same problem as issue #40. I use a BiLSTM+CRF model for NER. The loss decreases, but the F1 score stays at 0.12, and the outputs of the CRF layer are almost all O (one of the NER labels). When I changed my batch size from 8 to 1, the F1 score rose to 0.91.

My model now works well with a batch size of 1, but I am not sure whether there is a problem with the CRF loss function. Can you give me some help?

Youarerare commented 5 years ago

I found the cause of the problem: I had not handled the transpose of the matrix. Your code is correct.

Youarerare commented 4 years ago

I found the cause of the problem: I had not handled the transpose of the matrix. Your code is correct.

Hello, I have run into the same problem as you. Could you please describe your "transpose" problem in detail so that I can check my own code? Thank you!

Hello. At that time, my gold labels were a (batch_size, seq_len) matrix, while my predictions were a (seq_len, batch_size) matrix. The network was in fact training correctly, but the labels and predictions did not line up when I computed the accuracy or F1 (I flattened both matrices into one-dimensional vectors to compute the accuracy). That is why the score could increase with batch size 1: with a single sequence, both layouts flatten to the same order. Check the shapes of your labels and your predictions.
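
The mismatch described above can be reproduced with a small sketch. It uses plain Python lists instead of tensors so it runs without any dependencies. The label values and the helper names (`flatten`, `accuracy`, `transpose`) are made up for illustration and are not taken from the original code. With PyTorch tensors, the equivalent fix would be something like `pred.transpose(0, 1)` before flattening.

```python
from itertools import chain


def flatten(matrix):
    """Flatten a list of rows into a single list, row by row."""
    return list(chain.from_iterable(matrix))


def accuracy(gold, pred):
    """Fraction of positions where two flat label lists agree."""
    assert len(gold) == len(pred)
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def transpose(matrix):
    """Swap the two axes of a rectangular list of lists."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical gold labels, shape (batch_size=2, seq_len=3).
gold = [["B", "I", "O"],
        ["O", "O", "B"]]

# A perfect prediction, but stored as (seq_len=3, batch_size=2).
pred_time_major = transpose(gold)

# Flattening both as-is compares tokens from different positions,
# so even a perfect prediction scores below 1.0.
naive_acc = accuracy(flatten(gold), flatten(pred_time_major))

# Bringing the prediction back to (batch_size, seq_len) first fixes it.
fixed_acc = accuracy(flatten(gold), flatten(transpose(pred_time_major)))

# With batch size 1, shapes (1, seq_len) and (seq_len, 1) flatten to
# the same order, so the bug is hidden.
gold_single = [["B", "I", "O"]]
pred_single = transpose(gold_single)
single_acc = accuracy(flatten(gold_single), flatten(pred_single))

print(naive_acc, fixed_acc, single_acc)
```

A cheap guard against this kind of bug is to assert that labels and predictions have the same shape before flattening them for the metric.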