Closed: Youarerare closed this issue 5 years ago
I found the cause of the problem: I had not handled the transpose of the matrix. Your code is correct.
Hello, I have run into the same problem. Could you explain the 'transpose' issue in more detail so I can check my own code? Thank you!
Hello. At that time, I stored the gold labels as a (batch_size, seq_len) matrix, while my predictions came out as a (seq_len, batch_size) matrix. The network was actually training fine, but the labels did not line up with the predictions when computing accuracy or F1. (I flattened both matrices into one-dimensional vectors to compute the accuracy.) That is also why the accuracy could still increase with batch size = 1: a 1×N matrix and an N×1 matrix flatten to the same vector. You may want to check the shapes of your labels and predictions.
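A minimal sketch (not the poster's actual code) of the bug described above, using NumPy to show how flattening a transposed prediction matrix misaligns tokens, and why batch size 1 hides the problem:

```python
import numpy as np

batch_size, seq_len = 4, 6

# Gold labels shaped (batch_size, seq_len); the values stand in for tag ids.
labels = np.arange(batch_size * seq_len).reshape(batch_size, seq_len) % 5

# Suppose the model predicts perfectly, but emits (seq_len, batch_size).
preds = labels.T.copy()

# Flattening both matrices and comparing, as described above, misaligns tokens:
wrong_acc = (preds.ravel() == labels.ravel()).mean()

# Transposing the predictions first restores the alignment:
right_acc = (preds.T.ravel() == labels.ravel()).mean()
print(wrong_acc, right_acc)  # right_acc is 1.0; wrong_acc is much lower

# With batch_size = 1, a (1, N) and an (N, 1) matrix flatten to the same
# vector, so the bug disappears -- matching the batch = 1 symptom.
labels1 = labels[:1]        # shape (1, seq_len)
preds1 = labels1.T.copy()   # shape (seq_len, 1)
assert (preds1.ravel() == labels1.ravel()).mean() == 1.0
```

The same check applies to framework tensors: compare `labels.shape` and `preds.shape` before flattening, and transpose one of them if they disagree.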
Hi, I have met the same problem as issue #40. I use a BiLSTM+CRF for NER tasks; the loss decreases but the F1 stays at 0.12, and I find that the outputs of the CRF layer are almost all O (the "outside" NER label). When I change my batch size from 8 to 1, the F1 score increases to 0.91.
Now my model works well with batch size 1, but I am not sure whether there is a problem with the CRF loss function. Can you give me some help?