Closed MartinGOGO closed 4 years ago
Hi, the code looks OK to me. By default, the loss is summed over tokens, so longer sentences have larger loss. You can pass reduction='token_mean'
to have the loss averaged over tokens instead. It should be more stable.
Thanks a lot. It looks better when I pass reduction='token_mean'.
I'm interested in using this library for Named Entity Recognition, but something bad happened. I'm using PyTorch to build a model with one embedding layer, one LSTM layer, and a CRF layer. The model structure is shown below.
```python
class my_model(nn.Module):
```
The problem is that the loss is very high and very unstable during training. It often jumps from 200+ down to 20+, and then back up to 500+. I wonder if I'm using this library incorrectly?