Closed: jamesdbaker closed this issue 4 years ago.
This is most likely due to the way Transformer models work. There's a tendency for the model to "break" when it is overtrained. I'm not sure why this happens, but it seems more common with larger models (and higher numbers of training epochs), so it's probably some form of overfitting. If you plot your losses, you should be able to see whether this is happening.
_You can easily plot the training progress by specifying `wandb_project` in your `model_args`._
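For reference, a minimal sketch of what that might look like (this is not the reporter's code; it assumes the simpletransformers `ClassificationModel` API, and the project name, model choice, and epoch count are placeholders):

```python
# Sketch: enable Weights & Biases logging via model_args so that training and
# evaluation losses can be plotted and overfitting spotted.
from simpletransformers.classification import ClassificationModel

model_args = {
    "wandb_project": "my-project",        # hypothetical W&B project name
    "num_train_epochs": 3,                # placeholder value
    "evaluate_during_training": True,     # also logs eval loss over time
    "overwrite_output_dir": True,
}

model = ClassificationModel("roberta", "roberta-base", args=model_args)
# model.train_model(train_df, eval_df=eval_df)  # supply your own DataFrames
```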
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Describe the bug
On some models, I get the following warning and then an evaluation of 0 true positives and 0 false positives. If I use a smaller version of the same model, I do get true and false positives, so I assume this is related to the larger models. It has happened on a couple of different models.
The warning I'm getting is:
The full output is:
To Reproduce
I'm using the following code (sorry, the training data isn't shareable):
Expected behavior
I would expect there to be non-zero true/false positives, given that the smaller models do return these. Therefore I'm assuming this is a bug.
Desktop (please complete the following information):