michaelpginn closed this pull request 2 months ago.
Hmm, the `black` format and flake8 rules seem to be in conflict as to whether long string lines should be broken. What's your preference?
Quick question here: Lightning lets you evaluate/log either every n epochs or every n steps, IIRC. It might be logical to expect `--save_all` to save a checkpoint each time we run evaluation. Might there be a way of configuring the callback to do this instead?
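If it helps, here is a minimal sketch of one way to line the callback up with the evaluation schedule, assuming we tie `ModelCheckpoint`'s `every_n_epochs` to the trainer's `check_val_every_n_epoch` (the variable name below is hypothetical):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

eval_every_n_epochs = 1  # hypothetical user-facing setting

# save_top_k=-1 keeps every checkpoint instead of only the best one;
# matching every_n_epochs to the validation frequency means a checkpoint
# is written each time evaluation runs.
checkpoint_callback = ModelCheckpoint(
    save_top_k=-1,
    every_n_epochs=eval_every_n_epochs,
)
trainer = Trainer(
    callbacks=[checkpoint_callback],
    check_val_every_n_epoch=eval_every_n_epochs,
)
```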
> Hmm, the `black` format and flake8 rules seem to be in conflict as to whether long string lines should be broken. What's your preference?
Black just doesn't know how to break string literals. You have to do it yourself (then call `black` one more time to make sure it's happy with how you do it).
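For example (the message below is made up, just to show the pattern), Python's implicit string concatenation lets you do the break yourself in a form both tools accept:

```python
# Adjacent string literals are joined at compile time, so a long message
# can be split across lines without tripping flake8's line-length limit.
message = (
    "this string would be too long for flake8 on a single line, "
    "so it is broken into adjacent literals that Python concatenates "
    "back into one string"
)
```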
Also, another thing we could consider is to make early stopping configurable so that you can stop based on maximizing validation accuracy or minimizing validation loss. That might address some of the same concerns, and it's a research question we might want to consider someday @Adamits.
Yeah, we should definitely do this. It's especially useful when pretraining (where I expect we care about loss, not accuracy). I wonder if we get this already somehow in the Lightning CLI interface? If not now, maybe once we upgrade to 2.0 (somehow I think I was supposed to do this > a year ago :D).
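In case it's useful, a minimal sketch of what the configurable version could look like with Lightning's stock `EarlyStopping` callback; the `monitor`/`mode` values and metric names are stand-ins for whatever the user would pass:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping

# Hypothetical user-facing settings: which metric to monitor and whether
# "better" means lower (loss) or higher (accuracy).
monitor = "val_loss"  # or "val_accuracy"
mode = "min"          # "max" when monitoring accuracy

# Stop when the monitored metric fails to improve for `patience` checks.
early_stopping = EarlyStopping(monitor=monitor, mode=mode, patience=10)
trainer = Trainer(callbacks=[early_stopping])
```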
> Also, another thing we could consider is to make early stopping configurable so that you can stop based on maximizing validation accuracy or minimizing validation loss. That might address some of the same concerns, and it's a research question we might want to consider someday @Adamits.
>
> Yeah, we should definitely do this. It's especially useful when pretraining (where I expect we care about loss, not accuracy). I wonder if we get this already somehow in the Lightning CLI interface? If not now, maybe once we upgrade to 2.0 (somehow I think I was supposed to do this > a year ago :D).
Our project was also interested in using an alternative metric (chrF in our case). I would be happy to explore whether this is something that can be generalized robustly with Lightning, if you like!
> Also, another thing we could consider is to make early stopping configurable so that you can stop based on maximizing validation accuracy or minimizing validation loss. That might address some of the same concerns, and it's a research question we might want to consider someday @Adamits.
>
> Yeah, we should definitely do this. It's especially useful when pretraining (where I expect we care about loss, not accuracy). I wonder if we get this already somehow in the Lightning CLI interface? If not now, maybe once we upgrade to 2.0 (somehow I think I was supposed to do this > a year ago :D).
See #170 for this.
> Our project was also interested in using an alternative metric (chrF in our case). I would be happy to explore whether this is something that can be generalized robustly with Lightning, if you like!
What's chrF?
> What's chrF?
Essentially a character-level BLEU score (https://aclanthology.org/W15-3049/). It can potentially help when the data is very limited and accuracy is near 0, but the predictions may have correct substrings.
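To make that concrete, a hedged sketch using the `CHRFScore` metric from `torchmetrics` (assuming that dependency is acceptable); if a LightningModule logged this as, say, `val_chrf`, the checkpointing and early-stopping callbacks could monitor it like any other metric:

```python
from torchmetrics.text import CHRFScore

# chrF compares character (and word) n-gram overlap between hypotheses
# and references, so partially correct predictions still earn credit.
chrf = CHRFScore()
preds = ["the cat saat on the mat"]    # hypothesis with a small typo
target = [["the cat sat on the mat"]]  # one list of references per hypothesis
print(chrf(preds, target))             # high, but below the perfect score
```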
> What's chrF?
>
> Essentially a character-level BLEU score (https://aclanthology.org/W15-3049/). It can potentially help when the data is very limited and accuracy is near 0, but the predictions may have correct substrings.
I thought that's what it might mean. We'd welcome a PR to add that.
LGTM. @Adamits shall I merge?
LGTM.
There are cases where the user may not want to save checkpoints based on the dev accuracy. For example, if the training process has very low or unstable accuracy, using this metric to select checkpoints can result in selecting a suboptimal final model.
This PR simply adds a flag (`--save_best` and `--no_save_best`) to enable naive saving, where a new checkpoint is saved every epoch. The default behavior remains the same, and the naming convention follows other flags.
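As a rough illustration of the idea (the wiring and the monitored metric name below are a sketch, not the actual patch; note that argparse's `BooleanOptionalAction` would spell the negative flag `--no-save_best` rather than `--no_save_best`):

```python
import argparse

from pytorch_lightning.callbacks import ModelCheckpoint

parser = argparse.ArgumentParser()
# --save_best (the default) keeps the current behavior; the negative flag
# switches to naive per-epoch saving.
parser.add_argument(
    "--save_best", action=argparse.BooleanOptionalAction, default=True
)
args = parser.parse_args()

if args.save_best:
    # Default: keep only the checkpoint with the best dev accuracy.
    checkpoint = ModelCheckpoint(monitor="val_accuracy", mode="max", save_top_k=1)
else:
    # Naive saving: write a checkpoint every epoch and keep all of them.
    checkpoint = ModelCheckpoint(every_n_epochs=1, save_top_k=-1)
```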