Closed kylebgorman closed 2 months ago
Looks good, thanks for thinking this through. I left two comments that may or may not require updates.
Could we combine this flag and the ones from #169? I.e., have a single --val_mode
flag that can be accuracy
, loss
, or none
, with the latter replacing the --no_save_best
option.
Also, for reference, HuggingFace calls this flag metric_for_early_stopping
and metric_for_best_model
. Since this covers both, maybe something like --primary_metric
would be a more obvious name?
be a more obvious name?
Let's take this up in a subsequent PR where we tweak names. I'm okay with "primary metric" I suppose.
Could we combine this flag and the ones from #169? I.e., have a single
--val_mode
flag that can beaccuracy
,loss
, ornone
, with the latter replacing the--no_save_best
option.
I actually tried that and it isn't as clear cut as you might imagine, because then you can't control which metric patience uses if you enable that. So I think the solution I've taken here is the best one: namely, documentation. I submit that there's no real reason why you'd want to use different model selection and patience (and reduce-on-plateau), so there's no reason we can't combine them.
I have made an executive decision and it is now: --save_top_k
as before, and --val_metric
as a compromise. You're all welcome to keep proposing changes to flag names though!
This allows one to either early stop/model select on loss or accuracy using the
--val_metric
flag. The default is as before: accuracy.This affects two things:
--save_top_k
model selection (and how the files are named), and the behavior of--patience
. I could also integrate it with the mode for the reduce-on-plateau scheduler, but I think that'll require a lot more code.It might seem at first glance that I stripped out the behavior associated with #169. However, as I document, you can simulate it easily just by doing
--save_top_k -1
, so I think that's actually fine.Closes #170.