CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning
Apache License 2.0
25 stars 15 forks source link

Adds --val_mode. #171

Closed kylebgorman closed 2 months ago

kylebgorman commented 2 months ago

This allows one to either early stop/model select on loss or accuracy using the --val_metric flag. The default is as before: accuracy.

This affects two things: --save_top_k model selection (and how the files are named), and the behavior of --patience. I could also integrate it with the mode for the reduce-on-plateau scheduler, but I think that'll require a lot more code.

It might seem at first glance that I stripped out the behavior associated with #169. However, as I document, you can simulate it easily just by doing --save_top_k -1, so I think that's actually fine.

Closes #170.

Adamits commented 2 months ago

Looks good, thanks for thinking this through. I left two comments that may or may not require updates.

michaelpginn commented 2 months ago

Could we combine this flag and the ones from #169? I.e., have a single --val_mode flag that can be accuracy, loss, or none, with the latter replacing the --no_save_best option.

michaelpginn commented 2 months ago

Also, for reference, HuggingFace calls this flag metric_for_early_stopping and metric_for_best_model. Since this covers both, maybe something like --primary_metric would be a more obvious name?

kylebgorman commented 2 months ago

be a more obvious name?

Let's take this up in a subsequent PR where we tweak names. I'm okay with "primary metric" I suppose.

kylebgorman commented 2 months ago

Could we combine this flag and the ones from #169? I.e., have a single --val_mode flag that can be accuracy, loss, or none, with the latter replacing the --no_save_best option.

I actually tried that and it isn't as clear cut as you might imagine, because then you can't control which metric patience uses if you enable that. So I think the solution I've taken here is the best one: namely, documentation. I submit that there's no real reason why you'd want to use different model selection and patience (and reduce-on-plateau), so there's no reason we can't combine them.

kylebgorman commented 2 months ago

I have made an executive decision and it is now: --save_top_k as before, and --val_metric as a compromise. You're all welcome to keep proposing changes to flag names though!