mir-group / nequip

NequIP is a code for building E(3)-equivariant interatomic potentials
https://www.nature.com/articles/s41467-022-29939-5
MIT License

Reduce LR on plateau but not increase #326

Closed peastman closed 1 year ago

peastman commented 1 year ago

The ReduceLROnPlateau option actually looks for the loss to increase, not plateau. As long as it doesn't increase, the learning rate doesn't change.

In training my model I never see the loss increase. It just keeps decreasing by tinier and tinier amounts. Could we add a margin to the test, so for example I could tell it to reduce the learning rate any time the loss decreases by less than 2%?

Linux-cpp-lisp commented 1 year ago

I think there is an option for this in PyTorch, though I haven't looked super closely. See is_better in https://pytorch.org/docs/1.11/_modules/torch/optim/lr_scheduler.html#ReduceLROnPlateau.

Something to do with setting threshold?

You can specify these options in the YAML with lr_scheduler_* keys.
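Something like the following should work, assuming the prefix pass-through does its job (untested sketch; the numbers are only illustrative, and the threshold keys map onto the corresponding arguments of torch.optim.lr_scheduler.ReduceLROnPlateau):

```yaml
# Sketch of the relevant keys; any ReduceLROnPlateau argument can be passed
# with the lr_scheduler_ prefix.
lr_scheduler_name: ReduceLROnPlateau
lr_scheduler_patience: 25
lr_scheduler_factor: 0.5
# With threshold_mode: rel (PyTorch's default), an epoch only counts as an
# improvement if the metric beats the best value by more than the threshold,
# i.e. here by more than 2%; otherwise the patience counter ticks down.
lr_scheduler_threshold: 0.02
lr_scheduler_threshold_mode: rel
```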

peastman commented 1 year ago

That's exactly what I need. These options aren't listed in https://github.com/mir-group/nequip/blob/main/configs/full.yaml. Do we need code changes to support them? Or will it automatically translate any lr_scheduler_X option into the X argument of the scheduler?

Linux-cpp-lisp commented 1 year ago

It should do it automatically; the configuration system is built to propagate options from their hierarchical prefixes into the right objects. You can confirm this by running briefly in verbose: debug mode, which explicitly logs the mapping from input keys to the objects being built; check that the right values are set when ReduceLROnPlateau gets instantiated (just grep for ReduceLROnPlateau in the log).
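For example (a minimal sketch of the check described above):

```yaml
# Enable debug-level logging so the mapping from config keys to the objects
# being built is written to the training log; then grep that log for
# ReduceLROnPlateau to confirm lr_scheduler_threshold reached the scheduler.
verbose: debug
```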

simonbatzner commented 1 year ago

This should already be implemented, you can set a delta on the improvement, see here and here. Let us know if you have questions re this.

peastman commented 1 year ago

This should already be implemented, you can set a delta on the improvement, see here and here.

That refers to early stopping, not to reducing the learning rate.

simonbatzner commented 1 year ago

Oh my bad, of course, misread.

peastman commented 1 year ago

Setting lr_scheduler_threshold works perfectly. Thanks! Can we document in configs/full.yaml that you can specify any argument to the scheduler?

Linux-cpp-lisp commented 1 year ago

Done 👍