This PR addresses the TODO in train.py about exposing the optimizer parameters to TOML/command-line configuration.
The three settings added are weight decay, beta1, and beta2.
In addition, it adds a single log line showing which optimizer and optimizer settings are used for the run.
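
For readers who want a concrete picture, here is a minimal sketch of how such settings could be exposed and logged. It is not the code from this PR: the flag names, the `OptimizerConfig` dataclass, the default values, and the log wording are all assumptions made for illustration. TOML loading is omitted, and the sketch uses only the standard library.

```python
import argparse
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class OptimizerConfig:
    # Hypothetical field names and defaults; the PR's real config keys may differ.
    name: str = "AdamW"
    lr: float = 3e-4
    weight_decay: float = 0.1
    beta1: float = 0.9
    beta2: float = 0.95


def parse_optimizer_config(argv=None):
    """Parse optimizer settings from command-line style arguments."""
    defaults = OptimizerConfig()
    parser = argparse.ArgumentParser()
    parser.add_argument("--optimizer-name", default=defaults.name)
    parser.add_argument("--lr", type=float, default=defaults.lr)
    parser.add_argument("--weight-decay", type=float, default=defaults.weight_decay)
    parser.add_argument("--beta1", type=float, default=defaults.beta1)
    parser.add_argument("--beta2", type=float, default=defaults.beta2)
    args = parser.parse_args(argv)
    return OptimizerConfig(
        name=args.optimizer_name,
        lr=args.lr,
        weight_decay=args.weight_decay,
        beta1=args.beta1,
        beta2=args.beta2,
    )


def optimizer_kwargs(cfg):
    """Build keyword arguments in the shape Adam-style optimizers expect."""
    # For example, these could be passed as torch.optim.AdamW(params, **kwargs).
    return {
        "lr": cfg.lr,
        "betas": (cfg.beta1, cfg.beta2),
        "weight_decay": cfg.weight_decay,
    }


def describe(cfg):
    """Format a single log line naming the optimizer and its settings."""
    kw = optimizer_kwargs(cfg)
    return (
        f"Using optimizer {cfg.name} with lr={kw['lr']}, "
        f"betas={kw['betas']}, weight_decay={kw['weight_decay']}"
    )


# Example: override two of the settings and log the result once.
cfg = parse_optimizer_config(["--weight-decay", "0.01", "--beta2", "0.99"])
logger.info(describe(cfg))
```

Keeping beta1 and beta2 as separate scalar options, then combining them into a `betas` tuple only when the optimizer is built, means each value maps to one flat config key or flag.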