mila-iqia / cookiecutter-pyml

MIT License
Factory #100

Closed marondeau-mila closed 9 months ago

marondeau-mila commented 1 year ago

Start of factorization

The goal is to separate the creation of objects from their use. This would reduce the boilerplate that needs to change for each project and simplify configuration. Since the optimizer and scheduler are complex, and their creation must follow strict rules under Lightning, they were moved into factories first.

Boilerplate

With factories, we can trivially change the algorithm and its associated hyper-parameters without touching the modules: the factory knows the details, and the module only knows how to call the factory. Changing the algorithm requires a change in exactly one place: where the configuration is loaded. Even better, checkpoints will be trivial to load, since the factory is saved with the model.
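To make the idea concrete, here is a minimal sketch of the pattern, not the code in this PR. All names (`OptimizerFactory`, `ToyModule`, `ToySGD`, `ToyAdam`) are hypothetical, and the toy optimizer classes are stand-ins for real ones such as `torch.optim.SGD`, so the snippet runs without any dependency. The method name `configure_optimizers` only mirrors the Lightning hook of the same name.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable


class ToyOptimizer:
    """Stand-in for a real optimizer class (hypothetical, for illustration)."""

    def __init__(self, params: Iterable[Any], **hyper_parameters: Any) -> None:
        self.params = list(params)
        self.hyper_parameters = hyper_parameters


class ToySGD(ToyOptimizer):
    pass


class ToyAdam(ToyOptimizer):
    pass


@dataclass
class OptimizerFactory:
    """Holds everything needed to build an optimizer, except the parameters."""

    optimizer_cls: Callable[..., Any]
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def create(self, params: Iterable[Any]) -> Any:
        # The factory knows the algorithm and its hyper-parameters.
        return self.optimizer_cls(params, **self.kwargs)


class ToyModule:
    """Stand-in for a Lightning module: it only knows how to call the factory."""

    def __init__(self, optimizer_factory: OptimizerFactory) -> None:
        self.optimizer_factory = optimizer_factory
        self.weights = [0.0, 0.0]

    def configure_optimizers(self) -> Any:
        return self.optimizer_factory.create(self.weights)


# Swapping the algorithm touches only the place where the factory is built.
sgd_module = ToyModule(OptimizerFactory(ToySGD, {"lr": 0.1}))
adam_module = ToyModule(OptimizerFactory(ToyAdam, {"lr": 0.001}))
```

In this sketch `ToyModule` never mentions a learning rate or an optimizer class, which is the property the description above is after.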

Configuration

This change removes the need for configuration deep in the modules. Using OmniConf, Hydra, a complex set of command-line arguments, or some HTTP GET based abomination can therefore be done without changing the Module. The Module also no longer needs to know about the hyper-parameters: they effectively travel inside the factory.
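As an illustration of that decoupling, the sketch below builds a factory from a plain dictionary, so the source of the dictionary (a JSON file, parsed command-line arguments, or anything else) is irrelevant to the code that later uses the factory. Every name here (`OPTIMIZERS`, `factory_from_config`, the toy classes, the `"name"` key) is an assumption made for the example and is not part of the template.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable


class ToyOptimizer:
    """Stand-in for a real optimizer class (hypothetical, for illustration)."""

    def __init__(self, params: Iterable[Any], **hyper_parameters: Any) -> None:
        self.params = list(params)
        self.hyper_parameters = hyper_parameters


class ToySGD(ToyOptimizer):
    pass


class ToyAdam(ToyOptimizer):
    pass


# Hypothetical registry mapping configuration names to optimizer classes.
OPTIMIZERS: Dict[str, Callable[..., Any]] = {"sgd": ToySGD, "adam": ToyAdam}


@dataclass
class OptimizerFactory:
    optimizer_cls: Callable[..., Any]
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def create(self, params: Iterable[Any]) -> Any:
        return self.optimizer_cls(params, **self.kwargs)


def factory_from_config(config: Dict[str, Any]) -> OptimizerFactory:
    """Build a factory from a plain dict, whatever produced that dict."""
    options = dict(config)  # copy so the caller's dict is left untouched
    name = options.pop("name")
    if name not in OPTIMIZERS:
        raise ValueError(f"unknown optimizer {name!r}")
    return OptimizerFactory(OPTIMIZERS[name], options)


# Two different configuration sources, one construction path.
from_json = factory_from_config(json.loads('{"name": "sgd", "lr": 0.1}'))
from_cli = factory_from_config({"name": "adam", "lr": 0.001})
```

Only `factory_from_config` would need to change if the configuration system changed; code that receives the resulting factory would stay the same.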

Limitations and Future Work

In general, factories will simplify the inner part of the project: the Module and the training loop. We still need to manage configuration in order to create the factory. This is a current challenge with the template, and it won't suddenly disappear with factories. Factories would, however, help address that challenge by decoupling the configuration from the Module.

The factory approach would also be useful for the loss and the model. This PR includes some preliminary refactoring in that direction, but the main goal is to get feedback on the structure and use cases.