Open · antoinebrl opened 1 year ago

Hello,

The `.fit()` method of the `Trainer` is missing the `optimizers` parameter, although it is part of its documentation. It would be nice to keep the optimizers and the schedulers together. As the schedulers depend on the dataloaders, I would argue the `.fit()` method should take the optimizers as input too.
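For concreteness, here is a sketch of the call pattern this would enable (hypothetical usage; the model and dataloader are stand-ins, and passing `optimizers=` to `.fit()` is exactly what does not work today):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from composer import Trainer
from composer.models import ComposerClassifier

# Minimal stand-in model and data.
model = ComposerClassifier(torch.nn.Linear(8, 2), num_classes=2)
train_dataloader = DataLoader(
    TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,))),
    batch_size=16,
)

trainer = Trainer(model=model)

# Proposed: pass the optimizer at fit time, alongside the schedulers that
# depend on the dataloader. Currently .fit() rejects `optimizers`.
trainer.fit(
    train_dataloader=train_dataloader,
    optimizers=torch.optim.SGD(model.parameters(), lr=0.1),
    duration='1ep',
)
```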
This is a good suggestion! We'd love to get a community PR on this issue :) Otherwise, we will add it to our roadmap, though unfortunately it might not be done in the immediate future.
@antoinebrl @mvpatel2000 once we pass the optimizers into the `.fit()` method, what do we want to do with them? I am happy to take a look and help out on this if I can get some guidance on how to use the optimizers here.
@wlrd Sure! In short, you'll need to duplicate the optimizer setup we do in `__init__`, starting here: https://github.com/mosaicml/composer/blob/dev/composer/trainer/trainer.py#L988. I believe this just involves updating `State` to use the new optimizer.
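A minimal sketch of what that first step could look like, assuming `State.optimizers` can simply be reassigned and reusing `ensure_tuple` the way `__init__` does (the signature below is hypothetical and elides all of the existing `fit()` parameters):

```python
from typing import Optional, Sequence, Union

import torch

from composer.utils import ensure_tuple


def fit(
    self,
    *,
    optimizers: Optional[Union[torch.optim.Optimizer, Sequence[torch.optim.Optimizer]]] = None,
    # ... the existing fit() parameters ...
):
    if optimizers is not None:
        # Mirror the __init__ setup: normalize to a tuple and repoint State,
        # so algorithms and the schedulers compiled later in fit() bind to
        # the new optimizers rather than the ones from construction time.
        self.state.optimizers = ensure_tuple(optimizers)
```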
The next hard step is adding support for DeepSpeed and FSDP. You can see the DeepSpeed step here: https://github.com/mosaicml/composer/blob/dev/composer/trainer/trainer.py#L1004
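If I understand the DeepSpeed path correctly, the engine would need to be rebuilt around the new optimizer, along the lines of the sketch below (hypothetical plumbing; `original_model` and `deepspeed_config` stand in for whatever the trainer stored at construction time, and older DeepSpeed versions take `config_params=` instead of `config=`):

```python
import deepspeed

# Re-run the same initialization __init__ does, but with the optimizer that
# was passed to .fit(). deepspeed.initialize returns
# (engine, optimizer, dataloader, lr_scheduler).
engine, ds_optimizer, _, _ = deepspeed.initialize(
    model=original_model,
    optimizer=new_optimizer,
    config=deepspeed_config,
)
state.model = engine
state.optimizers = (ds_optimizer,)
```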
For FSDP, it becomes much more complicated. I think you can maybe just recreate the optimizer and repoint it at the model; I'm happy to scope this part out in more detail later.
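To make the recreate-and-repoint idea concrete, a rough sketch (hypothetical helper; note that `defaults` only carries constructor defaults, so per-param-group hyperparameters and any accumulated optimizer state would need extra handling):

```python
import torch


def rebuild_optimizer_for_fsdp(
    old_optimizer: torch.optim.Optimizer,
    fsdp_model: torch.nn.Module,
) -> torch.optim.Optimizer:
    """Recreate an optimizer of the same type over the FSDP-wrapped parameters.

    FSDP replaces the module's parameters with flattened shards, so an optimizer
    built against the original parameters no longer points at the tensors that
    are actually trained; rebuilding against fsdp_model.parameters() fixes that.
    """
    optimizer_cls = type(old_optimizer)
    return optimizer_cls(fsdp_model.parameters(), **old_optimizer.defaults)
```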
I would recommend a series of 3 PRs:

1. Base support: accept `optimizers` in `.fit()` and update `State` to use them.
2. DeepSpeed support, rebuilding the engine around the new optimizer.
3. FSDP support, recreating the optimizer against the wrapped model.