Open williamFalcon opened 2 years ago
Why do these need to be plugins? Users can leverage these directly within their LightningModules.
I believe the proposed API just mimics what composer offers:
    trainer = composer.Trainer(
        ...
        algorithms=[
            BlurPool(replace_convs=True, replace_maxpools=True, blur_first=True),
            ChannelsLast(),
            CutMix(num_classes=10),
            LabelSmoothing(smoothing=0.1),
        ]
    )
(from their README)
The key difference here is that (AFAIK) composer does not provide a "Module" abstraction such as the LightningModule, so it is natural for their library to put these directly in their trainer.
Also, looking at the source code, it does not look like they do any special management of these and just run all algorithms during every trigger event:
https://github.com/mosaicml/composer/blob/42271f8f6b10810d660318d17d037822beb05ee7/composer/core/engine.py#L177-L185
https://github.com/mosaicml/composer/blob/42271f8f6b10810d660318d17d037822beb05ee7/composer/core/engine.py#L192-L196
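The dispatch behavior described above (every algorithm is offered every trigger event, with no special management) can be sketched roughly as follows. This is a simplified stand-in, not composer's actual engine: `State`, `Algorithm`, `run_event`, the `*Sketch` classes, and the event names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical container for what the training loop shares with algorithms."""
    log: list = field(default_factory=list)


class Algorithm:
    """Illustrative base class: each algorithm decides per event whether to act."""
    events = ()

    def match(self, event, state):
        return event in self.events

    def apply(self, event, state):
        # A real algorithm would modify the model, batch, or loss here.
        state.log.append((type(self).__name__, event))


class ChannelsLastSketch(Algorithm):
    events = ("init",)


class LabelSmoothingSketch(Algorithm):
    events = ("before_loss",)


def run_event(event, state, algorithms):
    """Offer the event to every algorithm, in list order, and apply those that match."""
    for algorithm in algorithms:
        if algorithm.match(event, state):
            algorithm.apply(event, state)


state = State()
algorithms = [ChannelsLastSketch(), LabelSmoothingSketch()]
for event in ("init", "before_loss", "after_loss"):
    run_event(event, state, algorithms)
```

Nothing in this loop depends on trainer internals, which is the point being made: the same calls could be issued from hooks the user already controls.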
These could be directly part of the LightningModule, especially because they are part of the research and that's where you usually put your research code. So unless we need to integrate with strategies or something specific to our internals, I agree with @ananthsub.
@hanlint do you foresee any limitations that would be better addressed by making the algorithms part of the Trainer?
Some "algorithms" may require overriding several hooks and managing state, but those could just be Callbacks that the user passes to the Trainer.
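As a rough illustration of an "algorithm" that spans several hooks and keeps its own state, here is a minimal sketch. The `Callback`, `ToyTrainer`, and `ToyModule` classes are self-contained stand-ins so the example runs without any dependency; they only loosely mirror Lightning's hook names, and `DelayedSmoothing` is a hypothetical algorithm invented for this example.

```python
class Callback:
    """Stand-in callback base class (an assumption, not Lightning's own class)."""

    def on_train_start(self, trainer, module):
        pass

    def on_train_batch_start(self, trainer, module, batch, batch_idx):
        pass

    def on_train_batch_end(self, trainer, module, batch, batch_idx):
        pass


class DelayedSmoothing(Callback):
    """Hypothetical algorithm: enable label smoothing only after a warmup period."""

    def __init__(self, warmup_batches, smoothing):
        self.warmup_batches = warmup_batches
        self.smoothing = smoothing
        self.batches_done = 0  # state owned by the callback, not the module

    def on_train_start(self, trainer, module):
        self.batches_done = 0
        module.smoothing = 0.0

    def on_train_batch_start(self, trainer, module, batch, batch_idx):
        if self.batches_done >= self.warmup_batches:
            module.smoothing = self.smoothing

    def on_train_batch_end(self, trainer, module, batch, batch_idx):
        self.batches_done += 1


class ToyModule:
    def __init__(self):
        self.smoothing = 0.0
        self.seen = []

    def training_step(self, batch, batch_idx):
        # Record the smoothing value in effect for this batch.
        self.seen.append(self.smoothing)


class ToyTrainer:
    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def fit(self, module, batches):
        for cb in self.callbacks:
            cb.on_train_start(self, module)
        for idx, batch in enumerate(batches):
            for cb in self.callbacks:
                cb.on_train_batch_start(self, module, batch, idx)
            module.training_step(batch, idx)
            for cb in self.callbacks:
                cb.on_train_batch_end(self, module, batch, idx)


module = ToyModule()
ToyTrainer(callbacks=[DelayedSmoothing(warmup_batches=2, smoothing=0.1)]).fit(module, range(4))
```

The user still opts in explicitly by passing the callback, and the trainer needs no dedicated `algorithms` argument.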
@carmocca agreed with the approach. We designed the functional API so that users can utilize our methods inside their own training loops, so it would be natural for users wishing to employ our efficiency methods to put them in their LightningModule themselves, where the research code lives.
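To make the "functional method inside the module" idea concrete, here is a minimal dependency-free sketch. `smooth_labels` below is a plain-Python version of standard label smoothing written for this example, not composer's implementation, and `ToyModule` is a stand-in for a LightningModule with a fixed prediction.

```python
import math


def smooth_labels(one_hot, smoothing=0.1):
    """Standard label smoothing: mix the one-hot target with a uniform distribution."""
    n = len(one_hot)
    return [(1.0 - smoothing) * p + smoothing / n for p in one_hot]


def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a (possibly soft) target."""
    return -sum(t * math.log(p) for p, t in zip(probs, target))


class ToyModule:
    """Stand-in for a LightningModule; the research code lives in training_step."""

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing

    def forward(self, x):
        # Fixed prediction so the example is deterministic.
        return [0.7, 0.2, 0.1]

    def training_step(self, batch, batch_idx):
        x, target = batch
        # The efficiency method is a one-line functional call inside the step.
        target = smooth_labels(target, self.smoothing)
        return cross_entropy(self.forward(x), target)


batch = (None, [1.0, 0.0, 0.0])
loss_smoothed = ToyModule(smoothing=0.1).training_step(batch, 0)
loss_plain = ToyModule(smoothing=0.0).training_step(batch, 0)
```

With this pattern the user sees exactly where the method is applied, at the cost of having to call it in the right place themselves.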
A few limitations I can think of:
before_ and after_ events so that they clean up their own effects. Or ensuring that SelectiveBackprop runs first (see: https://github.com/mosaicml/composer/blob/0d7175573225001549e36ac91b2e3def250fa19f/composer/core/engine.py#L231). However, most of these are ease-of-use items that could be handled with good docs and warnings (e.g. we recently added algorithm warnings in https://github.com/mosaicml/composer/pull/720), as we harden our functional API.
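One possible reading of the ordering concerns above, as a sketch only: an ordering pass that runs before dispatch, moving a designated algorithm to the front and reversing the order on `after_` events, so that the last algorithm to act on the way in is the first to undo its effect on the way out. The function `order_for_event`, the `run_first` parameter, and matching algorithms by name are hypothetical; this is an assumption about the intent, not composer's code.

```python
def order_for_event(event, algorithms, run_first=("SelectiveBackprop",)):
    """Return the algorithm names in the order they should run for this event.

    Assumed rules for this sketch:
    1. Anything listed in run_first is moved to the front.
    2. For events starting with "after_", the whole order is reversed (LIFO).
    """
    prioritized = [name for name in algorithms if name in run_first]
    rest = [name for name in algorithms if name not in run_first]
    ordered = prioritized + rest
    if event.startswith("after_"):
        ordered.reverse()
    return ordered


configured = ["CutMix", "SelectiveBackprop", "LabelSmoothing"]
before_order = order_for_event("before_forward", configured)
after_order = order_for_event("after_forward", configured)
```

If users call the functional API from their own LightningModule, they take on this ordering themselves, which is where docs and warnings would have to carry the load.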
Hey @hanlint, I wanted to introduce you to Lightning Flash.
In Flash, we have 2 custom objects, Input and InputTransform, used to organize data loading and data transforms, plus the concept of an Adapter (learn2learn example) to easily integrate third-party libraries.
Here is the API for learn2learn integration for example
I believe we could explore a Flash integration first, understand how we can address some of the limitations raised above, and upstream any core components which work for all users.
Thanks @tchaton for the pointer, I will take a look.
This library, mosaic, has neat tricks for optimizing models for faster training. Each one is applied to the model in a single line, which is something we can do automatically for users under the hood if they want to enable the mosaic optimizations.
I propose an API like this
cc @borda @akihironitta @Borda @carmocca @tchaton