Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

Redesign dataloader re-instantiation context managers for Lite #15179

Open · awaelchli opened this issue 2 years ago

awaelchli commented 2 years ago

Proposed refactor

Redesign the context managers that enable re-instantiation of DataLoader, now that #14992 deprecates the run-method design.

Motivation

We currently have two context managers over the run method:

https://github.com/Lightning-AI/lightning/blob/776432fd7e34cc02e9dbfb16ad8ec8491c176dbf/src/lightning_lite/lite.py#L394-L397

These are applied for the user automatically when they implement the run method. After #14992 this will no longer be the case, so we need an alternative approach.
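
For context, a minimal, simplified sketch of what such a patching context manager does (not the actual Lite implementation, just an illustration of the technique): wrap DataLoader.__init__ on __enter__ so the constructor arguments are recorded on the instance, and restore the original on __exit__.

```python
import functools
from contextlib import contextmanager

from torch.utils.data import DataLoader


@contextmanager
def _capture_init_args(cls):
    original_init = cls.__init__

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        # remember the constructor arguments so the loader can later be
        # re-instantiated, e.g. with a DistributedSampler injected
        self.__captured_args__ = (args, kwargs)
        original_init(self, *args, **kwargs)

    cls.__init__ = wrapped_init
    try:
        yield
    finally:
        # this is the "unpatching" that __exit__ currently performs
        cls.__init__ = original_init


# usage: arguments passed inside the context are captured on the instance
with _capture_init_args(DataLoader):
    loader = DataLoader(list(range(8)), batch_size=4)

assert loader.__captured_args__ == ((list(range(8)),), {"batch_size": 4})
```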

Pitch

We probably need to apply these at the start of the program, for example in LightningLite.__init__. However, it is unclear how to undo the patching, which currently happens in __exit__. Happy to hear suggestions.
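
A rough sketch of what that could look like, reusing the _capture_init_args helper sketched above (names and structure are illustrative, not the real LightningLite):

```python
from contextlib import ExitStack

from torch.utils.data import DataLoader


class LightningLite:  # illustrative stand-in for the real class
    def __init__(self) -> None:
        self._patch_stack = ExitStack()
        # keep the patching context managers open for the lifetime of this object;
        # _capture_init_args is the simplified context manager sketched above
        self._patch_stack.enter_context(_capture_init_args(DataLoader))
        # open question from the pitch: there is no obvious hook from which to call
        # self._patch_stack.close(), i.e. to do the undo that __exit__ used to do
```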

cc @borda @justusschock @awaelchli @tchaton @carmocca @otaj

carmocca commented 2 years ago

Can't we simply move them to LightningLite.setup_dataloaders?

edit: nope

awaelchli commented 2 years ago

I thought we need to patch the class before the user instantiates a dataloader? Otherwise, why would we need such complicated re-instantiation logic? We wrap __init__ to capture the arguments, so applying the wrapping after the user has already instantiated their objects will probably not be useful.
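
A small illustration of this ordering problem, reusing the _capture_init_args sketch from above: a DataLoader created before the patch is applied has no recorded arguments, so it cannot be faithfully re-instantiated later.

```python
from torch.utils.data import DataLoader

loader = DataLoader(list(range(10)), batch_size=4)  # created *before* any patching

with _capture_init_args(DataLoader):
    pass  # patching now comes too late for `loader`

assert not hasattr(loader, "__captured_args__")  # its constructor arguments were never recorded
```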

carmocca commented 2 years ago

Right. Then we might not want to undo the patching at all. It's the same design issue as "when should we tear down DDP" and similar questions.

otaj commented 2 years ago

I think we need to adapt it completely. I don't think we necessarily care about unpatching; we just need to be sure that on re-instantiation we call the original __init__ and not the patched one (which is currently taken care of automatically, because we unpatch the method).
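
A sketch of one way to satisfy that without ever unpatching (illustrative, not the actual Lite code): keep a reference to the original __init__ on the class and call it explicitly when re-instantiating.

```python
import functools

from torch.utils.data import DataLoader


def _patch_init_permanently(cls):
    if getattr(cls, "__original_init__", None) is not None:
        return  # already patched; calling this twice must not wrap the wrapper
    cls.__original_init__ = cls.__init__

    @functools.wraps(cls.__original_init__)
    def wrapped_init(self, *args, **kwargs):
        self.__captured_args__ = (args, kwargs)
        cls.__original_init__(self, *args, **kwargs)

    cls.__init__ = wrapped_init


def _re_instantiate(loader: DataLoader, **overrides) -> DataLoader:
    args, kwargs = loader.__captured_args__
    new_loader = DataLoader.__new__(DataLoader)
    # call the *original* __init__, not the patched one, so the patch never needs
    # to be removed for re-instantiation to work
    DataLoader.__original_init__(new_loader, *args, **{**kwargs, **overrides})
    return new_loader


# usage:
# _patch_init_permanently(DataLoader)                  # applied once, never undone
# loader = DataLoader(list(range(10)), batch_size=4)
# copy = _re_instantiate(loader, batch_size=8)
```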

awaelchli commented 1 year ago

Alright, so for now, let's call _replace_dunder_methods when LightningLite is instantiated. We then expect the user to instantiate the dataloader AFTER that point.

Caveat: What if someone builds a Trainer library with LightningLite like so:

```python
model = ...
dataloader = DataLoader(...)

train(model, dataloader)  # the train function instantiates Lite and starts training - too late for patching
```

Alternative: Is patching at import time feasible/reasonable?

carmocca commented 1 year ago

The caveat is a feature that we definitely want to support. This means that we either do it at import time or ask the user to do it explicitly through Lite, e.g. dataloader = LightningLite.dataloader(...).

The former is the "magic" way, of course. But for that we should make sure it's fail-safe, idempotent, and that giving up the "un-patching" part is okay.
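
If patching happens at import time, it has to be idempotent (importing the package twice, or importing PL on top of Lite, must not double-wrap __init__) and fail-safe (a broken patch must not break the import itself). A sketch of what that could look like at module level, reusing the _patch_init_permanently helper sketched above:

```python
def _patch_at_import_time() -> None:
    try:
        from torch.utils.data import DataLoader

        # _patch_init_permanently (sketched above) returns early when DataLoader is
        # already patched, so repeated imports are a no-op (idempotent)
        _patch_init_permanently(DataLoader)
    except Exception:
        # fail-safe: a problem with the patch must not break `import lightning_lite`
        pass


# executed once when the package is imported, e.g. from the package __init__
_patch_at_import_time()
```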

awaelchli commented 1 year ago

What's annoying me right now is that, since PL depends on Lite, the patching gets applied globally on import for PL too, and I can't avoid that :(

carmocca commented 1 year ago

Isn't that a feature? PL is a library that fits the criteria of

> What if someone builds a Trainer library with LightningLite

PL should remove the context manager after this change.

awaelchli commented 1 year ago

I would have preferred to limit this behavior change to Lite first. I am worried that there are unintended consequences we haven't thought about.

carmocca commented 1 year ago

We will need to face those consequences either way, so I would suggest starting now to uncover issues as soon as possible.