Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
28.51k stars 3.39k forks source link

Bring Hydra improvements to Lite #14799

Closed awaelchli closed 2 years ago

awaelchli commented 2 years ago

Proposed refactor

Apply/share changes in #11617 with the Lite implementation.

Motivation

Since both are using the same launcher, the improvements should be integrated for both.

Pitch

Move the _hydra_subprocess_cmd functions to lightning_lite.strategies.lauchers.subprocess, and share the implementation in PL.

Additional context

11617 was developed in parallel to standalone Lite efforts, so we didn't have time to adjust.


If you enjoy Lightning, check out our other projects! ⚡

Atharva-Phatak commented 2 years ago

Hi @awaelchli, I am interested in this, is this up for the taking? Also could provide some additional in terms of whats expected in the PR ?

awaelchli commented 2 years ago

@Atharva-Phatak Yes, sure. Once #11617 is merged feel free to work on it. The PR should just integrate the same changes in this file here: https://github.com/Lightning-AI/lightning/blob/master/src/lightning_lite/strategies/launchers/subprocess_script.py

Atharva-Phatak commented 2 years ago

@awaelchli Thanks for the update. I will work on it once #11617 has merged.

carmocca commented 2 years ago

It is merged now :tada:

Atharva-Phatak commented 2 years ago

Thanks for letting me know. I will get started.

Atharva-Phatak commented 2 years ago

Hi, quick question, I was seeing the #11617 looks like it added Hydra support for multirun. I believe most changes were made to the launcher (https://github.com/Lightning-AI/lightning/blob/45ca78167efaa98f5e78ca73d79d4e71946db253/src/pytorch_lightning/strategies/launchers/subprocess_script.py) and for this issue, I have to apply the same changes to lightning Lite right ?

awaelchli commented 2 years ago

@Atharva-Phatak Yes, my idea is that the new functions (e.g. _hydra_subprocess_cmd) can just me moved to Lite, then imported in PL to share the code. Then all is left call the function in the Lite version of the launcher as well, as it was done in the PR.