Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

Add `ddp_find_unused_parameters_true` strategy alias in Fabric #20037

Closed: tesslerc closed this issue 2 months ago

tesslerc commented 3 months ago

Bug description

The documentation (https://lightning.ai/docs/fabric/stable/api/fabric_args.html#strategy) states that to run DDP with `find_unused_parameters=True`, we can pass the strategy string `"ddp_find_unused_parameters_true"`. With Fabric, this string is rejected with a `ValueError`.

What version are you seeing the problem on?

master

How to reproduce the bug

Fails: `Fabric(strategy="ddp_find_unused_parameters_true")`

Succeeds: `Fabric(strategy=DDPStrategy(find_unused_parameters=True))`
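Until the alias is available, one possible workaround is a small helper that swaps the documented string for the explicit strategy object used in the succeeding call. This is only a minimal sketch: `resolve_strategy` and `_UNSUPPORTED_ALIAS` are hypothetical names introduced here for illustration and are not part of Lightning. The import is done lazily so that other strategy values pass through without touching Lightning at all.

```python
from typing import Any

# Hypothetical constant: the alias string taken from the Fabric documentation.
_UNSUPPORTED_ALIAS = "ddp_find_unused_parameters_true"


def resolve_strategy(strategy: Any) -> Any:
    """Return a value that can be passed as ``Fabric(strategy=...)``.

    The documented alias is replaced by an explicit DDPStrategy instance;
    every other value (string or strategy object) is returned unchanged.
    """
    if strategy == _UNSUPPORTED_ALIAS:
        # Lazy import: only needed when the alias is actually requested.
        from lightning.fabric.strategies import DDPStrategy

        return DDPStrategy(find_unused_parameters=True)
    return strategy
```

With Lightning installed, `Fabric(strategy=resolve_strategy("ddp_find_unused_parameters_true"))` should then be equivalent to the call that succeeds above.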

Error messages and logs

ValueError: You selected an invalid strategy name: `strategy='ddp_find_unused_parameters_true'`. It must be either a string or an instance of `lightning.fabric.strategies.Strategy`. Example choices: auto, ddp, ddp_spawn, deepspeed, dp, ... Find a complete list of options in our documentation at https://lightning.ai

Environment

Current environment

```
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0): 2.3.1
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source): pip
#- Running environment of LightningApp (e.g. local, cloud): cloud
```

More info

No response

cc @justusschock @awaelchli

awaelchli commented 3 months ago

I think this was just forgotten. Let's add the alias for this strategy configuration. Contributions from anyone are welcome, of course!
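To illustrate what "adding the alias" amounts to conceptually, here is a toy model of a name-based strategy registry. It is a sketch under stated assumptions, not Fabric's actual implementation: `ToyStrategyRegistry` and `ToyDDPStrategy` are hypothetical stand-ins, and the real registry in Lightning may be structured differently. The point is only that an alias is the same strategy class registered under another name with one keyword argument preset.

```python
from typing import Any, Callable, Dict, Tuple


class ToyDDPStrategy:
    """Stand-in for a DDP strategy; NOT Lightning's DDPStrategy."""

    def __init__(self, find_unused_parameters: bool = False) -> None:
        self.find_unused_parameters = find_unused_parameters


class ToyStrategyRegistry:
    """Maps a strategy name to a factory plus preset keyword arguments."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Callable[..., Any], Dict[str, Any]]] = {}

    def register(self, name: str, factory: Callable[..., Any], **init_params: Any) -> None:
        # Store the factory together with the arguments the name implies.
        self._entries[name] = (factory, init_params)

    def build(self, name: str) -> Any:
        # Unknown names fail loudly, similar in spirit to the error in this report.
        if name not in self._entries:
            choices = ", ".join(sorted(self._entries))
            raise ValueError(f"Invalid strategy name {name!r}. Choices: {choices}")
        factory, init_params = self._entries[name]
        return factory(**init_params)


registry = ToyStrategyRegistry()
registry.register("ddp", ToyDDPStrategy)
# The alias: same class, with find_unused_parameters preset to True.
registry.register(
    "ddp_find_unused_parameters_true",
    ToyDDPStrategy,
    find_unused_parameters=True,
)

strategy = registry.build("ddp_find_unused_parameters_true")
print(strategy.find_unused_parameters)  # True
```

In this toy model, a missing `register` call for the alias reproduces the symptom: `build` raises `ValueError` for the documented name even though the explicit object works.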