Lightning-AI / lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
Apache License 2.0

Torch compile support for distributed operations #1146

Open AugustDev opened 2 months ago

AugustDev commented 2 months ago

🚀 Feature

The documentation says that torch.compile is not supported with distributed training right now. Since torch.compile can speed up training by as much as 2x, using the Lightning Trainer without compilation is no longer cost-efficient, so it would be great to support it.

It's a bit unclear to me what happens if I compile the model before passing it to the LightningModule: will it run as a compiled model under DDP or not?
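
For context, a minimal sketch of the two orderings in question (not Lightning code; the model is a placeholder and a process group is assumed to be initialized, e.g. via `torchrun` and `dist.init_process_group`):

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

# Placeholder model; assumes the distributed process group is already set up.
model = torch.nn.Linear(8, 8)

# Ordering A: compile first, then wrap. DDP's gradient-sync hooks
# sit outside the compiled region.
ddp_of_compiled = DDP(torch.compile(model))

# Ordering B: wrap first, then compile. The compiled region sees
# the DDP wrapper.
compiled_ddp = torch.compile(DDP(model))
```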

t-vi commented 2 months ago

@AugustDev Thank you, did you want to file this here or with https://github.com/Lightning-AI/pytorch-lightning/issues ?

awaelchli commented 2 months ago

The approach we took in Fabric should be transferable to the Trainer as well: https://github.com/Lightning-AI/pytorch-lightning/pull/19280 and https://github.com/Lightning-AI/pytorch-lightning/pull/19382. Essentially, it just ensures that torch.compile is applied over the FSDP/DDP-wrapped model.
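
A minimal sketch of that ordering, assuming a process group is already initialized (`setup_compiled_ddp` is a hypothetical helper for illustration, not the Fabric API):

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_compiled_ddp(module: torch.nn.Module, device: torch.device) -> torch.nn.Module:
    """Hypothetical helper: wrap with DDP first, then compile the wrapper,
    so the compiled graph includes DDP's gradient-synchronization hooks."""
    module = module.to(device)
    device_ids = [device.index] if device.type == "cuda" else None
    ddp_module = DDP(module, device_ids=device_ids)
    return torch.compile(ddp_module)
```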