Lightning-AI / lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
Apache License 2.0
1.07k stars, 60 forks

CUDAGraphs as executor/transform/fusion pass #656

Closed nikitaved closed 2 days ago

nikitaved commented 3 days ago

As per title. Fixes https://github.com/Lightning-AI/lightning-thunder/issues/635.

Also, it fixes the following subtle bugs:

IvanYashchuk commented 2 days ago

Nice! This change is needed to make my PR https://github.com/Lightning-AI/lightning-thunder/pull/214 work correctly with CUDA Graphs. In that PR I try to put torch.autograd.Function.apply into the forward trace, but it should be executed outside of the CUDA Graph-captured region.
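
To illustrate the constraint described above, here is a minimal sketch of how a fusion-style pass might split a trace so that some ops stay outside the captured regions. It is not Thunder's actual implementation: the trace is modeled as a plain list of op-name strings, and `NOT_CAPTURABLE` and `partition_trace` are hypothetical names made up for this example.

```python
from itertools import groupby

# Hypothetical set of ops that must run outside a CUDA Graph-captured region.
# This is an assumption for illustration, not Thunder's trace representation.
NOT_CAPTURABLE = {"torch.autograd.Function.apply"}


def partition_trace(ops):
    """Group consecutive capturable ops into regions; keep the rest standalone."""
    regions = []
    # groupby yields runs of adjacent ops that share the same capturable flag.
    for capturable, group in groupby(ops, key=lambda op: op not in NOT_CAPTURABLE):
        group = list(group)
        if capturable:
            # One region per run of capturable ops (a candidate for graph capture).
            regions.append(("cuda_graph", group))
        else:
            # Each non-capturable op is executed eagerly on its own.
            regions.extend(("eager", [op]) for op in group)
    return regions


trace = ["mul", "add", "torch.autograd.Function.apply", "relu", "sum"]
regions = partition_trace(trace)
for kind, ops in regions:
    print(kind, ops)
```

In this toy trace the autograd call splits the ops into two capturable regions with one eager op between them, which is the behavior the comment above asks for.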

t-vi commented 2 days ago

Let's merge and fix anything that needs fixing later.