Lightning-AI / lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors at once; across one or thousands of GPUs.
Apache License 2.0
1.18k stars 77 forks

Make trace `names` field directly linked to the names present in the trace #1274

Open riccardofelluga opened 2 weeks ago

riccardofelluga commented 2 weeks ago

🐛 Bug

As of now, if you run something like this:

import thunder

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 10),
            nn.ReLU(),
            nn.Linear(10, 10),
        )

    def forward(self, x):
        return self.layers(x)

a = torch.randn((10, 10), requires_grad=True, device="cuda")
m = MyModel().to("cuda")

jm = thunder.jit(m)
jm(a)

# Compare the names registered on the final trace with the symbols it actually contains
final_trace = thunder.last_traces(jm)[-1]
print(final_trace.names)
print(final_trace.bound_symbols)

The list of names does not correspond to the names actually present in the trace.

Expected behavior

The names field should contain only the names present in the trace and nothing more. This matters when registering new symbols in the trace: a name conflict can be reported even though no bound symbol in the trace uses that name, because of the check here:

https://github.com/Lightning-AI/lightning-thunder/blob/fceb64efc93a80a27d38b8e84f0e2b5f132f3d2f/thunder/core/trace.py#L170-L175
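To make the failure mode concrete, here is a minimal toy model, not Thunder's actual code. `ToyTrace`, `add_name`, `add_symbol` and `drop_symbol` are hypothetical names invented for illustration. The only assumption taken from the report is that registering a name is checked against a set that is not shrunk when operations are removed.

```python
class ToyTrace:
    """Hypothetical stand-in for a trace; this is not Thunder's trace class."""

    def __init__(self):
        self.names = set()        # every name ever registered
        self.bound_symbols = []   # (output_name, op) pairs currently in the trace

    def add_name(self, name):
        # Assumed to resemble the kind of uniqueness check the issue links to
        if name in self.names:
            raise ValueError(f"name {name!r} is already in use")
        self.names.add(name)

    def add_symbol(self, name, op):
        self.add_name(name)
        self.bound_symbols.append((name, op))

    def drop_symbol(self, name):
        # Removes the operation but leaves its name registered
        self.bound_symbols = [(n, op) for n, op in self.bound_symbols if n != name]


trace = ToyTrace()
trace.add_symbol("t0", "linear")
trace.add_symbol("t1", "relu")
trace.drop_symbol("t1")

live = {n for n, _ in trace.bound_symbols}
stale = trace.names - live  # registered, but no longer present in the trace

try:
    trace.add_symbol("t1", "gelu")
    conflict = False
except ValueError:
    conflict = True
```

In this sketch `stale` ends up holding `"t1"`, and re-registering `"t1"` is rejected even though no bound symbol carries that name any more.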

IvanYashchuk commented 1 week ago

The purpose of this attribute is to generate new unique names for Proxy objects. There are passes like CSE and DCE that remove operations from the trace. Such passes could also update the .names attribute when they remove operations.
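As a rough illustration of a removal pass that also updates the name set, here is a toy dead-code-elimination sketch over plain tuples. It does not use Thunder's pass infrastructure; `dce_with_name_pruning` and the `(output, op, inputs)` tuple layout are assumptions made only for this example.

```python
def dce_with_name_pruning(bound_symbols, names, live_outputs):
    """Toy DCE over (output, op, inputs) tuples.

    Returns the surviving symbols and the name set restricted to the
    names those symbols still reference.
    """
    needed = set(live_outputs)
    kept = []
    # Walk backwards so that inputs of kept symbols become needed in turn
    for out, op, inputs in reversed(bound_symbols):
        if out in needed:
            kept.append((out, op, inputs))
            needed.update(inputs)
    kept.reverse()

    used = {out for out, _, _ in kept} | {i for _, _, ins in kept for i in ins}
    return kept, names & used


symbols = [
    ("t0", "linear", ("x", "w")),
    ("t1", "relu", ("t0",)),
    ("t2", "neg", ("t0",)),  # never consumed, so it is dead
]
names = {"x", "w", "t0", "t1", "t2"}

kept, pruned = dce_with_name_pruning(symbols, names, live_outputs=["t1"])
```

Here `t2` is dropped from both the symbol list and the name set, so the name would be free to reuse afterwards. Whether reusing removed names is desirable in a real compiler is a separate design question that this sketch does not settle.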

What are you trying to achieve or what problems do you face with the set in .names not being the same as the set of actually used proxy names?