pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

torch_tensorrt.dynamo.compile saved exported programs cannot be loaded #3108

Open kacper-kleczewski opened 3 weeks ago

kacper-kleczewski commented 3 weeks ago

Bug Description

Models that are exported with torch.export.export, saved, reloaded, and then compiled with torch_tensorrt.dynamo.compile cannot be loaded with torch.export.load after being saved again. The load fails with this error:

W0820 14:11:53.628000 139673176707712 torch/fx/experimental/symbolic_shapes.py:4424] s0 is not in var_ranges, defaulting to unknown range.
E0820 14:11:53.628000 139673176707712 torch/fx/experimental/recording.py:280] failed while running evaluate_expr(*(s0 >= 0, True), **{'fx_node': None})

To Reproduce

The code below should reproduce the issue. It can also be observed with more complex models such as EfficientNet.

import torch
import torch_tensorrt

model = torch.nn.Linear(5, 7).eval()
sample = torch.randn(3, 5)

ep = torch.export.export(model, (sample,))  # example args must be passed as a tuple
torch.export.save(ep, "model.ep")

ep_loaded = torch.export.load("model.ep")
compiled = torch_tensorrt.dynamo.compile(ep_loaded, [sample])

torch_tensorrt.save(compiled, "model_compiled.ep", inputs=[sample])

loaded_torch_tensorrt = torch.export.load("model_compiled.ep")

Expected behavior

The compiled model loads successfully.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

NVIDIA PyTorch container 24.07
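The container tag alone does not state the exact package builds. A minimal sketch for collecting them with only the standard library follows. The distribution names `torch`, `torch_tensorrt`, and `tensorrt` are assumptions about how the packages are installed in the container, and `collect_versions` is a helper name made up for illustration.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match the environment.
    report = collect_versions(["torch", "torch_tensorrt", "tensorrt"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output alongside the container tag makes it easier to compare against the build a maintainer tested with.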

peri044 commented 3 weeks ago

Hello @kacper-kleczewski, I tried your script on the main branch (which is on 2.5.0.dev20240822+cu124) and it works fine. Here is the slightly modified script that I used:

import torch
import torch_tensorrt

model = torch.nn.Linear(5, 7).eval().cuda()
sample = torch.randn(3, 5).cuda()
pyt_out = model(sample)
ep = torch.export.export(model, (sample,))
torch.export.save(ep, "model.ep")

ep_loaded = torch.export.load("model.ep")
compiled = torch_tensorrt.dynamo.compile(ep_loaded, [sample], min_block_size=1)

torch_tensorrt.save(compiled, "model_compiled.ep", inputs=[sample])

loaded_torch_tensorrt = torch.export.load("model_compiled.ep")
trt_gm = loaded_torch_tensorrt.module()
trt_out = trt_gm(sample)

print("Diff: ", torch.mean(torch.abs(pyt_out-trt_out)))

I recall that we had some serialization issues with the 2.4 version of torch, which were resolved recently.
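To judge whether a given environment might predate that fix, one option is to compare the torch release against 2.5. The sketch below is only a heuristic. The 2.5 threshold is inferred from the build mentioned in this comment (2.5.0.dev20240822+cu124), not from a documented fix version, and `release_tuple` and `likely_has_fix` are hypothetical helper names.

```python
import re


def release_tuple(version):
    """Extract (major, minor) from a version string such as '2.5.0.dev20240822+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def likely_has_fix(torch_version, threshold=(2, 5)):
    """Assumed heuristic: builds at or above the threshold include the serialization fix."""
    return release_tuple(torch_version) >= threshold


print(likely_has_fix("2.5.0.dev20240822+cu124"))  # build from the comment above -> True
print(likely_has_fix("2.4.0"))                    # -> False
```

A 2.5 nightly built before the fix landed would still pass this check, so treat the result as a hint and confirm by rerunning the reproduction script.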