pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

🐛 [Bug] Exporting `OptimizedModule` compiled model #2742

Open · AvivSham opened this issue 6 months ago

AvivSham commented 6 months ago

Bug Description

Hi all, how are you? I'm trying to save a compiled module but I'm encountering errors. After compiling my model, I noticed that its type is `OptimizedModule`. Following this guide, I was not sure how to save the model, since there is no reference to the `OptimizedModule` type. We tried to export the model with `trt_exp_program = torch_tensorrt.dynamo.export(trt_mod, [dummy_x.half(), dummy_embed.half()], "ep")` but ended up with `AttributeError: 'Model' object has no attribute 'graph'`. What is the correct way to save and load an `OptimizedModule`?

Environment

We are using a g4dn AWS machine with the following env:

AvivSham commented 6 months ago

@gs-olive any chance you can help with this?

peri044 commented 6 months ago

`OptimizedModule` is an artifact of `torch.compile` compilation. We don't have a way to save those models. Instead, `ir="dynamo"` is the way to go if you want to serialize models.
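For illustration, a minimal sketch of the `ir="dynamo"` path (the toy model and input shape here are placeholders, not from this issue):

```python
import torch
import torch_tensorrt

class MyModel(torch.nn.Module):  # stand-in for the user's model
    def forward(self, x):
        return torch.relu(x)

model = MyModel().eval().cuda()
inputs = [torch.randn((1, 3, 224, 224)).cuda()]

# ir="dynamo" returns a torch.fx.GraphModule (serializable),
# not the OptimizedModule that torch.compile produces
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
```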

The following isn't the recommended workflow, since `torch_tensorrt.dynamo.export` can only take a `torch.fx.GraphModule` as its input:

`trt_exp_program = torch_tensorrt.dynamo.export(trt_mod, [dummy_x.half(), dummy_embed.half()], "ep")`

The docs you linked aren't up to date (we are working on fixing the docs). Please refer to this file instead: https://github.com/pytorch/TensorRT/blob/release/2.2/docsrc/user_guide/saving_models.rst
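Based on my reading of that release/2.2 guide, the save/load flow looks roughly like this (the toy model and file name are placeholders):

```python
import torch
import torch_tensorrt

class MyModel(torch.nn.Module):  # stand-in model
    def forward(self, x):
        return torch.relu(x)

model = MyModel().eval().cuda()
inputs = [torch.randn((1, 3, 224, 224)).cuda()]

# compile with the dynamo IR, wrap the resulting GraphModule in an
# ExportedProgram, and serialize it with torch.export
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
trt_exp_program = torch_tensorrt.dynamo.export(trt_gm, inputs)
torch.export.save(trt_exp_program, "trt_model.ep")

# later: reload the ExportedProgram and call it as a module
loaded = torch.export.load("trt_model.ep").module()
out = loaded(*inputs)
```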

AvivSham commented 6 months ago

Hi @peri044,

I tried your suggestion (using `dynamo` as the `ir` mode) but ended up with the following error: `torch._dynamo.exc.UserError: Dynamic control flow is not supported at the moment. Please use functorch.experimental.control_flow.cond to explicitly capture the control flow.`

Doesn't `dynamo` compile act like `jit.script`? Shouldn't it handle conditional logic? What if I want it to act like `jit.trace`, where only the branch taken for the specific input is compiled?

Just a reminder: using `torch_compile` as the `ir` mode does compile the model, but produces an `OptimizedModule`.

If it helps, I can share my code; bumping the PyTorch version is also an option (if this issue is fixed/supported in later versions).

AvivSham commented 5 months ago

@gs-olive @peri044 Can you please help?

gs-olive commented 5 months ago

Hi - to my knowledge, Torch's Dynamo export tracer works slightly differently from `torch.jit.trace`: `torch.jit.trace` generally traces only the control-flow path taken for the specific input, as you mentioned, and effectively collapses the if/else statements. Dynamo export's tracer can sometimes try to trace the graph without specializing the conditional, which leads to errors like this one, where a specific tracer-friendly conditional is required for export to work.
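For a concrete illustration of that difference (a toy model, not from this issue):

```python
import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:  # data-dependent Python branch
            return x * 2
        return x - 1

# jit.trace evaluates the condition on the example input and records
# only that branch (emitting a TracerWarning about the conversion)
traced = torch.jit.trace(Gate(), torch.ones(3))
print(traced.code)  # the graph contains only the `x * 2` path

# torch.export / dynamo export would instead error on this branch,
# asking for an explicit control-flow op such as cond
```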

I believe there are tools to do this "conditional-flattening" in advance - maybe something like `torch.fx.experimental.proxy_tensor.make_fx`, or `torch.fx.symbolic_trace` with `concrete_args` specified. Then, once traced with the control flow removed, the resulting FX graph can be passed to `torch_tensorrt.compile`. @peri044 - do you know of any other methods for accomplishing this?
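A sketch of the `concrete_args` route; note this only specializes branches on non-tensor arguments, so it assumes the model's conditionals can be expressed that way:

```python
import torch
import torch.fx

class Branchy(torch.nn.Module):
    def forward(self, x, use_relu: bool):
        if use_relu:
            return torch.relu(x)
        return torch.tanh(x)

# pinning use_relu=True bakes that branch into the traced graph;
# the if/else disappears from the resulting GraphModule
gm = torch.fx.symbolic_trace(Branchy(), concrete_args={"use_relu": True})
print(gm.code)  # only the relu path remains
```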

Providing the source code/sample model would be helpful for us to debug further as well.

peri044 commented 3 months ago

@AvivSham If you use `torch.compile`, the control flow causes graph breaks. If you use `torch.export`, there are ways to get the control flow captured, e.g. https://pytorch.org/tutorials/intermediate/torch_export_tutorial.html#control-flow-ops. If you rewrite your control flow as per torch's recommendations, the export should work, and the resulting graphs can then be processed by Torch-TensorRT.
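Following that tutorial, the rewrite would look roughly like this (a toy sketch; the op lives under `functorch.experimental.control_flow` in the PyTorch versions discussed here, as the error message itself suggests):

```python
import torch
from functorch.experimental.control_flow import cond

class Branchy(torch.nn.Module):
    def forward(self, x):
        def true_fn(x):
            return x * 2

        def false_fn(x):
            return x - 1

        # cond captures BOTH branches in the exported graph, so the
        # data-dependent predicate no longer breaks export
        return cond(x.sum() > 0, true_fn, false_fn, [x])

ep = torch.export.export(Branchy(), (torch.randn(3, 4),))
print(ep.graph_module.code)
```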