pytorch/TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

🐛 [Bug] require_full_compilation=True has no effect #3246

Closed · braindevices closed this issue 1 month ago

braindevices commented 1 month ago

Bug Description

From the documentation: "require_full_compilation (bool): Require modules to be compiled end to end or return an error, as opposed to returning a hybrid graph where operations that cannot be run in TensorRT are run in PyTorch."

However, with require_full_compilation=True the compiler still generates a hybrid graph instead of raising an error.

To Reproduce

Steps to reproduce the behavior:

Run the following code; instead of erroring out, it still generates a hybrid graph (the final print shows submodules with no TRT engine):

import torch
import torch_tensorrt
from torch import nn
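# dummy_t's forward mutates its input in place and casts to uint8; after
# torch.export these lower to aten.copy_ and aten._to_copy, the two nodes
# reported below as being left to run in PyTorch.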
class dummy_t(nn.Module):
    def __init__(self) -> None:
        super().__init__()
    def forward(self, x: torch.Tensor):
        return x.clamp_(0, 1).mul_(255).to(dtype=torch.uint8)
xs = [torch.randn((1,3,5,7)).cuda()]
exported = torch.export.export(
    dummy_t().cuda(),
    args=tuple(xs)
)
exported.module()(*xs)
trt_fx = torch_tensorrt.dynamo.compile(
    exported,
    assume_dynamic_shape_support=False,
    inputs=tuple(xs),
    use_python_runtime=False,
    enabled_precisions={torch.float32},
    use_fast_partitioner=False,
    # debug=True,
    min_block_size=1,
    require_full_compilation=True
)

for i, m in enumerate(trt_fx.modules()):
    print(i, m, hasattr(m, "engine"))
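
To see exactly which operations stayed in PyTorch, the compiled result can be inspected as a regular torch.fx.GraphModule (an assumption based on iterating trt_fx.modules() above): TRT engines appear as submodules, while fallback aten ops remain as call_function nodes. A minimal sketch:

# sketch: list aten ops that still run in eager PyTorch
for node in trt_fx.graph.nodes:
    if node.op == "call_function":
        print("runs in Torch:", node.target)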

Expected behavior

Per the documentation it should raise an error; at minimum, it should warn:

The following nodes are currently set to run in Torch:
Node: torch.ops.aten._to_copy.default, with layer location: __/_to_copy
Node: torch.ops.aten.copy_.default, with layer location: copy__default
Note: Some of the above nodes may be supported, but were not included in a TRT graph by the partitioner
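
In the meantime, a user-side guard can approximate the documented behavior by failing when anything falls back to PyTorch. A hypothetical sketch reusing the hasattr(m, "engine") check from the repro, and again assuming trt_fx is a torch.fx.GraphModule:

def assert_fully_compiled(gm: torch.fx.GraphModule) -> None:
    # hypothetical helper: raise if the partitioner left aten ops in the
    # top-level graph or produced no TRT engine at all
    fallback = [n.target for n in gm.graph.nodes if n.op == "call_function"]
    has_engine = any(hasattr(m, "engine") for m in gm.modules())
    if fallback or not has_engine:
        raise RuntimeError(f"not fully compiled; ops still in Torch: {fallback}")

assert_fully_compiled(trt_fx)  # expected to raise for the repro above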

Environment

Build information about Torch-TensorRT can be found by turning on debug messages.
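
As the commented-out debug=True in the repro hints, these messages can be enabled directly in the compile call (a sketch with exported and xs as defined above, not the full option list):

trt_fx = torch_tensorrt.dynamo.compile(
    exported,
    inputs=tuple(xs),
    min_block_size=1,
    require_full_compilation=True,
    debug=True,  # prints build info and partitioning/fallback decisions
)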

narendasan commented 1 month ago

Fixed by #3193