pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

`aten.nonzero` #2516

Open peri044 opened 11 months ago

zewenli98 commented 11 months ago

This converter is blocked by a TRT bug.

Reason: the NonZero layer produces an output with a dynamic shape. In this case TRT has a bug when running `context = engine.create_execution_context()`: the returned context is None, and TRT logs `[TRT] [E] 1: Unexpected exception vector<bool>::_M_range_check: __n (which is 0) >= this->size() (which is 0)`.
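To make the "dynamic shape" point concrete, here is a small pure-Python sketch (no TensorRT involved) of why a nonzero operation has a data-dependent output shape even when the input shape is fixed. The helper names `nonzero_indices` and `shape_of` are hypothetical, and the `[rank, count]` layout is an assumption chosen to match the `[2, -1]` shape that shows up in the errors reported in this thread.

```python
def nonzero_indices(rows):
    """Return the indices of nonzero entries of a 2-D list, laid out as [rank, count]."""
    coords = [
        (i, j)
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value != 0
    ]
    # One row per input dimension, one column per nonzero element.
    return [[c[0] for c in coords], [c[1] for c in coords]]


def shape_of(indices):
    """Shape of the [rank, count] index table."""
    return (len(indices), len(indices[0]))


# Two inputs with the same static shape (2, 3) but different contents.
a = [[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]]
b = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0]]

print(shape_of(nonzero_indices(a)))  # second dimension depends on the data
print(shape_of(nonzero_indices(b)))
```

The first dimension is known at build time (the input rank), while the second is only known after the data has been inspected, which is why it can only be reported as -1 ahead of execution.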

Specifically, I call `network.add_non_zero`, whose output has a dynamic shape. A minimal reproduction:

import tensorrt as trt

EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

def build_engine():
    # Setup builder, network, config
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(EXPLICIT_BATCH)
    config = builder.create_builder_config()

    # Add inputs
    input = network.add_input("input", dtype=trt.float32, shape=(100, 64))
    layer = network.add_non_zero(input=input)

    # Mark the NonZero output tensor as the network output
    network.mark_output(layer.get_output(0))

    return builder.build_engine(network, config)

def main():
    # Build the TensorRT engine.
    engine = build_engine()
    context = engine.create_execution_context()
    print("context -------------->", context)  # the output is None

if __name__ == "__main__":
    main()

TRT team replied: It seems that the user must provide an optimization profile when using DDS (data-dependent shapes), even if all input shapes are static, as in the case above. Otherwise some runtime checks in claimProfile fail. I think this is a bug, as we should be able to create an implicit default optimization profile for such cases.
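Based on that reply, a possible workaround is to register an explicit optimization profile even though the input shape is static. The following is an untested sketch under that assumption, not a confirmed fix: the function name `build_context_with_profile` is hypothetical, the profile simply pins min, opt and max to the same static shape, and the engine is built through `build_serialized_network` rather than `build_engine`.

```python
def build_context_with_profile(shape=(100, 64)):
    # Imported lazily so this file can be loaded on machines without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    config = builder.create_builder_config()

    # Same network as the reproduction: one static input feeding a NonZero layer.
    inp = network.add_input("input", dtype=trt.float32, shape=shape)
    layer = network.add_non_zero(inp)
    network.mark_output(layer.get_output(0))

    # Assumption: an explicit profile with min == opt == max satisfies the
    # runtime check that fails when no profile is present.
    profile = builder.create_optimization_profile()
    profile.set_shape("input", shape, shape, shape)
    config.add_optimization_profile(profile)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")

    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(serialized)
    return engine.create_execution_context()
```

If the assumption holds, the returned context should no longer be None; whether it does on a given TRT version needs to be checked on a machine with TensorRT and a GPU.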

zewenli98 commented 11 months ago

TensorRT team replied: It's going to be fixed in TRT 10.

zewenli98 commented 7 months ago

In TRT 10, the INonZeroLayer output has a dynamic shape, i.e., its dimensions include -1, and the current Torch-TRT branch `trt_10` fails with this error:

  File "/home/zewenl/anaconda3/envs/trt-10-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/zewenl/Documents/pytorch/TensorRT/py/torch_tensorrt/dynamo/runtime/_PythonTorchTensorRTModule.py", line 258, in forward
    output = torch.empty(
RuntimeError: Trying to create tensor with negative dimension -1: [2, -1]
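
The error arises because a shape containing -1 is passed directly to an allocation call. As an illustration only (this is not how Torch-TRT handles it, and both helper names are hypothetical), a runtime could detect unresolved dimensions before allocating, and for a nonzero output fall back to an upper bound: the number of nonzero elements can never exceed the number of input elements.

```python
from math import prod


def needs_deferred_allocation(shape):
    """True if any dimension is still unknown (reported as a negative value)."""
    return any(dim < 0 for dim in shape)


def max_nonzero_output_shape(input_shape):
    """Upper bound on a nonzero output laid out as [rank, count]."""
    # At most every input element is nonzero, so count <= number of elements.
    return (len(input_shape), prod(input_shape))


reported = [2, -1]          # shape taken from the error message above
input_shape = (100, 64)     # input shape from the earlier reproduction, used here as an example

if needs_deferred_allocation(reported):
    bound = max_nonzero_output_shape(input_shape)
    print("allocate at most", bound, "and trim after execution")
```

Allocating to the upper bound trades memory for simplicity; the alternative is to let the runtime allocate the output once the real size is known.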

chohk88 commented 6 months ago

When implementing the converter for the nonzero operation with TRT 10.0.1, I hit the same error as @zewenli98. Once the issue is resolved in TRT, we can continue development on this branch (https://github.com/pytorch/TensorRT/tree/aten_nonzero_converter). The test output:

======================================================================
ERROR: test_atan_float_0 (__main__.TestAtanConverter)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/hoonkyungc/miniconda3/envs/torch-trt-10/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 2741, in wrapper
    method(*args, **kwargs)
  File "/home/hoonkyungc/miniconda3/envs/torch-trt-10/lib/python3.9/site-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
  File "/home/hoonkyungc/workspace/TorchTRT/TensorRT/tests/py/dynamo/conversion/test_nonzero_aten.py", line 24, in test_atan_float
    self.run_test(
  File "/home/hoonkyungc/workspace/TorchTRT/TensorRT/tests/py/dynamo/conversion/harness.py", line 261, in run_test
    super().run_test(
  File "/home/hoonkyungc/workspace/TorchTRT/TensorRT/tests/py/dynamo/conversion/harness.py", line 82, in run_test
    outputs = trt_mod(*cuda_inputs)
  File "/home/hoonkyungc/miniconda3/envs/torch-trt-10/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/hoonkyungc/miniconda3/envs/torch-trt-10/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hoonkyungc/workspace/TorchTRT/TensorRT/py/torch_tensorrt/dynamo/runtime/_PythonTorchTensorRTModule.py", line 214, in forward
    output = torch.empty(
RuntimeError: Trying to create tensor with negative dimension -1: [2, -1]