NVIDIA / TensorRT-Incubator

Experimental projects related to TensorRT

Casting a tensor created with iota to int64 throws exception #116

Open Mgluhovskoi opened 1 month ago

Mgluhovskoi commented 1 month ago
data = tp.iota((3, 3, 2))
data = tp.cast(data, dtype=tp.int64)
print(data)

Throws the following error:

Traceback (most recent call last):
  File "/tripy/debugging_gather.py", line 40, in <module>
    print(data)
  File "/tripy/tripy/frontend/tensor.py", line 228, in __repr__
    arr = self.data()
  File "/tripy/tripy/frontend/tensor.py", line 218, in data
    arr = self.eval()
  File "/tripy/tripy/frontend/tensor.py", line 200, in eval
    executable = compiler.compile(mlir, flat_ir=flat_ir)
  File "/tripy/tripy/utils/utils.py", line 73, in wrapper
    result = func(*args, **kwargs)
  File "/tripy/tripy/backend/mlir/compiler.py", line 107, in compile
    map_error_to_user_code_and_raise(flat_ir, exc, stderr.decode())
  File "/tripy/tripy/backend/mlir/utils.py", line 440, in map_error_to_user_code_and_raise
    raise_error(
  File "/tripy/tripy/common/exception.py", line 195, in raise_error
    raise TripyException(msg) from None
tripy.common.exception.TripyException: 

--> /tripy/debugging_gather.py:40 in <module>()
      |
   40 | print(data)
      | 

MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t3_1

Additional context:
Traceback (most recent call last):
  File "/tripy/tripy/backend/mlir/compiler.py", line 100, in compile
    executable = compiler.compiler_stablehlo_to_executable(
mlir_tensorrt.runtime._mlir_libs._api.MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t3_1
.
    Loaded TensorRT version 10.1.0.27 but compiled for TensorRT 10.2.0.19. This can result in crashes or unintended behavior.
    (t3)error: op: %2 = "stablehlo.convert"(%1) : (tensor<3x3x2xf32>) -> tensor<3x3x2xi64> from function main is invalid, post clustering.

    This error occured while trying to compile the following FlatIR expression:
          |
          | t3: [rank=(3), shape=((-1, -1, -1)), dtype=(int64), loc=(gpu:0)] = ConvertOp(t2)
          | 

    Note: This originated from the following expression:

    --> /tripy/debugging_gather.py:39 in <module>()
          |
       39 | data = tp.cast(data, dtype=tp.int64)
          |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Input 0:

    --> /tripy/tripy/frontend/trace/ops/iota.py:111 in iota()
          |
      111 |     return iota_impl(shape, dim, dtype, output_rank)
          | 

    --> /tripy/debugging_gather.py:37 in <module>()
          |
       37 | data = tp.iota((3, 3, 2))
          |        ^^^^^^^^^^^^^^^^^^ --- required from here

Here is the MLIR:

module @outs_t3_1 {
  func.func @main() -> tensor<?x?x?xi64> {
    %c = stablehlo.constant dense<[3, 3, 2]> : tensor<3xi32>
    %0 = stablehlo.dynamic_iota %c, dim = 0 : (tensor<3xi32>) -> tensor<?x?x?xf32>
    %1 = stablehlo.convert %0 : (tensor<?x?x?xf32>) -> tensor<?x?x?xi64>
    return %1 : tensor<?x?x?xi64>
  }
}
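The MLIR makes the intended semantics concrete: stablehlo.dynamic_iota fills along dim=0 and stablehlo.convert casts f32 to i64. For reference, the values the snippet should produce can be sketched with NumPy (an illustration of the expected result only, not tripy's implementation):

```python
import numpy as np

# Equivalent of tp.iota((3, 3, 2)): value i at index [i, j, k] (dim=0),
# computed in f32 and then converted to i64, mirroring the MLIR above.
iota_f32 = np.broadcast_to(
    np.arange(3, dtype=np.float32).reshape(3, 1, 1), (3, 3, 2)
)
expected = iota_f32.astype(np.int64)
```
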

From what I can tell, this seems to be an MLIR-TRT bug.

If you eval() the tensor before casting, it passes:

data = tp.iota((3, 3, 2))
data.eval()
data = tp.cast(data, dtype=tp.int64)
print(data)
christopherbate commented 1 month ago

The TensorRT dialect doesn't have coverage of i64 yet, and right now it's not a priority. If you need it, feel free to expand I64 support in the TensorRT operator ODS spec everywhere it is relevant.

Once that is done, we would need to update the StableHLO-to-TensorRT conversion. @shelkesagar29 this is another area where we need appropriate use of the target TensorRT version in the conversion pass: I64 should only be allowed as a target datatype for the TensorRT versions that support it.
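The version gating described above could look roughly like this. This is a hypothetical sketch, not the actual conversion-pass code; the function names and the version threshold are invented placeholders:

```python
# Hypothetical sketch of gating a conversion target dtype on the
# TensorRT version being targeted. MIN_I64_VERSION is a placeholder;
# the real threshold would come from TensorRT's release notes.
MIN_I64_VERSION = (10, 0, 0)  # placeholder, not an authoritative value


def i64_allowed(target_version: tuple) -> bool:
    """Return True if the targeted TensorRT version supports I64."""
    return tuple(target_version) >= MIN_I64_VERSION


def check_convert_dtype(dtype: str, target_version: tuple) -> str:
    """Reject i64 as a conversion target on versions that lack it."""
    if dtype == "i64" and not i64_allowed(target_version):
        raise ValueError(
            f"i64 is not a supported target dtype for TensorRT {target_version}"
        )
    return dtype
```
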