orioninthesky98 opened 1 month ago
any updates on this?
@orioninthesky98 I have tried the example on the current latest main and on our upcoming 2.5.0 release; both work as expected, so I believe the BatchNorm3d bug has been fixed.
As for the `use_fast_partitioner=False` bug (https://github.com/pytorch/TensorRT/issues/3157): a PR has been raised and will be merged into main and the 2.5.0 release.
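To confirm whether your install already includes the fix, a quick check (a sketch, assuming a standard pip/conda install that exposes the package version):

```python
import torch_tensorrt

# The BatchNorm3d fix is reported working on latest main and in the 2.5.0
# release, so anything older than 2.5.0 (and not built from recent main)
# may still be affected.
print(torch_tensorrt.__version__)
```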
Bug Description
I can't compile this model; the error seems to be caused by `nn.BatchNorm3d`.
To Reproduce
Steps to reproduce the behavior:
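The `conv_block` itself isn't shown here; a minimal stand-in consistent with the shapes in the log below ((128, 1, 1, 1, 32) in, (128, 16, 1, 1, 32) out) would be something like this sketch:

```python
import torch
import torch.nn as nn
import torch_tensorrt as trt

# Hypothetical stand-in for the unshown conv_block; the 1 -> 16 channel
# counts are taken from the (128, 1, 1, 1, 32) -> (128, 16, 1, 1, 32)
# shapes in the error log below.
conv_block = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1),
    nn.BatchNorm3d(16),
).eval().to("cuda")

batch_size = 128
network_input_shape = (1, 1, 1, 32)
```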
```python
placeholder_batch = torch.rand((batch_size,) + tuple(network_input_shape))
placeholder_batch = placeholder_batch.to("cuda")

compiled_model = trt.compile(
    conv_block,
    inputs=[placeholder_batch],
    enabled_precisions={torch.float32},
    optimization_level=5,  # max is 5, compilation takes longer but gives the best speedup
    debug=True,  # very verbose, only turn on if needed
    use_fast_partitioner=True,  # can't disable, results in error when exporting
    dynamic=False,
    disable_tf32=True,  # reduce precision errors at the expense of a small slowdown
)
```
```
DEBUG:torch_tensorrt.dynamo.conversion._TRTInterpreter:Converting node /cc2_norm/_native_batch_norm_legit_no_training_1 (kind: aten._native_batch_norm_legit_no_training.default, args: ('[CONVOLUTION]-[aten_ops.convolution.default]-[/cc2_conv/convolution_1]_output <tensorrt.ITensor [shape=(81), dtype=DataType.FLOAT]>', '<torch.Tensor as np.ndarray [shape=(16,), dtype=float32]>', '<torch.Tensor as np.ndarray [shape=(16,), dtype=float32]>', '<torch.Tensor as np.ndarray [shape=(16,), dtype=float32]>', '<torch.Tensor as np.ndarray [shape=(16,), dtype=float32]>', 0.1, 1e-05))
ERROR:torch_tensorrt [TensorRT Conversion Context]:ITensor::getDimensions: Error Code 4: Internal Error (Output shape can not be computed for node [CONVOLUTION]-[aten_ops.convolution.default]-[/cc2_conv/convolution_1].)

File ~/.conda/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/aten_ops_converters.py:111, in aten_ops_batch_norm_legit_no_training(ctx, target, args, kwargs, name)
     99 @dynamo_tensorrt_converter(
    100     torch.ops.aten._native_batch_norm_legit_no_training.default,
    101     capability_validator=one_user_validator,
    (...)
    109     name: str,
    110 ) -> Union[TRTTensor, Sequence[TRTTensor]]:
--> 111     return impl.normalization.batch_norm(
    112         ctx,
    113         target,
    114         SourceIR.ATEN,
    115         name,
    116         input=args[0],
    117         weight=args[1],
    118         bias=args[2],
    119         running_mean=args[3],
    120         running_var=args[4],
    121         training=False,
    122         momentum=args[5],
    123         eps=args[6],
    124         cudnn_enabled=False,
    125         return_mean_rstd=(
    126             target == torch.ops.aten._native_batch_norm_legit_no_training.default
    127         ),
    128     )

File ~/.conda/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/impl/normalization/ops.py:65, in batch_norm(ctx, target, source_ir, name, input, weight, bias, running_mean, running_var, training, momentum, eps, cudnn_enabled, return_mean_rstd)
     63 # For BatchNorm1d, reshape 1d to 2d
     64 output_shape = input.shape
---> 65 if len(input.shape) < 4:
     66     assert (
     67         len(get_dynamic_dims(input.shape)) <= 1
     68     ), "BatchNorm1D with more than one dynamic dims is not currently supported."
     69     new_shape = (
     70         (input.shape[0], input.shape[1], 1, 1)
     71         if len(input.shape) == 2
     72         else (input.shape[0], input.shape[1], input.shape[2], 1)
     73     )

ValueError: len() should return >= 0

While executing %_native_batch_norm_legit_no_training_1 : [num_users=1] = call_function[target=torch.ops.aten._native_batch_norm_legit_no_training.default](args = (%convolution_1, %cc2_norm_weight, %cc2_norm_bias, %cc2_norm_running_mean, %cc2_norm_running_var, 0.1, 1e-05), kwargs = {_itensor_to_tensor_meta: {
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e80c6b0>: ((128, 1, 1, 1, 32), torch.float32, False, (32, 32, 32, 32, 1), torch.contiguous_format, False, {}),
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e8175f0>: ((128, 1, 1, 1, 32), torch.float32, False, (32, 32, 32, 32, 1), torch.contiguous_format, False, {}),
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e528eb0>: ((128, 16, 1, 1, 32), torch.float32, False, (512, 32, 32, 32, 1), torch.contiguous_format, False, {}),
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e938c70>: ((128, 16, 1, 1, 32), torch.float32, False, (512, 32, 32, 32, 1), torch.contiguous_format, False, {}),
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e7ce770>: ((128, 16, 1, 1, 32), torch.float32, False, (512, 32, 32, 32, 1), torch.contiguous_format, False, {}),
    <tensorrt_bindings.tensorrt.ITensor object at 0x7f982e96ef30>: ((128, 16, 1, 1, 32), torch.float32, False, (512, 32, 32, 32, 1), torch.contiguous_format, False, {})
}})

Original traceback:
  from loguru import logger
  return forward_call(*args, **kwargs)
  h = self.norm(h)
```
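The ValueError itself looks like a side effect of the earlier shape-inference failure: once TensorRT cannot compute the convolution's output shape, getDimensions reports a negative rank, and calling `len()` on a shape object whose `__len__` returns a negative number is exactly what Python rejects. A minimal illustration of that mechanism (my reading of the log, not confirmed against the TensorRT source):

```python
# Sketch of the suspected failure mode: a dims object whose __len__
# returns -1, as when an output shape cannot be computed.
class BrokenDims:
    def __len__(self):
        return -1

len(BrokenDims())  # raises ValueError: __len__() should return >= 0
```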
Full log: https://gist.github.com/orioninthesky98/96612bfd59e35344182de44d9a303aa7
Related bug: if I set `use_fast_partitioner=False`, the model actually compiles fine, but I get this error at the very end and the script crashes: https://gist.github.com/orioninthesky98/a784c361ebbdfa9000564b3f8a1ac1c0. Somebody already filed this bug: https://github.com/pytorch/TensorRT/issues/3157
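A possible workaround while on an affected version (a sketch, untested against this exact model): fold each BatchNorm3d into its preceding Conv3d in eval mode, so the exported graph contains no batch-norm node for the converter to trip over:

```python
import torch
from torch.nn.utils.fusion import fuse_conv_bn_eval

# Hypothetical Conv3d + BatchNorm3d pair mirroring the repro above.
conv = torch.nn.Conv3d(1, 16, kernel_size=3, padding=1).eval()
bn = torch.nn.BatchNorm3d(16).eval()

# Fold the BN statistics into the conv weights; numerically equivalent
# in eval mode, so the compiled graph only contains the convolution.
fused_conv = fuse_conv_bn_eval(conv, bn)

x = torch.rand(2, 1, 1, 1, 32)
assert torch.allclose(torch.nn.Sequential(conv, bn)(x), fused_conv(x), atol=1e-5)
```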