BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF #5
#################### Running RotateTestCase ####################
test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... /home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:15: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
center[0] -= center[0].new_tensor(ow * 0.5)
/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py:16: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
center[1] -= center[1].new_tensor(oh * 0.5)
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
FAIL
test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
Warning: Unsupported operator RotateTRT. No schema registered for this operator.
ERROR
test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... ERROR
test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase) ... ERROR
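======================================================================
ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last):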
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 85, in pth2trt
engine = build_engine(f, fp16=fp16)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 67, in build_engine
engine = runtime.deserialize_cuda_engine(plan)
TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
Invoked with: <tensorrt.tensorrt.Runtime object at 0x7f527eff8df0>, None
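======================================================================
ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last):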
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward
output = self.module(*inputs, **self.kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate
return _rotate(img, angle, center, _MODE[interpolation])
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward
img = torch.grid_sampler(img, grid, interpolation, 0, False)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
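======================================================================
ERROR: test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last):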
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward
output = self.module(*inputs, **self.kwargs)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate
return _rotate(img, angle, center, _MODE[interpolation])
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward
img = torch.grid_sampler(img, grid, interpolation, 0, False)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
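======================================================================
ERROR: test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last):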
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in <genexpr>
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
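======================================================================
ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last):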
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 129, in setUp
BaseTestCase.__init__(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 33, in __init__
self.createInputs()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in createInputs
self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()}
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in <dictcomp>
self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()}
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
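======================================================================
ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last):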
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in <genexpr>
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
self.buildEngine(opset_version=13)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
engine = pth2trt(
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in <genexpr>
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
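The allocator hint about max_split_size_mb in the messages above only applies when reserved memory is much larger than allocated memory. As a rough way to check that, here is a small sketch that pulls the figures out of one of the messages. The helper name parse_oom is hypothetical (it is not a PyTorch API), and the regular expressions assume the message wording shown in this log.

```python
import re

# A size such as "8.90 GiB" or "509.19 MiB".
_SIZE = r"([\d.]+) (MiB|GiB)"
_UNITS_IN_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message: str) -> dict:
    """Extract the memory figures (in MiB) from a PyTorch CUDA OOM message."""
    patterns = {
        "requested": rf"Tried to allocate {_SIZE}",
        "total": rf"{_SIZE} total capacity",
        "allocated": rf"{_SIZE} already allocated",
        "free": rf"{_SIZE} free",
        "reserved": rf"{_SIZE} reserved in total",
    }
    figures = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        figures[name] = float(match.group(1)) * _UNITS_IN_MIB[match.group(2)]
    # Memory cached by the allocator but not in use; a large value would
    # point at fragmentation, which is the case the hint is written for.
    figures["reserved_unused"] = figures["reserved"] - figures["allocated"]
    return figures


message = (
    "CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total "
    "capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved "
    "in total by PyTorch)"
)
figures = parse_oom(message)
print("reserved but unused (MiB):", figures["reserved_unused"])
print("request exceeds free memory:", figures["requested"] > figures["free"])
```

In every message in this log the reserved and allocated figures are equal (8.02 GiB or 8.90 GiB), so fragmentation does not look like the cause here. The memory appears to be held by live allocations, possibly left over from earlier test cases in the same process.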
Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 83, in test_fp32
self.fp32_case()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 72, in fp32_case
self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05
Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 157, in test_fp32
self.fp32_case()
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 146, in fp32_case
self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05
Traceback (most recent call last):
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 91, in test_fp16_bilinear
self.fp16_case(0.01)
File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 82, in fp16_case
self.assertLessEqual(cost, delta)
AssertionError: 0.258 not less than or equal to 0.01
#################### Running RotateTestCase2 ####################
test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2) ... ERROR
====================================================================== ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last): File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp self.buildEngine(opset_version=13) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine engine = pth2trt( File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 85, in pth2trt engine = build_engine(f, fp16=fp16) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 67, in build_engine engine = runtime.deserialize_cuda_engine(plan) TypeError: deserialize_cuda_engine(): incompatible function arguments. The following argument types are supported:
Invoked with: <tensorrt.tensorrt.Runtime object at 0x7f527eff8df0>, None
====================================================================== ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last): File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp self.buildEngine(opset_version=13) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine engine = pth2trt( File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt torch.onnx.export( File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export return utils.export(model, args, f, export_params, verbose, training, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export _export(model, args, f, export_params, verbose, training, input_names, output_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export _model_to_graph(model, args, verbose, input_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph graph, params, torch_out, module = _create_jit_graph(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph graph, torch_out = _trace_and_get_graph_from_model(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(*input, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward graph, out = torch._C._create_graph_by_tracing( File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper 
outs.append(self.inner(trace_inputs)) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(input, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward result = self.forward(*input, kwargs) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward output = self.module(*inputs, **self.kwargs) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate return _rotate(img, angle, center, _MODE[interpolation]) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward img = torch.grid_sampler(img, grid, interpolation, 0, False) RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
====================================================================== ERROR: test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last): File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 57, in setUp self.buildEngine(opset_version=13) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine engine = pth2trt( File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt torch.onnx.export( File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export return utils.export(model, args, f, export_params, verbose, training, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export _export(model, args, f, export_params, verbose, training, input_names, output_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export _model_to_graph(model, args, verbose, input_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph graph, params, torch_out, module = _create_jit_graph(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph graph, torch_out = _trace_and_get_graph_from_model(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(*input, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward graph, out = torch._C._create_graph_by_tracing( File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper 
outs.append(self.inner(trace_inputs)) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(input, kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1098, in _slow_forward result = self.forward(*input, kwargs) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 32, in forward output = self.module(*inputs, **self.kwargs) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 116, in rotate return _rotate(img, angle, center, _MODE[interpolation]) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/functions/rotate.py", line 66, in forward img = torch.grid_sampler(img, grid, interpolation, 0, False) RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.02 GiB already allocated; 1.38 GiB free; 8.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
====================================================================== ERROR: test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last): File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp self.buildEngine(opset_version=13) File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 87, in buildEngine engine = pth2trt( File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt torch.onnx.export( File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 305, in export return utils.export(model, args, f, export_params, verbose, training, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export _export(model, args, f, export_params, verbose, training, input_names, output_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export _model_to_graph(model, args, verbose, input_names, File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph graph, params, torch_out, module = _create_jit_graph(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph graph, torch_out = _trace_and_get_graph_from_model(model, args) File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, *kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(input, **kwargs) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward graph, out = torch._C._create_graph_by_tracing( File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in 
wrapper tuple(x.clone(memory_format=torch.preserve_format) for x in args) File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in
tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
====================================================================== ERROR: test_fp16_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last): File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 129, in setUp BaseTestCase.init( File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 33, in init self.createInputs() File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in createInputs self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()} File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 64, in
self.inputs_pth_fp16 = {key: val.half() for key, val in inputs_pth.items()}
RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
======================================================================
ERROR: test_fp32_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last):
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
    self.buildEngine(opset_version=13)
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
    engine = pth2trt(
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
    torch.onnx.export(
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in <genexpr>
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
======================================================================
ERROR: test_fp32_nearest (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase2)
Traceback (most recent call last):
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 142, in setUp
    self.buildEngine(opset_version=13)
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/base_test_case.py", line 95, in buildEngine
    engine = pth2trt(
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/utils.py", line 75, in pth2trt
    torch.onnx.export(
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/__init__.py", line 305, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 118, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 719, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 499, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 440, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 391, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in wrapper
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
  File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 114, in <genexpr>
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; 509.19 MiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
======================================================================
FAIL: test_fp32 (det2trt.models.utils.test_trt_ops.test_modulated_deformable_conv2d.ModulatedDeformableConv2dTestCase)
Traceback (most recent call last):
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 83, in test_fp32
    self.fp32_case()
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 72, in fp32_case
    self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05
======================================================================
FAIL: test_fp32 (det2trt.models.utils.test_trt_ops.test_modulated_deformable_conv2d.ModulatedDeformableConv2dTestCase2)
Traceback (most recent call last):
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 157, in test_fp32
    self.fp32_case()
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_modulated_deformable_conv2d.py", line 146, in fp32_case
    self.assertLessEqual(cost, delta)
AssertionError: 0.0036153677 not less than or equal to 1e-05
======================================================================
FAIL: test_fp16_bilinear (det2trt.models.utils.test_trt_ops.test_rotate.RotateTestCase)
Traceback (most recent call last):
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 91, in test_fp16_bilinear
    self.fp16_case(0.01)
  File "/home/wyh/BEVFormer_tensorrt/./det2trt/models/utils/test_trt_ops/test_rotate.py", line 82, in fp16_case
    self.assertLessEqual(cost, delta)
AssertionError: 0.258 not less than or equal to 0.01
Ran 136 tests in 381.644s
FAILED (failures=3, errors=7)
Running RotateTestCase fails with an error saying there is not enough GPU memory.
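For anyone triaging the log above, it may help to pull the numbers out of the OOM message and compare them. The following is only a rough sketch using the Python standard library. The helper names `parse_cuda_oom` and `fragmentation_suspected`, and the 1.5 ratio, are made up for illustration and are not part of this repo or of PyTorch.

```python
import re

# Conversion factors to MiB for the units that appear in the message.
_UNITS = {"KiB": 1.0 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_cuda_oom(message):
    """Extract the sizes (in MiB) from a PyTorch 'CUDA out of memory' message."""
    patterns = {
        "requested": r"Tried to allocate " + _SIZE,
        "total": _SIZE + r" total capacity",
        "allocated": _SIZE + r" already allocated",
        "free": _SIZE + r" free",
        "reserved": _SIZE + r" reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            value, unit = match.groups()
            sizes[name] = float(value) * _UNITS[unit]
    return sizes


def fragmentation_suspected(sizes, ratio=1.5):
    # The error text only suggests max_split_size_mb when reserved >> allocated.
    # The 1.5 threshold is an arbitrary illustrative choice, not a PyTorch rule.
    return sizes["reserved"] > ratio * sizes["allocated"]


# Message copied from the log above (test_fp32_bilinear, RotateTestCase2).
msg = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB "
    "(GPU 0; 11.77 GiB total capacity; 8.90 GiB already allocated; "
    "509.19 MiB free; 8.90 GiB reserved in total by PyTorch)"
)
sizes = parse_cuda_oom(msg)
print(sizes)
print("request exceeds free memory:", sizes["requested"] > sizes["free"])
print("fragmentation suspected:", fragmentation_suspected(sizes))
```

In this log the reserved and allocated figures are both 8.90 GiB, so the `max_split_size_mb` hint in the message probably does not apply here. The 8.90 GiB already allocated when `setUp` runs suggests that tensors from earlier test cases in the same process are still alive, though the log alone does not confirm this.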