[Open] WYYAHYT opened this issue 1 year ago
More detailed results from onnx.checker.check_model:
The model is invalid: No Op registered for RotateTRT with domain_version of 13
==> Context: Bad node spec for node. Name: OpType: RotateTRT
and
The model is invalid: No Op registered for ModulatedDeformableConv2dTRT with domain_version of 13
==> Context: Bad node spec for node. Name: OpType: ModulatedDeformableConv2dTRT
This seems to be an ONNX operator error; should I add support for these operators?
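These check_model failures are expected for custom TensorRT plugin ops: the checker validates each node's op_type against the schema registry for its domain, and plugin ops like RotateTRT are not registered in the standard ONNX domain. A minimal pure-Python sketch of that lookup (the registry and helper below are illustrative stand-ins, not the real ONNX API):

```python
# Sketch of why onnx.checker.check_model rejects custom plugin ops: it looks
# up each node's (domain, op_type) in the schema registry; TensorRT plugin
# ops such as RotateTRT are not in the standard ONNX domain, so validation
# fails by design even though TensorRT itself can run the plugin.

REGISTERED_OPS = {  # tiny stand-in for the ONNX opset-13 schema registry
    ("", "Conv"), ("", "Relu"), ("", "Add"),
}

def check_nodes(nodes):
    """Hypothetical helper: raise for any node whose op is not registered."""
    for domain, op_type in nodes:
        if (domain, op_type) not in REGISTERED_OPS:
            raise ValueError(
                f"No Op registered for {op_type} with domain_version of 13"
            )

try:
    check_nodes([("", "Conv"), ("", "RotateTRT")])
except ValueError as e:
    print(e)  # No Op registered for RotateTRT with domain_version of 13
```

So the check_model error alone does not mean the export is broken; tools like onnxsim that run the checker will reject any graph containing TRT plugin nodes.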
A segmentation fault is typically caused by a program trying to read from or write to an illegal memory location, that is, a part of memory the program is not supposed to access. Please check your memory usage while running onnx2trt. This issue has nothing to do with
We are having the same issue, have you found a solution @WYYAHYT ?
I'm also having the same issue as @WYYAHYT and @serwansj. I got the following error when trying to optimize the ONNX graph produced with the custom plugins using the onnxsim tool:
onnx.onnx_cpp2py_export.checker.ValidationError: No Op registered for RotateTRT with domain_version of 13
==> Context: Bad node spec for node. Name: RotateTRT_263 OpType: RotateTRT
Have you found a solution for this?
Sorry to say that I don't have enough time to work on this problem; I really hope you guys can solve it. @serwansj @matcosta23
A similar problem (segmentation fault) occurred when running sh samples/bevformer/plugin/small/onnx2trt_fp16_2.sh -d 0. Any help is appreciated!
My problem was caused by using TensorRT 8.4.1 instead of the recommended TensorRT 8.5. In NVIDIA's official TensorRT release notes, you can find that 8.4.1 has a known issue that was fixed in TensorRT 8.4.3: "When parsing networks with ONNX operand expand on scalar input, TensorRT would error out. This issue has been fixed in this release."
I believe that with TensorRT 8.5.1.7, as recommended by the author of this repo, the command would run normally.
However, as I am currently using a Jetson Orin, which only has a TensorRT 8.5.2 package (which does not have this "Expand operator" error but has other errors), I cannot try 8.5.1.7.
Anyway, I believe this error is caused by an incompatible TensorRT version. You can give it a try.
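To make the version requirement from the comment above explicit, here is a quick sketch of checking the runtime TensorRT version against the first fixed release (8.4.3). The helper names are hypothetical; in practice the version string would come from tensorrt.__version__:

```python
# Hedged sketch: guard against the TensorRT range affected by the
# Expand-on-scalar parser bug (fixed in 8.4.3 per the release notes).

def version_tuple(v):
    """Parse the first three components of a dotted version string."""
    return tuple(int(x) for x in v.split(".")[:3])

def has_expand_fix(trt_version, minimum="8.4.3"):
    """Return True if trt_version includes the Expand-on-scalar fix."""
    return version_tuple(trt_version) >= version_tuple(minimum)

print(has_expand_fix("8.4.1"))    # False: affected by the parser bug
print(has_expand_fix("8.5.1.7"))  # True: the version recommended by the repo
```

A guard like this at the top of onnx2trt could fail fast with a readable message instead of a segmentation fault deep inside the parser.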
The same issue as @WYYAHYT, @serwansj, @matcosta23, and @sun-lingyu. I'm using a Drive Orin Devkit with TensorRT 8.4.11, while with 8.5.3 on an x86 machine this error goes away. I believe it's caused by an incompatible version; however, I cannot update the packages for now. Any suggestions for working around this problem?
Just as @sun-lingyu said, it's caused by this issue: "When parsing networks with ONNX operand expand on scalar input, TensorRT would error out. This issue has been fixed in this release."
To be precise, it's caused by this line https://github.com/DerryHub/BEVFormer_tensorrt/blob/303d3140c14016047c07f9db73312af364f0dd7c/det2trt/models/modules/transformer.py#L313
So I modified the code to its TRT counterpart as follows:
feat_flatten = []
spatial_shapes = []
for lvl, feat in enumerate(mlvl_feats):
    bs, num_cam, c, h, w = feat.shape
    spatial_shape = (h, w)
    feat = feat.flatten(3).permute(1, 0, 3, 2)
    if self.use_cams_embeds:
        feat = feat + self.cams_embeds[:, None, None, :].to(feat.dtype)
    feat = feat + self.level_embeds[None, None, lvl : lvl + 1, :].to(feat.dtype)
    spatial_shapes.append(spatial_shape)
    feat_flatten.append(feat)
spatial_shapes = torch.as_tensor(
    spatial_shapes, dtype=torch.long, device=bev_pos.device
)
level_start_index = torch.cat(
    (spatial_shapes.new_zeros((1,)), spatial_shapes.prod(1).cumsum(0)[:-1])
)
for i in range(len(feat_flatten)):
    feat_flatten[i] = feat_flatten[i].permute(0, 2, 1, 3)
feat_flatten = torch.stack(feat_flatten)
bev_embed = self.encoder.forward_trt(
    bev_queries,
    feat_flatten,
    feat_flatten,
    lidar2img=lidar2img,
    bev_h=bev_h,
    bev_w=bev_w,
    bev_pos=bev_pos,
    spatial_shapes=spatial_shapes,
    level_start_index=level_start_index,
    prev_bev=prev_bev,
    shift=shift,
    image_shape=image_shape,
    use_prev_bev=use_prev_bev,
)
return bev_embed
It works for me. FYI @serwansj @matcosta23 @WYYAHYT
Command:
python tools/bevformer/onnx2trt.py configs/bevformer/plugin/bevformer_tiny_trt_p.py checkpoints/onnx/bevformer_tiny_epoch_24_cp.onnx
Error (caught by faulthandler):
And
File "/data/projects/bevformer_tensorrt/BEVFormer_tensorrt/./det2trt/convert/onnx2tensorrt.py", line 38 in build_engine
is the code that parses the ONNX model. I then regenerated the ONNX model (successfully) and tried converting it to TRT, but it failed again. Maybe the ONNX model is not correct? So I checked it with
onnx.checker.check_model(onnx_model)
and it is indeed incorrect. But no errors (only warnings) occurred during pth2onnx with nv_half or nv_half2. The warnings below appeared with this command:
python tools/pth2onnx.py configs/bevformer/plugin/bevformer_tiny_trt_p.py checkpoints/pytorch/bevformer_tiny_epoch_24.pth --opset_version 13 --cuda --flag cp
And here is the ONNX model generated.
Really hope someone could help!
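For anyone reproducing this, the faulthandler traceback mentioned above can be captured with only the standard library; a minimal sketch:

```python
import faulthandler

# Enabling faulthandler makes the interpreter dump the current Python
# traceback on fatal signals such as SIGSEGV, which is how a frame like
# "line 38 in build_engine" can be recovered from a crash inside the
# native TensorRT parser.
faulthandler.enable()
print(faulthandler.is_enabled())  # True
```

Alternatively, running the script as python -X faulthandler tools/bevformer/onnx2trt.py ... enables it without any code changes.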