DerryHub / BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Apache License 2.0

Has anyone tried to deploy this great repo on Jetson Orin? Unsupported operator Gridsampler2DTRT #57

Open sainttelant opened 1 year ago

sainttelant commented 1 year ago

First of all, thanks for this great work. I deployed this project on an x86 Ubuntu system successfully, but I ran into many problems when trying to deploy it on a Jetson Orin. For example, running test_trt_ops.py reports Unsupported operator Gridsampler2DTRT, even though I installed onnx 1.12.0 and torch 1.12.0 on the Orin. I noticed that someone else hit a similar problem, and it was solved by installing onnx 1.12 and torch 1.12 together. The difference in my case is that I installed TensorRT 8.5.2.2 on the Orin!

I don't know whether the difference in TensorRT version could cause this issue. By the way, I also ran into a nan_to_num plugin issue. Thanks a lot if you can help me.

DerryHub commented 1 year ago

It looks like the custom plugin dynamic link library is not loaded properly. Are you sure you loaded BEVFormer_tensorrt/TensorRT/build/libtensorrt_ops.so?
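For reference, loading the plugin library from Python typically looks like the minimal sketch below. The path is the repo's default build output; adjust it for your setup. The library has to be loaded before the engine is built or deserialized, otherwise TensorRT cannot resolve ops such as Gridsampler2DTRT.

```python
import ctypes

import tensorrt as trt

# Path assumes the default CMake build directory from this repo;
# adjust if you built libtensorrt_ops.so somewhere else.
PLUGIN_LIB = "BEVFormer_tensorrt/TensorRT/build/libtensorrt_ops.so"

# Loading the shared library lets its plugin creators self-register
# with the TensorRT plugin registry.
ctypes.CDLL(PLUGIN_LIB)

# Also register TensorRT's built-in plugins.
logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")
```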

sainttelant commented 1 year ago

@DerryHub thanks for your reply. I am sure I already loaded *ops.so successfully, judging by the logger's output. By the way, I have since resolved the issue: I substituted np.nan_to_num for the original torch nan_to_num function, and the model then converted from ONNX to TRT successfully. Thanks a lot anyway.
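For anyone hitting the same export failure, the substitution described above amounts to something like the sketch below (the helper name is mine, not from the repo). One caveat: under torch.onnx.export tracing, a NumPy detour gets constant-folded into the graph, which may be related to the accuracy problems discussed further down.

```python
import numpy as np
import torch

def nan_to_num_np(x: torch.Tensor, nan: float = 0.0) -> torch.Tensor:
    # Detour through NumPy so the ONNX exporter never sees the
    # unsupported nan_to_num op. This breaks the autograd graph and,
    # under tracing, bakes the traced values into the graph as constants.
    arr = np.nan_to_num(x.detach().cpu().numpy(), nan=nan)
    return torch.from_numpy(arr).to(device=x.device, dtype=x.dtype)
```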

Alex-fishred commented 8 months ago

> nan_to_num

Hello, which part did you modify to solve the problem? I tried modifying the encoder in det2trt; the TRT engine converts successfully, but the metric results are incorrect.

sainttelant commented 8 months ago

@Alex-fishred It has been a long time since I hit this issue, so I don't remember exactly where I modified the code, but I have two options for you (see the sketch below):

1. Write the declaration of the nan_to_num plugin in the .hpp and its definition in the .cpp, and register it in the .so properly.
2. Substitute the numpy version of nan_to_num for the torch version. But this will decrease inference accuracy, and you will have to figure out a way to compensate for that.
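If you go with option 1, one way to check that the plugin actually registered is to list the plugin creators after loading the library. This is a minimal sketch, assuming the library path mentioned earlier in the thread:

```python
import ctypes

import tensorrt as trt

# Hypothetical path; point this at your own build of the plugin library.
ctypes.CDLL("BEVFormer_tensorrt/TensorRT/build/libtensorrt_ops.so")

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

# Every registered creator should show up here, including the custom
# nan_to_num plugin if option 1 was done correctly.
for creator in trt.get_plugin_registry().plugin_creator_list:
    print(creator.name, creator.plugin_version)
```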

Alex-fishred commented 8 months ago

Thank you very much for your reply. I have tried np's nan_to_num, and inference does run successfully, but the accuracy drops to 0. Do you have any suggestions on how to recover the accuracy? I probably won't attempt option 1 because I don't know C++ very well.

sainttelant commented 8 months ago

> I have tried np's nan_to_num, and it can indeed be successfully inferred, but the accuracy drops to 0. Do you have any suggestions on how to recover the accuracy?

I also have no idea how to increase the accuracy, sorry about that. It would probably help to do post-NMS, or to filter the results by tuning the confidence threshold. Have you made any progress on this issue?
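Confidence filtering of the decoded detections could look like this minimal NumPy sketch; the function name and the threshold value are hypothetical, and the threshold needs tuning per model:

```python
import numpy as np

def filter_by_confidence(boxes: np.ndarray, scores: np.ndarray,
                         thr: float = 0.3):
    # Drop low-confidence detections before evaluation; raising `thr`
    # trades recall for precision.
    keep = scores > thr
    return boxes[keep], scores[keep]
```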

vietnguyen012 commented 4 months ago

Hello @sainttelant, what do you mean by installing onnx 1.12 and torch 1.12 synchronously? I have the same unsupported-operator problem even though I built the custom plugins and mmdeploy.

sainttelant commented 4 months ago

@vietnguyen012 You have to install the corresponding versions, onnx 1.12 and torch 1.12, on your Jetson device; otherwise you will run into many incompatible ops.
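A quick way to confirm what is actually installed on the Jetson is to print the versions from Python (torch 1.12.0 and onnx 1.12.0 are the versions reported to work earlier in this thread):

```python
import onnx
import tensorrt as trt
import torch

# Sanity check: these should match the versions this thread reports
# working on Jetson (torch 1.12.0 / onnx 1.12.0).
print("torch   :", torch.__version__)
print("onnx    :", onnx.__version__)
print("tensorrt:", trt.__version__)
```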

vietnguyen012 commented 4 months ago

@sainttelant Have you converted successfully with custom plugins? Can you share your tensorrt, onnx, and torch versions on the Jetson Orin, and the Orin version, too? :) Thank you very much!

sainttelant commented 3 months ago

@vietnguyen012 Actually, I wrote my own custom plugins, and I am sure I registered them; the model converted successfully. However, TensorRT still could not find my plugins when the corresponding functions were executed, which is weird. In the end, I replaced the nan_to_num op with the numpy function, and inference finally ran, but the accuracy decreased too much. I will share my docker image here so you can refer to it: https://hub.docker.com/repository/docker/sainttelant/bevformer_xw_tensorrt/general