isarsoft / yolov4-triton-tensorrt

This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
http://www.isarsoft.com

./rtSafe/safeContext.cpp (133) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR) #29

Closed Darshcg closed 3 years ago

Darshcg commented 3 years ago

Hi @isarsoft,

I have converted my PyTorch model to ONNX and then to TensorRT (with custom plugins). I am also using another TensorRT model (TRT 2) whose output is fed as input to the above TRT (1) model. I am serving both TensorRT models with Triton Server on a Jetson Nano and sending requests from my laptop, but while the response is being produced I get the following error in the Jetson Nano terminal:

E0311 13:59:57.029723 29688 logging.cc:43] …/rtSafe/safeContext.cpp (133) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)
E0311 13:59:57.030261 29688 logging.cc:43] FAILED_EXECUTION: std::exception

I am not sure what is going wrong or what is causing the issue here. Can you please assist me in resolving this error?

Command I used on the Jetson Nano to start serving the two models:

LD_PRELOAD="Einsum_op.so RoI_Align.so libyolo_layer.so" ./Downloads/bin/tritonserver --model-repository=./models --min-supported-compute-capability=5.3 --log-verbose=1
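
For reference, one quick sanity check from the laptop side is Triton's v2 HTTP readiness endpoints, which confirm that the server and both models actually loaded before any inference request is sent. A minimal sketch, assuming port 8000 is reachable; the host name and model names below are placeholders:

```bash
# Replace the host and model names with the actual values from the model repository.
curl -s -o /dev/null -w "%{http_code}\n" http://jetson-nano.local:8000/v2/health/ready
curl -s -o /dev/null -w "%{http_code}\n" http://jetson-nano.local:8000/v2/models/yolov3_spp/ready
curl -s -o /dev/null -w "%{http_code}\n" http://jetson-nano.local:8000/v2/models/slowfast/ready
# 200 on all three means both models loaded; anything else points at a load-time problem
# rather than the per-request cuDNN error shown above.
```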

Model Info:

  1. Yolov3-spp-ultralytics.
  2. SlowFast (with two custom plugins). The bounding boxes output by the 1st model go as input to the 2nd model (SlowFast).

Looking forward to your reply.

Thanks, Darshan

philipp-schmidt commented 3 years ago

Hi, this looks like a very hard error to track down without more context. Obviously you should start by checking whether cuDNN is working correctly (with JetPack?). And make sure all your plugins link against the same versions of TensorRT and cuDNN as well (using ldd).
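
A minimal sketch of that ldd check, assuming the plugin .so files and the tritonserver binary are at the paths used in the command above:

```bash
# Show which TensorRT / cuDNN / CUDA runtime libraries each plugin resolves to.
for lib in Einsum_op.so RoI_Align.so libyolo_layer.so; do
    echo "== $lib =="
    ldd "$lib" | grep -E 'nvinfer|cudnn|cudart'
done

# Compare against the libraries tritonserver itself links to; the versions should match.
ldd ./Downloads/bin/tritonserver | grep -E 'nvinfer|cudnn|cudart'
```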

Darshcg commented 3 years ago

Hi @philipp-schmidt, thanks for your reply.

I tried running the models outside Triton Server with plain TensorRT and they work fine. Also, for the first frame of the video it produces an output, but for the next frame it throws the error. Not sure what is going wrong.

Model Info:

  1. Yolov3-spp-ultralytics.
  2. SlowFast (with two custom plugins). The bounding boxes output by the 1st model go as input to the 2nd model (SlowFast).

As I said, for the first frame of the video it gives an output, but for the next frame it throws an error.

I am attaching the plugin code that I have implemented below:

Einsum Plugin Code: https://drive.google.com/drive/u/1/folders/1u5uLjdOXDMZeze4UtFsHevAQ410Pcq-t
RoI Align Plugin Code: https://drive.google.com/drive/u/1/folders/1E59KChIAl6WiN7pmEN2QyUleql5kvRpS

Thanks, Darshan

philipp-schmidt commented 3 years ago

Are you maybe running out of RAM? The Jetson Nano is very limited and a "mapping error" could be memory related. As the implementation is not from this repo, we can really only speculate.
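
One way to check that while reproducing the error, assuming a standard JetPack install (tegrastats ships with it):

```bash
# Watch system memory on the Nano while the request for the two models is processed.
watch -n 1 free -m

# Or log RAM and GPU utilisation over the whole run for later inspection.
tegrastats --interval 1000 | tee tegrastats.log
```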

ciklista commented 3 years ago

@Darshcg I had the same problem working on a Jetson, but I was using TensorFlow-based models instead. Even though my code is in Python, maybe there exists an equivalent for your C++ code: https://github.com/NVIDIA/TensorRT/issues/303#issuecomment-829053368