SayHiRay opened this issue 4 years ago · status: Open
Would really appreciate it if you could take a look at this. Thanks a ton for your time! @KellenSunderland @marcoabreu @Caenorst
It seems onnx-trt is not setting the device properly. @SayHiRay If you can accept a dirty workaround, try replacing ctx = mx.gpu(1) with the following:
from ctypes import cdll, c_char_p

# Load the CUDA runtime directly so we can set the current device
# on the calling thread before MXNet / TensorRT allocates anything.
libcudart = cdll.LoadLibrary('libcudart.so')
libcudart.cudaGetErrorString.restype = c_char_p

def cudaSetDevice(device_idx):
    ret = libcudart.cudaSetDevice(device_idx)
    if ret != 0:
        # cudaGetErrorString returns bytes; decode before concatenating.
        error_string = libcudart.cudaGetErrorString(ret).decode('utf-8')
        raise RuntimeError("cudaSetDevice: " + error_string)

device_id = 1
cudaSetDevice(device_id)
ctx = mx.gpu(device_id)
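The reason this helps: cudaSetDevice changes the current CUDA device of the calling thread before any TensorRT or cuDNN handles are created, so the engine ends up on the same GPU that the MXNet context refers to.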
@Caenorst It looks like the TRTCreateState function in the src/operator/subgraph/tensorrt/tensorrt.cc file does not pass the ctx on to TensorRT. Adding
CUDA_CALL(cudaSetDevice(ctx.dev_id));
at the top of that function fixes this issue, but I am not sure it is a good fix.
@TristonC Thanks for the workaround. It works great!
Description
I followed this official tutorial to perform inference with TensorRT. It works fine when I bind the model on GPU0. However, it reports
engine.cpp (212) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)
error when I run the model on GPU1. An output is still produced after inference, but it consists of all zeros and differs from the output obtained on GPU0.
Error Message
[2020-03-17 12:36:39 ERROR] engine.cpp (212) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)
[2020-03-17 12:36:39 ERROR] engine.cpp (212) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)
To Reproduce
In my case I can use the example below to reproduce the error:
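(The reporter's original script is not preserved in this copy of the thread. The sketch below reconstructs the flow of the official MXNet TensorRT tutorial with the bind moved to mx.gpu(1); the resnet18_v2 model and the exact calls are assumptions based on that tutorial, not the reporter's actual code.)

import mxnet as mx
from mxnet.gluon.model_zoo import vision

# Export a pretrained model so it can be reloaded as a symbol + params.
batch_shape = (1, 3, 224, 224)
net = vision.resnet18_v2(pretrained=True)
net.hybridize()
net.forward(mx.nd.zeros(batch_shape))
net.export('resnet18_v2')
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet18_v2', 0)

# Partition the graph for TensorRT, as in the official tutorial.
trt_sym = sym.get_backend_symbol('TensorRT')
arg_params, aux_params = mx.contrib.tensorrt.init_tensorrt_params(
    trt_sym, arg_params, aux_params)

# Binding on GPU1 instead of GPU0 is what triggers the
# CUDNN_STATUS_MAPPING_ERROR and the all-zero output.
ctx = mx.gpu(1)
executor = trt_sym.simple_bind(ctx=ctx, data=batch_shape,
                               grad_req='null', force_rebind=True)
executor.copy_params_from(arg_params, aux_params)
out = executor.forward(is_train=False,
                       data=mx.nd.zeros(batch_shape, ctx=ctx))
print(out[0].asnumpy())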
What have you tried to solve it?
I checked GPU usage with nvidia-smi while running the Python script above. It seems that both GPU1 and GPU0 are used during the process; some operators appear to still be allocated on GPU0 (especially the TensorRT0 op).
Environment
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below: