Segfault inside docker container

ria-com / nomeroff-net

Nomeroff Net. Automatic numberplate recognition system.

GNU General Public License v3.0


Segfault inside docker container #207

Open pslarionov opened 2 years ago

pslarionov commented 2 years ago

I constantly get a segfault inside a docker container. I tried the demos and the restapi examples, with no luck. There are no errors if the image has no license plate.

There are no errors if I try to use examples on my host OS.

I am not sure what can help you in this situation but I have:
OS: Ubuntu 20.04
GPU: 2080 RTX TI / 2070 RTX
Driver: 470.82.01
CUDA: 11.4 (Cuda compilation tools, release 11.4, V11.4.152, Build cuda_11.4.r11.4/compiler.30521435_0)
TensorRT: 8.2.1.8 GA

dimabendera commented 2 years ago

Are you using tensorrt models? Can you debug where the segfault occurs?
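One way to narrow down where the crash happens is Python's built-in faulthandler module, which dumps the Python traceback when the process receives SIGSEGV. Below is a minimal sketch, not part of nomeroff-net: the helper name enable_crash_traces and the log file name are made up for illustration.

```python
import faulthandler


def enable_crash_traces(log_path="segfault_trace.log"):
    """Write Python tracebacks of all threads to log_path on a fatal signal.

    Hypothetical helper for illustration; the log path is an assumption.
    """
    log_file = open(log_path, "w")
    # On SIGSEGV (and other fatal signals) dump the tracebacks to log_file.
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this reference so the file stays open.
    return log_file


if __name__ == "__main__":
    trace_file = enable_crash_traces()
    # Run the failing demo code here, e.g. the body of
    # examples/py/get-started-demo.py; after a crash, inspect the log file.
```

The same handler can be turned on without code changes by running the script with python -X faulthandler or by setting PYTHONFAULTHANDLER=1 in the container environment.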

pslarionov commented 2 years ago

Yes, I am using tensorrt models.

Segfault (for examples/py/get-started-demo.py with a small fix from #201) occurs at lines 43-45:

all_points = npPointsCraft.detect(img,
                                  targetBoxes,
                                  [5, 2, 0])

dimabendera commented 2 years ago

First, note that we have not tested the code on TensorRT 8. We had a segfault issue on a Jetson Xavier device, where setting the OPENBLAS_CORETYPE=ARMV8 environment variable helped. The order in which you initialize the torch, onnx and tensorrt models also matters. Try the same as here. Adding import pycuda.autoinit might also help you.
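The two workarounds mentioned above could be applied at the very top of a script, roughly as sketched below. This is an illustration under assumptions, not the maintainers' code: the helper names configure_openblas and try_init_cuda_context are hypothetical, the ARM check is a guess at when the flag is relevant, and the recommended model initialization order is not reproduced here because it is not spelled out in this thread.

```python
import importlib
import os
import platform


def configure_openblas(machine=None):
    """Set OPENBLAS_CORETYPE=ARMV8 on ARM boards; return True if applied.

    Assumption: the flag only matters on ARMv8 devices such as Jetson Xavier.
    It must be set before numpy (and anything that loads OpenBLAS) is imported.
    """
    machine = machine or platform.machine()
    if machine.lower() in ("aarch64", "arm64"):
        os.environ.setdefault("OPENBLAS_CORETYPE", "ARMV8")
        return True
    return False


def try_init_cuda_context():
    """Import pycuda.autoinit, which creates a CUDA context as a side effect.

    Returns False if pycuda is missing or no CUDA device can be initialized.
    """
    try:
        importlib.import_module("pycuda.autoinit")
        return True
    except Exception:
        return False


configure_openblas()
cuda_ready = try_init_cuda_context()
# Import numpy and load the torch, onnx and tensorrt models only after this point.
```

On an x86 host such as the one described in this issue, configure_openblas() does nothing, so only the pycuda.autoinit part would apply.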

Dmytro-Shvetsov commented 2 years ago

I am facing the same issue with TensorRT 8; however, in my case the segfault appears when the models get destroyed by garbage collection. @pslarionov, have you managed to get the models running on TensorRT 8? @dimabendera, could you share the CUDA/cuDNN/TensorRT versions that worked flawlessly for you when converting and running inference with the current yolov5 and ocr models?