NVIDIA / retinanet-examples

Fast and accurate object detection with end-to-end GPU optimization
BSD 3-Clause "New" or "Revised" License

Converting models available in README of this repo to Engine File throws error #302

Closed: SubhankarHalder closed this issue 3 years ago

SubhankarHalder commented 3 years ago

Hello,

  1. I downloaded the ResNet18 and ResNet34 .pth models listed in the README.md of this repo.
  2. I ran the commands to start the ODTK Docker container:

    git clone https://github.com/nvidia/retinanet-examples
    docker build -t odtk:latest retinanet-examples/
    docker run --gpus all --rm --ipc=host -it -v /home/vast/retinanet:/workspace/model odtk:latest

  3. I then ran the following command inside the container, in the /workspace/model directory:

    odtk export retinanet_rn18fpn.pth engine.plan

  4. However, I got an error:

    Loading model from retinanet_rn18fpn.pth...
    model: RetinaNet
    backbone: ResNet18FPN
    classes: 80, anchors: 9
    Exporting to ONNX...
    Building FP16 core model...
    Segmentation fault (core dumped)

  5. Output of nvcc -V:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Mon_Nov_30_19:08:53_PST_2020
    Cuda compilation tools, release 11.2, V11.2.67
    Build cuda_11.2.r11.2/compiler.29373293_0

  6. Output of apt list | grep nvinfer:

    libnvinfer-bin/now 7.2.2-1+cuda11.1 amd64 [installed,local]
    libnvinfer-dev/now 7.2.2-1+cuda11.1 amd64 [installed,local]
    libnvinfer-plugin-dev/now 7.2.2-1+cuda11.1 amd64 [installed,local]
    libnvinfer-plugin7/now 7.2.2-1+cuda11.1 amd64 [installed,local]
    libnvinfer7/now 7.2.2-1+cuda11.1 amd64 [installed,local]

It seems libnvinfer was built for CUDA 11.1, but nvcc reports CUDA 11.2. Could that be the reason? How do I fix this?

ghost commented 3 years ago

Can you change this line to 21.06, and try again?

SubhankarHalder commented 3 years ago

Thanks a lot! Your idea worked!
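
The version comparison raised in the original question can be sketched in a few lines of Python. This is only an illustration: the helper names `toolkit_cuda_version` and `package_cuda_versions` are hypothetical, and the sample strings are copied from the outputs posted above rather than read from a live system. A difference between the two versions does not by itself explain the segmentation fault; in this thread the problem went away after the suggested change to 21.06.

```python
import re

# Sample text copied from the report above. On a live system these strings
# could instead be captured from `nvcc -V` and `apt list | grep nvinfer`.
NVCC_OUTPUT = "Cuda compilation tools, release 11.2, V11.2.67"
APT_LINES = [
    "libnvinfer-dev/now 7.2.2-1+cuda11.1 amd64 [installed,local]",
    "libnvinfer7/now 7.2.2-1+cuda11.1 amd64 [installed,local]",
]


def toolkit_cuda_version(nvcc_output):
    """Return the CUDA release reported by nvcc (e.g. '11.2'), or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def package_cuda_versions(apt_lines):
    """Return the set of CUDA versions found in '+cudaX.Y' package suffixes."""
    found = set()
    for line in apt_lines:
        match = re.search(r"\+cuda(\d+\.\d+)", line)
        if match:
            found.add(match.group(1))
    return found


toolkit = toolkit_cuda_version(NVCC_OUTPUT)
packages = package_cuda_versions(APT_LINES)
print("nvcc CUDA release:", toolkit)
print("libnvinfer packages built for CUDA:", sorted(packages))
print("versions differ:", packages != {toolkit})
```

Embedding the sample strings keeps the sketch runnable on a machine without CUDA installed; replacing them with captured command output would apply the same check inside the container.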