dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
https://developer.nvidia.com/embedded/twodaystoademo
MIT License

Need a tiny bit of help to use custom caffemodel with detectNet #1732

Open lweingart opened 1 year ago

lweingart commented 1 year ago

Dear dusty-nv,

I'm not sure this falls under your duty, but maybe you could still give me a hand? I'm trying to use detectNet with a caffemodel, and from your jetson_inference documentation here it seems I need to call detectNet with at least a model (path to the .caffemodel file), a prototxt (path to the .prototxt file), and labels (path to the label file). The doc says:

[detectNet](https://rawgit.com/dusty-nv/jetson-inference/dev/docs/html/python/jetson_inference.html#detectNet) arguments: 
  --network=NETWORK     pre-trained model to load, one of the following:
                            * ssd-mobilenet-v1
                            * ssd-mobilenet-v2 (default)
                            * ssd-inception-v2
                            * peoplenet
                            * peoplenet-pruned
                            * dashcamnet
                            * trafficcamnet
                            * facedetect
  --model=MODEL         path to custom model to load (caffemodel, uff, or onnx)
  --prototxt=PROTOTXT   path to custom prototxt to load (for .caffemodel only)
  --labels=LABELS       path to text file containing the labels for each class
  --input-blob=INPUT    name of the input layer (default is 'data')
  --output-cvg=COVERAGE name of the coverage/confidence output layer (default is 'coverage')
  --output-bbox=BOXES   name of the bounding output layer (default is 'bboxes')
  --mean-pixel=PIXEL    mean pixel value to subtract from input (default is 0.0)
  --confidence=CONF     minimum confidence threshold for detection (default is 0.5)
  --clustering=CLUSTER  minimum overlapping area threshold for clustering (default is 0.75)
  --alpha=ALPHA         overlay alpha blending value, range 0-255 (default: 120)
  --overlay=OVERLAY     detection overlay flags (e.g. --overlay=box,labels,conf)
                        valid combinations are:  'box', 'lines', 'labels', 'conf', 'none'
  --profile             enable layer profiling in TensorRT

I'm trying to reuse the model from the configuration of a deepstream app, which looks like this:

model-file=../../../../samples/models/Primary_Detector/resnet10.caffemodel
proto-file=../../../../samples/models/Primary_Detector/resnet10.prototxt
model-engine-file=../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
labelfile-path=../../../../samples/models/Primary_Detector/labels.txt
int8-calib-file=../../../../samples/models/Primary_Detector/cal_trt.bin

by doing this:

MODEL = "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.caffemodel"
PROTOTXT = "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.prototxt"
LABELS = "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/labels.txt"
net = jetson_inference.detectNet(model=MODEL, prototxt=PROTOTXT, labels=LABELS)

but I get this error:

Traceback (most recent call last):
  File "test.py", line 70, in <module>
    net = jetson_inference.detectNet(model=MODEL, prototxt=PROTOTXT, labels=LABELS)
TypeError: 'prototxt' is an invalid keyword argument for this function
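For what it's worth, detectnet.py itself seems to hand everything to detectNet through an argv list, so maybe the custom paths have to go in as command-line style flags rather than keyword arguments. A sketch of that route (untested on my side; the flag names are the ones from the help text above, and the `argv=` keyword is my assumption based on detectnet.py's own `detectNet(args.network, sys.argv, args.threshold)` call):

```python
# Build the '--flag=value' list that detectNet's constructor parses.
# Flag names come from the documented help text; whether detectNet accepts
# an argv= keyword is an assumption, not something I've verified.
def detectnet_argv(model, prototxt, labels):
    return [
        f"--model={model}",
        f"--prototxt={prototxt}",
        f"--labels={labels}",
    ]

argv = detectnet_argv(
    "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.caffemodel",
    "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.prototxt",
    "/opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/labels.txt",
)

# On the Jetson itself:
# import jetson_inference
# net = jetson_inference.detectNet(argv=argv)
```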

In addition, if I simply try the existing detectnet.py from your repo with this:

python3 detectnet.py --input file:///home/jetson/git/droning/QG/Parc-07.mp4 --output test.mp4 --model /opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.caffemodel --prototxt /opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/resnet10.prototxt --labels /opt/nvidia/deepstream/deepstream-6.2/samples/models/Primary_Detector/labels.txt

I get:

[TRT]    couldn't find built-in detection model ''
Traceback (most recent call last):
  File "detectnet.py", line 53, in <module>
    net = detectNet(args.network, sys.argv, args.threshold)
Exception: jetson.inference -- detectNet failed to load network

which I find strange, as to my understanding, I should only provide the --network value if I'm using one of the pretrained models from the list, right?

So, I actually have two questions here :-)

How do I pass the path to the prototxt file? And do I just leave the path to the .engine file unused?

Thank you very much in advance for your esteemed help :-)

Cheers

dusty-nv commented 1 year ago

Hi @lweingart, it has been a long time since I've used new caffemodels (for obvious reasons - nobody really uses caffe anymore). However, if this model is from here:

https://github.com/RidgeRun/gtc-2020-demo/blob/master/deepstream-models/Primary_Detector

I'm doubtful that it would work anyway, because this is a different DNN architecture and would require additional support in the pre/post-processing.

Instead, I just recommend that you use one of the much more recent TAO PeopleNet, DashCamNet, or TrafficCamNet models, which support similar classes:

https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-tao.md
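Loading one of those built-ins is just a matter of passing its name as the network. A minimal sketch (the network names are from the --network list in your quoted help text; the deferred import and the `threshold` keyword mirror how detectnet.py calls the constructor, and are assumptions on a machine without the library):

```python
# Built-in TAO detector names, per the --network list in the help text.
TAO_NETWORKS = ["peoplenet", "peoplenet-pruned", "dashcamnet", "trafficcamnet"]

def load_tao_detector(network="peoplenet", confidence=0.5):
    """Load a built-in TAO detection model by name (Jetson only)."""
    if network not in TAO_NETWORKS:
        raise ValueError(f"not a built-in TAO detector: {network}")
    import jetson_inference  # only importable on the Jetson itself
    return jetson_inference.detectNet(network, threshold=confidence)
```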

Good luck, and hope that helps!

lweingart commented 1 year ago

Hi @dusty-nv, thank you very much for getting back to me.

Let me explain a bit more why I wanted to do this. I have a thermal camera mounted on a gimbal, and I need to run human detection on the video feed. I tried all the readily available models from jetson-inference (ssd-mobilenet-v2, ssd-inception-v2, peoplenet-pruned, dashcamnet, trafficcamnet, etc.), and I also tried the deepstream python app; unfortunately, the only one that could properly detect humans on my video feed was the one from the deepstream app.

Second best was ssd-inception-v2, but it is quite far behind the first and unfortunately not good enough for my needs.

Anyway, I understand that I will have to make do with it, as it's impossible for me to go back to deepstream after having used jetson-inference.

Let me keep this issue open for now, I'll get back to it later.

Thank you again very much for your time and help, I'm sincerely grateful

Cheers

lweingart commented 1 year ago

Hi again @dusty-nv,

If you are curious, have a look at the following videos of myself walking in a park, taken with the thermal camera on the gimbal, which has some object-following features. The first is the original; the second uses ssd-inception-v2, where I draw a black bounding box when there is a detection; and the third is the result with the resnet. The difference between them is obvious.

However, I just realised while writing these lines that although nobody uses caffe anymore, maybe I could use another version of the model (ONNX, for instance) that detectNet could load?

What kind of model would be the easiest to integrate with detectNet? Would you have a recommendation?

Here are the videos: Original: https://youtu.be/4VKHMAxn6EI

ssd-inception: https://youtu.be/9uM5CPb8BVU

resnet: https://youtu.be/oIk6MOmtvXw

lweingart commented 1 year ago

Hi again @dusty-nv,

I realised I still have an issue when trying to use a custom model with detectnet. When I specify a model path with --model and run the code, I get this error:

[TRT]    couldn't find built-in detection model ''
Traceback (most recent call last):
  File "detectnet.py", line 53, in <module>
    net = detectNet(args.network, sys.argv, args.threshold)
Exception: jetson.inference -- detectNet failed to load network

(I'm not trying the caffemodel anymore, but a torch .pth model)

and I think it's because I don't specify a value for --network.

However, I thought that this parameter was only for the pretrained models to load.

What value should I use then for the --network parameter?

EDIT: About the --network error I thought I had: it actually came from the fact that arguments need to be passed like --model=path/to/model instead of --model path/to/model
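In other words, the parser apparently only recognises the '='-joined form. A tiny standalone illustration of the difference (my own sketch, not jetson-inference's actual parsing code):

```python
def find_flag(argv, name):
    """Return the value of an '='-joined flag, or None if absent."""
    prefix = f"--{name}="
    for arg in argv:
        if arg.startswith(prefix):
            return arg[len(prefix):]
    return None  # '--model path/to/model' (space-separated) is never matched

print(find_flag(["--model=net.onnx"], "model"))    # → net.onnx
print(find_flag(["--model", "net.onnx"], "model"))  # → None
```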

I managed to try a deepstream app with a YOLO model, and it worked even better on my thermal video. So now my goal is to use jetson-inference's detectnet with a YOLO model. Is there a simple way to do that?

EDIT2: I found a yolov7.onnx and tried to use it with detectnet. For some time I was hopeful, as it was doing the kind of optimisation it usually does when loading a model for the first time, but then it ended in an error:

[TRT]
[TRT]    3: Cannot find binding of given name: data
[TRT]    failed to find requested input layer data in network
[TRT]    device GPU, failed to create resources for CUDA engine
[TRT]    failed to create TensorRT engine for /home/jetson/git/droning/models/yolov7-w6.onnx, device GPU
[TRT]    detectNet -- failed to initialize.
Traceback (most recent call last):
  File "detectnet.py", line 53, in <module>
    net = detectNet(args.network, sys.argv, args.threshold)
Exception: jetson.inference -- detectNet failed to load network

I used the following command to run it:

python3 detectnet.py --input file:///home/jetson/git/droning/QG/Parc-06.mp4 --output test.mp4  --model=/home/jetson/git/droning/models/yolov7-w6.onnx  --labels=/home/jetson/git/droning/models/labels_yolo.txt

Any idea what could be wrong?
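In case the binding name is the whole problem, I also sketched passing the layer names explicitly via the flags from the help text (the layer names below are pure placeholders that would have to be read out of the actual ONNX file; I suspect this alone won't be enough if YOLO's output format differs from the coverage/bbox convention):

```python
# Flags from detectNet's documented help text. The input/output layer names
# are placeholder guesses, NOT the real tensor names in any given ONNX file.
def onnx_detectnet_argv(model, labels,
                        input_blob="images",    # placeholder
                        output_cvg="scores",    # placeholder
                        output_bbox="boxes"):   # placeholder
    return [
        f"--model={model}",
        f"--labels={labels}",
        f"--input-blob={input_blob}",
        f"--output-cvg={output_cvg}",
        f"--output-bbox={output_bbox}",
    ]
```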

EDIT3: I also tried with a brand new yolov8n.onnx, but ended up with the exact same error after something like 10 minutes of optimisation due to loading a new network.

Thank you for your help, have a good week.

Cheers

dusty-nv commented 1 year ago

Sorry @lweingart, YOLO would require additional pre/post-processing support in jetson-inference/c/detectNet.cpp for the particular version of YOLO model you are using. Here are examples of using YOLOv8 with TensorRT:

https://wiki.seeedstudio.com/YOLOv8-TRT-Jetson/
https://github.com/triple-Mu/YOLOv8-TensorRT

You could still manage to use these with jetson_utils if you want, so you get the camera I/O stuff, etc.
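A rough sketch of what that split could look like - jetson_utils handles the video I/O while `detect_fn` stands in for whatever external YOLOv8-TensorRT inference call you end up with (the videoSource/videoOutput usage mirrors detectnet.py; the capture-timeout handling is an assumption):

```python
def run_pipeline(detect_fn, input_uri="/dev/video0", output_uri="display://0"):
    """Capture/render loop via jetson_utils; detection is delegated to
    detect_fn (placeholder for an external YOLO-TensorRT engine)."""
    import jetson_utils  # only importable on the Jetson itself

    source = jetson_utils.videoSource(input_uri)
    output = jetson_utils.videoOutput(output_uri)

    while source.IsStreaming() and output.IsStreaming():
        img = source.Capture()
        if img is None:  # capture timeout
            continue
        detections = detect_fn(img)  # placeholder: external YOLO inference
        # (drawing the detections onto img would go here)
        output.Render(img)
```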

lweingart commented 1 year ago

Hello @dusty-nv, thank you for your reply, and sorry for the late response. Indeed, if I could use jetson_utils and/or jetson_inference for all the camera and file I/O, that would be for the best.

When you say "for the particular version of YOLO you are using", do you mean that it could work with a different YOLO version?

I'm honestly not attached to any specific version of YOLO, as they all seem to work fine, so if you have one that works with detectNet, I would gladly give it a try.

Otherwise, I will indeed have to use your pipeline for image management and something else for human detection.

Cheers