GeekAlexis / FastMOT

High-performance multiple object tracking based on YOLO, Deep SORT, and KLT 🚀

No video output for custom classes. #165

Closed rafaelbate closed 2 years ago

rafaelbate commented 2 years ago

Hello, I am trying to use FastMOT to detect custom classes, but there is no detection or tracking output when I run the app.py script; it simply gives me the same input video, resized, with the text "visible: 0" in the top-left corner.

I run:

python3 app.py --input_uri custom/v1_second_tree.mp4 --mot --output_uri results.mp4

And get output:

[screenshot: app.py console output]

I've read issue #30; I have Driver Version: 471.11 (using Ubuntu 20.04), and I've also tried assigning "computes=52" in the makefile, as I am using a GTX 970, which has a compute capability of 5.2.

It is important to point out that when I run "docker run ... ", a message appears:

[screenshot: NVIDIA driver warning message]

I think the GPU is available within the docker container, because when I execute the commands "nvidia-smi" and "/usr/local/cuda/bin/nvcc --version", I get this output:

[screenshot: nvidia-smi and nvcc output]

Also, when I run app.py, I can see that GPU resources are being used and that the GPU temperature rises.

Regarding the setup to track custom classes, I simply followed the guide:

[screenshots: custom-class setup steps followed from the README]

Any ideas on what might be happening? Thank you so much again for your project and kind support.

GeekAlexis commented 2 years ago

Did you set TRT_IMAGE_VERSION=21.05 when you built the image, and did you run the container with --gpus all? Your CUDA version (11.0) doesn't look correct inside the container; it should be 11.3 if you use 21.05.

rafaelbate commented 2 years ago

Did you set TRT_IMAGE_VERSION=21.05 when you built the image, and did you run the container with --gpus all?

Thanks for the reply.

I did. I ran the command as you said:

sudo docker run --gpus all --rm -it -v $(pwd):/usr/src/app/FastMOT -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=unix$DISPLAY -e TZ=$(cat /etc/timezone) fastmot:latest

As for TRT_IMAGE_VERSION=21.05, I set it as well. I even tried it in three different ways, just to make sure it was not the problem.

I also tried with just CUPY_NUM_BUILD_JOBS=$(nproc) pip install --no-cache-dir -r <(grep -ivE "tensorflow" requirements.txt); \, which did not work either.

Any other suggestions please?

GeekAlexis commented 2 years ago

Make sure you installed the latest nvidia-docker 2. The docker image should work out of the box; your last two approaches are not necessary. This looks like a driver issue.

As a last resort, you can try reinstalling your driver or asking on the NVIDIA forum. You shouldn't be getting that NVIDIA driver warning.

GeekAlexis commented 2 years ago

What's the output of docker info | grep -i runtime? Can you try adding the --runtime=nvidia option when you run the docker?

rafaelbate commented 2 years ago

Make sure you installed the latest nvidia-docker 2. The docker image should work out of the box; your last two approaches are not necessary. This looks like a driver issue.

As a last resort, you can try reinstalling your driver or asking on the NVIDIA forum. You shouldn't be getting that NVIDIA driver warning.

I've been trying many approaches, without success. I've noticed that even when running the test NVIDIA proposes during the nvidia-docker installation:

sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

I get CUDA 11.4 reported instead of CUDA 11.0, which is really weird.

I am using WSL 2 and thought it might be because of that, so I installed Ubuntu 20.04 on my machine and tried the test proposed by NVIDIA there, but the result is still the same. I am going to try running custom FastMOT on Ubuntu and see the output.

Just tried --runtime=nvidia, same output.

The output of sudo docker info | grep -i runtime is:

[screenshot: docker info runtime output]

I'll keep trying different solutions and, as you said, as a last resort I'll ask on the NVIDIA forum.

Can't thank you enough for helping me. Best regards.

GeekAlexis commented 2 years ago

BTW, you should check nvcc -V for the CUDA version. nvidia-smi reports the CUDA version packaged with the driver.

rafaelbate commented 2 years ago

BTW, you should check nvcc -V for the CUDA version. nvidia-smi reports the CUDA version packaged with the driver.

After installing Ubuntu and setting everything up there, the warning when starting the docker container went away.

I checked the CUDA version with nvcc -V on Ubuntu and it is indeed 11.3!

[screenshot: nvcc -V output]

Unfortunately, when I run app.py, the output is exactly the same: the video with no tracking whatsoever...

I really don't think I've missed any step in setting up custom FastMOT, though; again, I followed the guide as shown above.

I only disabled ReID. Do I need to disable anything else? Am I missing something? I am trying to track small, reddish fruits:

[screenshot: example frame of the fruits to be tracked]

GeekAlexis commented 2 years ago

#127

rafaelbate commented 2 years ago

#127

I've deleted my comment. I saw the comment in #127 where you said to delete the .trt files, and that worked. The TensorRT error is solved, but still no detections yet.

rafaelbate commented 2 years ago

Got it working on Ubuntu after changing "Letterbox" in yolo.py from "True" to "False".

Thanks for your help Alexis.

One last question, if you will: how can I count the number of unique ID occurrences over the course of detection/tracking?

GeekAlexis commented 2 years ago

There is currently a bug in Letterbox preprocessing.

Can you elaborate more on counting unique IDs? I'm not sure I understand your purpose.

rafaelbate commented 2 years ago

There is currently a bug in Letterbox preprocessing.

Can you elaborate more on counting unique IDs? I'm not sure I understand your purpose.

I'm happy we found the error.

Well, I need to count fruits in a video. To do that, I am using FastMOT as the tracking framework and want to count the number of unique track IDs, which should match the number of fruits.

rafaelbate commented 2 years ago

BTW, for future reference, switching LETTERBOX from "True" to "False" also fixed the issue on WSL, even with the NVIDIA driver warning.
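For anyone who lands on this thread, here is a minimal sketch of the kind of change being discussed, assuming a custom model class added to fastmot/models/yolo.py as in the README's custom-class guide. The class name, file paths, and attribute values below are hypothetical; check the attribute names against your version of FastMOT.

```python
# In fastmot/models/yolo.py, where the YOLO base class is defined.
from pathlib import Path

class FruitYOLOv4(YOLO):                                        # hypothetical custom detector class
    ENGINE_PATH = Path(__file__).parent / 'fruit_yolov4.trt'    # hypothetical TensorRT engine path
    MODEL_PATH = Path(__file__).parent / 'fruit_yolov4.onnx'    # hypothetical exported ONNX model
    NUM_CLASSES = 1                                              # e.g. a single fruit class
    INPUT_SHAPE = (3, 512, 512)                                  # must match the exported model
    LETTERBOX = False                                            # the workaround discussed in this thread
    # ... other attributes (anchors, scales, etc.) as in the README example ...
```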

rafaelbate commented 2 years ago

Hey @GeekAlexis, is the bug only for YOLOv4P5, or is it a general bug? I am getting really poor results and I think it is due to the letterbox bug.

GeekAlexis commented 2 years ago

It’s for all scaled YOLOv4 models because these are usually trained with letterbox preprocessing. Will push a fix today.

GeekAlexis commented 2 years ago

Should be fixed now.

For your question, you need to count confirmed tracks. You can collect the unique IDs in mot.visible_tracks at every frame and accumulate them in a data structure like a set. Or you can just output a MOT Challenge log with the -l option and count the number of unique IDs in the log.
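A minimal sketch of both approaches follows. The loop skeleton, the visible_tracks/trk_id names, and the log filename are assumptions based on this thread and may differ in your version of FastMOT.

```python
# Approach 1: accumulate unique track IDs inside app.py's per-frame loop.
unique_ids = set()

# ... set up the stream and the fastmot.MOT instance exactly as app.py does ...
# Inside the loop, after the tracker has processed the current frame:
for track in mot.visible_tracks():    # assumed API; may be a property in older versions
    unique_ids.add(track.trk_id)      # assumed track ID attribute

# After the whole video has been processed:
print(f'Unique IDs seen (fruit count estimate): {len(unique_ids)}')

# Approach 2: run app.py with the -l option and count unique IDs in the
# MOT Challenge log, where the track ID is the second comma-separated field.
with open('results.txt') as f:        # hypothetical log filename
    ids = {line.split(',')[1] for line in f if line.strip()}
print(f'Unique IDs in log: {len(ids)}')
```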