Some problems with tracking

hadign20 commented 5 years ago

I am trying to track vehicles with a recent version of yolo_console_dll, but I am facing some problems:

The detection results from darknet.exe detector demo are much better than the results in tracking. I realize that during tracking, detection is not performed on every frame. Is it possible to make it run detection on more frames?
Sometimes the tracking box stays even after the object has left the scene.

These are the detection results: [image: ezgif com-video-to-gif(1)]

But the tracking looks like this: [image: ezgif com-video-to-gif]

How can I reduce the size of the colored boxes containing the labels of the detected objects? Is it possible to make them transparent?
The fps of the resulting video is the same as the input video, but the resulting video is a bit slower.

Thanks

buzdarbalooch commented 5 years ago

In my case it was really fast: the tracking video was faster than the normal input video. What frame story and max distance are you using?

buzdarbalooch commented 5 years ago

How long is your video?

hadign20 commented 5 years ago

@buzdarbalooch I changed std::max(35, video_fps) to std::max(1, video_fps) here: https://github.com/AlexeyAB/darknet/blob/0543278a5bd7064fae6538afd1761b06b10f73ee/src/yolo_console_dll.cpp#L296

The resulting video that you see while running the program is different than the one that is saved as result.avi. The one you see is faster. My fps is 15 and the video is about 15 minutes long.

AlexeyAB commented 5 years ago

@hadi-ghnd

The optical-flow tracker should be used for a video stream from a camera, rather than a video file, if your GPU can't process every frame from the camera. Optical flow achieves a higher FPS than the neural network alone because it tracks objects between detections.

Comment out this line to run detection on every frame: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L294

"Sometimes the tracking box stays even after the object has left the scene." To fix this, set true instead of false here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L509

Instead of TRACK_OPTFLOW you can use the Kalman filter if all your objects have linear trajectories. Just set true instead of false here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L290 Also use uselib instead of uselib_track (and comment out #define TRACK_OPTFLOW).

To reduce the colored rectangles and text size you should change these lines: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L200-L203

Use std::max(1, video_fps) instead of std::max(35, video_fps) for video files: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L369
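
For the 15 fps video mentioned earlier in the thread, the two expressions give different values. A minimal sketch, assuming the result is the frame rate that the program ends up using for the video (the helper name output_fps is hypothetical):

```cpp
#include <algorithm>

// Hypothetical helper: the frame rate obtained when a lower bound (min_fps)
// is applied to the frame rate reported by the input video.
int output_fps(int min_fps, int video_fps) {
    return std::max(min_fps, video_fps);
}
```

With a floor of 35, output_fps(35, 15) is 35, so a 15 fps source is no longer treated at its own rate. With a floor of 1, output_fps(1, 15) is 15 and the source rate is kept.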

hadign20 commented 5 years ago

@AlexeyAB thank you for your helpful response. I tried the Kalman filter as you suggested and the results got better: [image: ezgif com-video-to-gif(2)]

But I still have some questions:

As you can see, there are double detections for some objects. Can this be solved by tracking or by increasing nms?
Sometimes the tracking box stays behind the object. I think this may be because of the Kalman filter. Is there an option to fix this, like the one you mentioned for TRACK_OPTFLOW?
My last question: is there an option to count the objects of each class, or should I add it myself?

AlexeyAB commented 5 years ago

As you can see there are double detections for some objects. Can this be solved by tracking or increasing nms?

Sometimes the tracking box stays behind the object. I think this may be because of the Kalman filter. Is there an option to fix this, like the one you mentioned for TRACK_OPTFLOW?

No. You should train your model better. Collect many more images from this kind of video, with the same point of view and the same relative object sizes. And read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection Or use a better model, for example yolov3-spp.cfg / weights.

As for counting the objects of each class: you should implement it yourself. Some hints:

Use std::vector<int> track_id_vec; here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L654
Initialize it as track_id_vec(classes_number), passing the number of classes, here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L809

Instead of https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L958-L959 use:

                track_id_state_id_time[i].track_id = ++track_id_vec[result_vec_pred[i].obj_id];
                result_vec_pred[i].track_id = track_id_vec[result_vec_pred[i].obj_id];

..... etc
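
The idea behind these hints, one id counter per class, can be sketched outside the library. This is a minimal illustration and not the code in yolo_v2_class.hpp: the Detection struct and the function assign_per_class_ids are hypothetical, and the convention that track_id 0 means "not yet assigned" is an assumption.

```cpp
#include <vector>

// Hypothetical stand-in for a detection; only the fields used here.
struct Detection {
    unsigned int obj_id;    // class index
    unsigned int track_id;  // 0 is assumed to mean "not yet assigned"
};

// Gives every unassigned detection the next id of its own class.
// After the call, track_id_vec[c] holds how many objects of class c
// have received an id so far. Assumes obj_id < track_id_vec.size().
void assign_per_class_ids(std::vector<Detection>& dets,
                          std::vector<int>& track_id_vec) {
    for (auto& d : dets) {
        if (d.track_id == 0) {
            d.track_id = static_cast<unsigned int>(++track_id_vec[d.obj_id]);
        }
    }
}
```

Because each class has its own counter, the largest id handed out for a class is also the running count of objects seen for that class, provided the tracker keeps an object's id stable across frames.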

buzdarbalooch commented 5 years ago

Dear @AlexeyAB and @hadi-ghnd, I would be grateful if you could help me debug the tracking issues I am facing.

I have experimented on two football (soccer) datasets; the AP for both datasets is below. The first dataset has four classes, the second has seven classes.

Below is the result when I try to run tracking:

LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH ./uselib data/obj2.names cfg/yolov2.cfg Backup1/yolov2_3700.weights cornerkick1.mov

frame_id = 184 track_id = 441, obj_id = 3, x = 57, y = 235, w = 6, h = 143, prob = 0.316

track_id = 446, obj_id = 3, x = 244, y = 103, w = 3, h = 268, prob = 0.273

track_id = 276, obj_id = 3, x = 51, y = 116, w = 0, h = 156, prob = 0.254

track_id = 297, obj_id = 3, x = 55, y = 333, w = 0, h = 167, prob = 0.254

track_id = 443, obj_id = 3, x = 251, y = 174, w = 67, h = 149, prob = 0.253

track_id = 540, obj_id = 3, x = 244, y = 9, w = 0, h = 245, prob = 0.252

track_id = 263, obj_id = 1, x = 251, y = 301, w = 0, h = 120, prob = 0.262

track_id = 409, obj_id = 3, x = 154, y = 356, w = 0, h = 119, prob = 0.247

track_id = 270, obj_id = 1, x = 152, y = 172, w = 0, h = 250, prob = 0.285

track_id = 277, obj_id = 1, x = 156, y = 272, w = 0, h = 178, prob = 0.292

track_id = 535, obj_id = 3, x = 152, y = 69, w = 0, h = 250, prob = 0.227

track_id = 1503, obj_id = 0, x = 1168, y = 77, w = 167, h = 0, prob = 0.271

track_id = 542, obj_id = 3, x = 933, y = 32, w = 0, h = 208, prob = 0.221

track_id = 572, obj_id = 3, x = 464, y = 186, w = 175, h = 0, prob = 0.221

track_id = 95, obj_id = 3, x = 251, y = 57, w = 0, h = 282, prob = 0.217

track_id = 1516, obj_id = 0, x = 867, y = 131, w = 168, h = 0, prob = 0.314

track_id = 1509, obj_id = 0, x = 767, y = 131, w = 164, h = 0, prob = 0.295

track_id = 444, obj_id = 3, x = 48, y = 81, w = 0, h = 118, prob = 0.212

track_id = 1671, obj_id = 0, x = 760, y = 408, w = 73, h = 79, prob = 0.23

track_id = 543, obj_id = 3, x = 146, y = 38, w = 0, h = 212, prob = 0.211

track_id = 573, obj_id = 3, x = 245, y = 429, w = 0, h = 83, prob = 0.208

track_id = 1795, obj_id = 0, x = 48, y = 188, w = 224, h = 0, prob = 0.214

track_id = 1052, obj_id = 0, x = 1077, y = 635, w = 130, h = 35, prob = 0.415

track_id = 1733, obj_id = 0, x = 1042, y = 186, w = 187, h = 0, prob = 0.218

track_id = 266, obj_id = 2, x = 1039, y = 11, w = 0, h = 145, prob = 0.201

track_id = 278, obj_id = 1, x = 648, y = 233, w = 0, h = 135, prob = 0.278

As you can see, the system is producing wrong bounding boxes for objects, even at very low probability.
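
Many of the boxes in this output have w = 0 or h = 0. As a post-processing workaround, such boxes can be dropped before drawing or tracking. The following is a minimal sketch under stated assumptions: the Box struct only mirrors the fields printed above, the function drop_unusable and the min_prob cutoff are hypothetical, and filtering does not address why the model produces these boxes in the first place.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical struct mirroring the fields printed in the console output.
struct Box {
    unsigned int x, y, w, h;
    float prob;
    unsigned int obj_id, track_id;
};

// Returns the boxes that have a non-zero area and a probability of at
// least min_prob; everything else is discarded.
std::vector<Box> drop_unusable(std::vector<Box> boxes, float min_prob) {
    boxes.erase(
        std::remove_if(boxes.begin(), boxes.end(),
                       [min_prob](const Box& b) {
                           return b.w == 0 || b.h == 0 || b.prob < min_prob;
                       }),
        boxes.end());
    return boxes;
}
```

Applied to the frame above with min_prob set to 0.3, this would keep, for example, the box with track_id = 1052 (w = 130, h = 35, prob = 0.415) and drop the box with track_id = 276 (w = 0).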

buzdarbalooch commented 5 years ago

Another error I ran into is shown below; it occurs when I try to run tracking on an image. Note that the weights file in the command I execute and the weights file that is actually loaded are different: the weights are loaded from a previous model I trained. See the line in the output: Loading weights from backup/yolo_10100.weights...

akhan@tensorflow-System-Product-Name:~/darknet-master$ LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH ./uselib data/obj2.names cfg/yolov2.cfg Backup1/yolov2_3700.weights experi.jpg
Used GPU 0
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BF
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32 0.006 BF
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BF
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64 0.003 BF
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128 0.001 BF
8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256 0.001 BF
12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512 0.000 BF
18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
25 route 16
26 conv 64 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 64 0.044 BF
27 reorg / 2 26 x 26 x 64 -> 13 x 13 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 13 x 13 x1280 -> 13 x 13 x1024 3.987 BF
30 conv 45 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 45 0.016 BF
31 detection
mask_scale: Using default '1.000000'
Total BFLOPS 29.343
Loading weights from backup/yolo_10100.weights...
seen 64
Done!
input image or video filename: Time: 0.200338 sec
: cannot connect to X server

hadign20 commented 5 years ago

@AlexeyAB Thank you for your suggestions. The modifications you mentioned were enough to count instances of different classes.

laclouis5 commented 4 years ago

@AlexeyAB I have a dataset with a bunch of videos with linear movements (the camera hovers above the ground, facing down, and moves in a linear pattern), so I gave the Kalman filter a try and it seems to work pretty well.

False Positives are greatly reduced and box sizes are more consistent and stable. Currently I'm using your C++ library through yolo_console_dll.cpp and I'm wondering if Kalman Filtering is available through the Python API or if I must implement it myself using OpenCV?

Last question: are the changes I made to yolo_console_dll.cpp to enable Kalman filtering carried over to yolo_cpp_dll.dll (through this API: yolo_v2_class.hpp)? That is, when using the Detector class, are track_ids tracked by the Kalman filter?

AlexeyAB commented 4 years ago

@laclouis5

False Positives are greatly reduced and box sizes are more consistent and stable. Currently I'm using your C++ library through yolo_console_dll.cpp and I'm wondering if Kalman Filtering is available through the Python API or if I must implement it myself using OpenCV?

You should implement it yourself using OpenCV.

Last question: are the changes I made to yolo_console_dll.cpp to enable Kalman filtering carried over to yolo_cpp_dll.dll (through this API: yolo_v2_class.hpp)? That is, when using the Detector class, are track_ids tracked by the Kalman filter?

No. File yolo_console_dll.cpp is used only for ./uselib. It isn't used in yolo_cpp_dll.dll

AlexeyAB / darknet

Some problems with tracking #2808