Closed levipereira closed 3 years ago
The default config is tuned for slow pedestrian tracking, so you can play with the parameters described in https://github.com/GeekAlexis/FastMOT/issues/76
Let me know if it works.
Yes, it works. I need to understand what these parameters mean, but I will figure it out.
"kalman_filter": {
"std_factor_acc": 20.5,
"std_offset_acc": 100.5,
I need to do fine tuning to calibrate it, but you showed me the way.
I'm using the generic feature extractor model. ReID is reusing the same ID for different vehicles. Which parameter can I change to avoid this? I have tried playing with the parameters below but with no success.
"multi_tracker": {
"max_age": 15,
"age_factor": 14,
Video with new parameters. Output Video Here
OSNet is not accurate on vehicles. You can lower the max ReID cost to make ReID stricter: https://github.com/GeekAlexis/FastMOT/blob/faf36a8e51d36ebee9817ed1006c3f6881bcbde6/cfg/mot.json#L45
`max_age` and `age_factor` are not related to this.
For fast moving vehicles you can also increase the half life period (in seconds) for velocity decay (maybe 10): https://github.com/GeekAlexis/FastMOT/blob/faf36a8e51d36ebee9817ed1006c3f6881bcbde6/cfg/mot.json#L61
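To make the effect of the half life concrete, here is a minimal sketch of exponential velocity decay with a half-life parameter. This is an illustration of the concept only, not FastMOT's actual implementation:

```python
def decay_velocity(v, dt, half_life=10.0):
    """Exponential decay: the velocity halves every `half_life` seconds.
    A larger half life keeps a lost track's flow velocity alive longer,
    which helps re-acquire fast-moving vehicles."""
    return v * 0.5 ** (dt / half_life)

print(decay_velocity(100.0, 10.0))  # after one half life: 50.0
```

With the default short half life, a fast vehicle's predicted motion fades quickly between detections; raising it to ~10 s keeps the prediction moving.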
This worked.
I will start training Fast-ReID on vehicles to get better ReID accuracy.
I'm running on i7-8700 (6 cores) and RTX 2060
The CPU is stuck at 100% busy, but the video still runs at ~90 FPS with no freezes. The GPU is only 25% busy.
I will try to find the root cause of the 100% CPU usage.
Thank you for your time.
Starting Docker with these two parameters increases performance to 100 FPS at 70% CPU usage.
-e OPENBLAS_MAIN_FREE=1 -e OPENBLAS_NUM_THREADS=1
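For reference, the same effect can be had outside Docker by setting the variables in Python before NumPy is imported (a sketch; setting them in the shell before launching works equally well):

```python
import os

# Must be set BEFORE importing numpy, otherwise OpenBLAS has already
# spawned its worker thread pool.
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OPENBLAS_MAIN_FREE"] = "1"

# import numpy as np  # only import numpy after the variables are set
```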
Thanks. Good to know. Should these be included in the Dockerfile? Why would FPS improve with fewer threads?
It's a NumPy issue. This should be recommended whenever users get 100% CPU usage running simple code. I have used Deep SORT before, faced the same issue, and fixed it using both variables.
https://stackoverflow.com/questions/38659217/numpy-suddenly-uses-all-cpus https://shahhj.wordpress.com/2013/10/27/numpy-and-blas-no-problemo/
Why would FPS improve with fewer threads?
NumPy in the Deep SORT code behaves poorly when using multiprocessing. With these variables set, NumPy performs better than without them.
I'm compiling ATLAS with LAPACK and linking it with NumPy. I'll check the performance impact of this setup.
Is there some flag to measure the execution time spent by Deep SORT?
You can use the verbose `-v` flag and check the association time in seconds at the end.
Any update?
Hi Alexis, I'm pretty busy these days with some datasets and I haven't had time for this problem, but it is in my notes to try to solve it in the next few days. I managed to compile ATLAS, but it is not the best option because there are already more recent libraries, such as OpenBLAS, that should be used instead. I need to debug the code and see which part is responsible for the high resource consumption.
Another question: I'm trying to ReID people and cars. For ReID I will need one model for people and another for cars. With YOLO I can detect people and cars at once, but to extract features I will need two inferences, one for cars and another for people. I need direction on how to implement this; if you can help me with ideas I will be grateful.
No rush. Thanks for investigating the issue.
But to extract features I will need two inferences, one for cars and another for people. I need direction on how to implement this; if you can help me with ideas I will be grateful.
The tracker logic currently associates all detections from different classes at once. You can split it up and associate by class, which requires the most change. Or you can create two FeatureExtractor instances and extract ReID features by class, but this might be slow. FPS-wise, it's recommended to train your ReID network on both classes if possible.
Thanks for the feedback.
I think the most reasonable solution is to split by class, due to the performance issues.
https://github.com/JDAI-CV/fast-reid/issues/329#issuecomment-751968321
I know it is a big change but it is a price that I will have to pay.
For more classes, it's not a scalable solution. I don't think model input size is an issue because images of different resolutions have to be resized anyway. You can use 256x256 for both person and vehicle ReID. Since you are running on a desktop and only have two classes, it's fine to split by class.
I suggest you create two FeatureExtractor instances, split the detections from YOLO by class, and feed them into the corresponding FeatureExtractor. Finally, concatenate the feature vector output at the end. This is probably the easiest way and you don't have to modify the association step.
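The suggestion above can be sketched as follows. Note this is illustration only: `FeatureExtractor` here is a dummy stand-in for FastMOT's class, and the class-ID layout (0 = person, everything else = vehicle) is an assumption:

```python
import numpy as np

class FeatureExtractor:
    """Dummy stand-in for FastMOT's extractor; a real one runs a ReID network."""
    def __init__(self, dim, fill):
        self.dim = dim
        self.fill = fill  # constant dummy embedding value

    def __call__(self, frame, boxes):
        # A real extractor would crop `boxes` from `frame` and run inference.
        return np.full((len(boxes), self.dim), self.fill, dtype=np.float32)

def extract_features(frame, detections, person_ext, vehicle_ext):
    """Split detections by class, run each group through its own extractor,
    and reassemble the embeddings in the original detection order."""
    labels = np.array([det["label"] for det in detections])
    person_idx = np.flatnonzero(labels == 0)
    vehicle_idx = np.flatnonzero(labels != 0)
    out = np.empty((len(detections), person_ext.dim), dtype=np.float32)
    out[person_idx] = person_ext(frame, [detections[i]["tlbr"] for i in person_idx])
    out[vehicle_idx] = vehicle_ext(frame, [detections[i]["tlbr"] for i in vehicle_idx])
    return out
```

Because the embeddings come back in the original detection order, the association step downstream does not need to change.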
FYI, the recent commit e33596afc8d8bb9665177415dc47e85298954c48 includes the two OpenBLAS flags in the Dockerfile. This significantly improves FPS, by more than 2x.
For those that visit this issue in the future, these two parameters should be raised first if tracking fast small objects doesn't work: https://github.com/GeekAlexis/FastMOT/blob/2b0e531009d716994230a995ac783c85f728c392/cfg/mot.json#L58-L59
Closing this now since the original issue is resolved. Feel free to open a new issue for other questions.
Hi @GeekAlexis, I found the root cause of my CPU getting stuck at 100%.
Issues:
(solved) Part of the problem was that NumPy conflicts with OpenBLAS multi-threading, so the best option was to disable multi-threading in OpenBLAS. https://github.com/xianyi/OpenBLAS/wiki/faq#multi-threaded
FastMOT resizes the input frame on the CPU here in VideoIO. I was processing 4K video (3840x2160 @ 60 FPS); the i7-8700 CPU can handle it but gets stuck at 100%.
We can improve this piece of code by encoding, decoding, and resizing on the GPU instead of the CPU.
Starting with OpenCV 4.5.2, new properties were added to control hardware acceleration modes for video decoding and encoding tasks:
https://github.com/opencv/opencv/wiki/Video-IO-hardware-acceleration
As you are using a GStreamer pipeline, we can explore nvcodec in GStreamer: https://gstreamer.freedesktop.org/documentation/nvcodec/index.html?gi-language=c
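As a sketch, a GStreamer pipeline string using the nvcodec plugin could move H.264 decoding and scaling to the GPU. The element names (`nvh264dec`, etc.) assume the plugin is installed; verify with `gst-inspect-1.0 nvcodec`:

```python
# Hypothetical NVDEC pipeline for an MP4/H.264 source; adjust the demuxer
# and parser for other containers/codecs.
def nvdec_pipeline(path, width, height):
    return (
        f"filesrc location={path} ! qtdemux ! h264parse ! nvh264dec ! "
        f"videoscale ! video/x-raw,width={width},height={height} ! "
        "videoconvert ! video/x-raw,format=BGR ! appsink sync=false"
    )

# With an OpenCV build that has GStreamer support:
# cap = cv2.VideoCapture(nvdec_pipeline("input.mp4", 1280, 720), cv2.CAP_GSTREAMER)
```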
I'm building OpenCV 4.5.3 with FFmpeg 4.4 compiled with NVDEC.
P.S. Do not use the master branch of FFmpeg, due to a bug with OpenCV.
About FFmpeg and NVDEC: https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/
Performance of decode/encode with NVDEC: https://developer.nvidia.com/nvidia-video-codec-sdk
Just be aware that non-enterprise GPU cards have a limited number of NVDEC units and concurrent sessions: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
This can really improve FastMOT's performance when decoding/resizing/encoding at high resolutions.
Cheers, Levi
@levipereira The environment can be hard to set up with nvcodec. It would be great if you can provide a working version of Dockerfile with GPU accelerated FFMPEG.
It works and is fast at decoding/encoding. My next step is to test decoding and resizing on the GPU; this can really help us. My Dockerfile is a mess and I need to clean it up before sending it to you. I'll try to build a small image with OpenCV, FFmpeg, and GStreamer with hardware acceleration enabled.
Hi @GeekAlexis, bad news... I did tons of tests using OpenCV with FFmpeg NVDEC on an RTX 2060, and I could only get 65 FPS (70% GPU utilization) with 4K video, while on the i7-8700 CPU I got 170 FPS (100% utilization). Using FFmpeg with NVDEC alone I got 220 FPS. I think there is some issue between FFmpeg and OpenCV when using hardware acceleration, so I don't think OpenCV with NVDEC is worth implementing for now, though I have not tested encoding (NVENC).
These libraries seem interesting:
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html https://docs.nvidia.com/jetson/l4t-multimedia/group__LibargusAPI.html
Hi @levipereira, could you please share the code for splitting the ReID between two classes? Also, please let me know the performance.
@noorafathima7 The feature is already supported on the master branch. You can add multiple ReID models for different classes. FPS will drop a bit compared to a single ReID model.
@GeekAlexis, thanks for the reply. My intention is to track cars, buses, trucks, bicycles, and humans. Are 2 ReID models enough, e.g., one for vehicles (say, VERI-Wild) and the OSNet ReID model for humans? If yes, how will the model associate the tracking ID with the detected object, since it can identify only 2 classes, i.e., vehicle and human?
In that case I suggest you split the detections between vehicles and humans so that you can use only 2 ReID models. This only works if all your vehicle class IDs are consecutive.
Assuming class IDs are in the same order you listed them, you can write a custom find_split_indices() to split between bicycle and human here:
https://github.com/GeekAlexis/FastMOT/blob/9aee101b1ac83a5fea8cece1f8cfda8030adb743/fastmot/mot.py#L146-L148
Also get rid of the assertion: https://github.com/GeekAlexis/FastMOT/blob/9aee101b1ac83a5fea8cece1f8cfda8030adb743/fastmot/mot.py#L84-L85
FastMOT does not associate objects with different classes unless you remap different class IDs to the same one.
Thank you! will try this.
Hi @GeekAlexis, the custom find_split_indices() function you mentioned above is to select the feature extractor model based on the detections, right?
my class ids are in this order: 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck',
Correct, you can split at class ID 1 so that the rest of the classes use the vehicle feature extractor.
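Under the class order listed above (0 = person, 1 = bicycle, 2+ = vehicles), the split could be sketched like this. The real find_split_indices() in fastmot/mot.py may have a different signature; this only shows the idea, that with class labels sorted ascending, everything from class ID 1 upward is routed to the vehicle feature extractor:

```python
import numpy as np

def find_split_indices(labels):
    """Given class labels sorted ascending, return the index where the
    person group ends and the bicycle/vehicle group begins (class ID >= 1)."""
    return int(np.searchsorted(np.asarray(labels), 1))
```

For example, labels `[0, 0, 1, 2, 5]` split at index 2: the first two detections go to the person extractor, the rest to the vehicle extractor.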
Hi @GeekAlexis, I loved your project and included it in my studies. I'm doing vehicle tracking, calculating real-time speed, and other types of analysis. A crucial part of my project is object detection and tracking, and FastMOT is perfect for starting it. I've done all the YOLOv4 Darknet detection training and I'm going to do Fast-ReID training. I did some tests with aerial video footage and I have some questions. I used OSNet025 as the feature extractor; apparently it works very well.
However, with small objects moving at high speed, the object changes its identity very often; I believe this is due to the KLT optical flow tracking.
Do you have any clue as to how I can fix this?
Check the output videos: Output Video Here