Hardware Configuration and .cfg network for 2 live Cameras feeds

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

http://pjreddie.com/darknet/

Hardware Configuration and .cfg network for 2 live Cameras feeds #4169

Open danimod92 opened 5 years ago

danimod92 commented 5 years ago

Hello, I am working on object detection with YOLOv3 on live camera streams. I am currently using a Jetson Xavier kit and running 2 darknet instances (built with GPU, CUDNN, CUDNN_HALF and OPENCV set to 1). I want to run object detection on 2 live camera streams at 1920x1080. I get about 15 FPS in total (around 7-8 FPS per instance). In the best case I would like to reach 60 FPS (30 FPS per camera), but even a boost of a few FPS would be nice. I'm running YOLOv3 with a 416x416 input size in the .cfg file. I tried YOLOv3-tiny, but although the FPS increased nicely (around 13 FPS per camera), the accuracy is nowhere near that of YOLOv3 (the camera is placed far away and the objects to detect are not big). My questions are: Since I'm using standard YOLOv3 and I can't train any model on the Xavier, is buying an RTX 2060 a good way to get at least some FPS boost? And since I could then train my own model, which .cfg files are suitable for this kind of task? I read this thread but don't know whether it applies to my task: https://github.com/AlexeyAB/darknet/issues/3361

AlexeyAB commented 5 years ago

Since I'm using standard YOLOv3 and I can't train any model on the Xavier, is buying an RTX 2060 a good way to get at least some FPS boost?

Do you want to use the RTX 2060 for detection? The RTX 2060 is 5x faster than the Jetson Xavier and will process 5x more FPS.
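As a rough sanity check of that estimate, the per-camera frame rate can be worked out from the figures in this thread. This is only a sketch: the helper name `per_camera_fps` is hypothetical, and it assumes throughput scales linearly with the quoted 5x factor and is split evenly between cameras, which ignores capture, decoding and any per-frame post-processing.

```python
def per_camera_fps(total_fps: float, cameras: int, speedup: float = 1.0) -> float:
    """Estimate FPS per camera when total throughput is shared evenly.

    Assumes (optimistically) that detection throughput scales linearly
    with `speedup` and that nothing else becomes the bottleneck.
    """
    if cameras <= 0:
        raise ValueError("cameras must be positive")
    return total_fps * speedup / cameras


# Figures from this thread: about 15 FPS total across 2 instances on the Xavier.
current = per_camera_fps(15, 2)
# Same workload with the 5x factor quoted for the RTX 2060.
projected = per_camera_fps(15, 2, speedup=5.0)
print(f"current: {current} FPS per camera, projected: {projected} FPS per camera")
```

Under these assumptions the projected figure is above the 30 FPS per camera target, but real results depend on the rest of the pipeline.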

You can use RTX 2060 for training.

For small objects:

you can use the already trained yolov3-tiny.cfg with width=832 height=832 set in the cfg file,
or you can try to train https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny_3l.cfg
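Changing the input size means editing `width` and `height` in the `[net]` section of the cfg file. A minimal sketch of doing that programmatically is below. The helper `set_network_size` is hypothetical and not part of darknet, and it assumes the usual `key=value` layout with `[section]` headers; the multiple-of-32 check reflects the constraint documented for YOLO input sizes.

```python
def set_network_size(cfg_text: str, width: int, height: int) -> str:
    """Return cfg text with width/height replaced inside the [net] section only."""
    if width % 32 or height % 32:
        raise ValueError("width and height must be multiples of 32")
    out = []
    section = None
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]          # entering a new section
        elif section == "net":
            key = stripped.split("=", 1)[0].strip()
            if key == "width":
                line = f"width={width}"
            elif key == "height":
                line = f"height={height}"
        out.append(line)
    return "\n".join(out)


# Toy cfg fragment for illustration; a real yolov3-tiny.cfg has many more keys.
sample = "[net]\nbatch=1\nwidth=416\nheight=416\n\n[convolutional]\nfilters=16\n"
print(set_network_size(sample, 832, 832))
```

Lines outside `[net]` and commented-out keys are left untouched, so the rest of the network definition is unchanged.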

danimod92 commented 5 years ago

Thank you for the fast reply! I want to use it mainly for detection, but occasionally also for training on my custom dataset later on. I already tried yolov3-tiny.cfg with that size and even higher, but it detects almost nothing, just sporadic objects (which are supposed to be persons and cars/trucks/bicycles). I guess this is because of shadows and the small size of the objects (though it runs at roughly 20 FPS).

ttomcat75 commented 5 years ago

In my experience, for small objects (15x15 pixels and larger in a Full HD stream) it is better to train/detect with Tiny or 3l at a higher resolution than with the full Yolo model at a lower resolution. (mAP = 91% for 5 classes)

I use an RTX 2060 for detection in Full HD (1920x1088 pixels set in the .cfg file) at a frame rate of about 15 FPS, and the RTX Titan and RTX 2080 Ti for training. With the RTX 2080 Ti I get about 50-60 FPS for detection in Full HD, but it is more expensive for production use.
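To see why resolution matters here, it helps to work out how large a 15x15 pixel object is once the frame has been resized to the network input. The sketch below uses a hypothetical helper and assumes the frame is simply stretched to the network size; letterboxing would give different numbers.

```python
def resized_object_size(obj_w, obj_h, frame_w, frame_h, net_w, net_h):
    """Size in pixels of an object after the frame is stretched to net_w x net_h."""
    return obj_w * net_w / frame_w, obj_h * net_h / frame_h


# A 15x15 px object in a 1920x1080 frame at three network input sizes.
for net_w, net_h in [(416, 416), (832, 832), (1920, 1088)]:
    w, h = resized_object_size(15, 15, 1920, 1080, net_w, net_h)
    print(f"{net_w}x{net_h}: {w:.2f} x {h:.2f} px")
```

At 416x416 the object shrinks to only a few pixels across, while at 1920x1088 it keeps essentially its original size.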

danimod92 commented 5 years ago

The objects I want to detect are not really small; they only appear small when they are far from the camera. Here is a sample frame: [image: cropped]

@ttomcat75 which .cfg did you use for training your model, and what width/height? I will probably train with around 5-6 classes. Did you get 15 FPS with only one instance, or did you run 2 instances as I intend to? Unfortunately I can't use an RTX Titan or a higher-end GPU, mainly because project/work constraints require me to work either with SBC devices (like the Nvidia Jetson series) or at most with a mini PC (not a desktop). Hence the machine with the most GPU compute available to me has only an RTX 2060.

SouradipBh commented 5 years ago

@danimod92 do you know how to run more than 3 instances? Where do I have to change that for it to take effect?

danimod92 commented 5 years ago

@SouradipBh what do you mean by 'instances'? I just meant opening another terminal and running darknet there.
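The "one terminal per instance" approach can also be scripted. The sketch below only builds the command lines and provides an optional launcher. The file paths are placeholders, the helper names are hypothetical, and the `detector demo ... -c <index>` form follows the webcam example in the repository README, so check it against your own build before relying on it.

```python
import subprocess


def demo_command(camera_index, darknet="./darknet", data="cfg/coco.data",
                 cfg="cfg/yolov3.cfg", weights="yolov3.weights"):
    """Build the argv list for one darknet demo instance on one camera.

    All default paths are placeholders for illustration.
    """
    return [darknet, "detector", "demo", data, cfg, weights,
            "-c", str(camera_index)]


def launch(camera_indices):
    """Start one darknet process per camera and return the Popen handles."""
    return [subprocess.Popen(demo_command(i)) for i in camera_indices]


# Print the two commands instead of running them.
for index in (0, 1):
    print(" ".join(demo_command(index)))
```

Each process loads its own copy of the network, so GPU memory use grows with the number of instances.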

ttomcat75 commented 5 years ago

I use the "normal" tiny .cfg file with the following main modifications:

width=1920
height=1088
angle=30
flip=1
letter_box=1
truth_thresh=1
random=0

But these settings will not necessarily be better for you. Given the size of your objects, I do not think such a high resolution is necessary (it needs a lot of memory and processing time). I use Yolo to track and classify birds with a PTZ camera, among other things: https://youtu.be/XndlJ42HCEg
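The height of 1088 rather than 1080 is presumably there because the network input size has to be a multiple of 32. A small sketch of that rounding, using a hypothetical helper name:

```python
def round_up(value: int, multiple: int = 32) -> int:
    """Round value up to the nearest multiple (ceiling division, integers only)."""
    return -(-value // multiple) * multiple


# Full HD frame: the width already fits, the height has to be padded.
print(round_up(1920), round_up(1080))
```

For a 1920x1080 stream this gives 1920x1088, matching the values above.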

Did you get 15 FPS with only one instance, or did you run 2 instances as I intend to?

Only with one instance and FullHD resolution above.

Which Jetson module do you use? I know too little about it for image processing, but for radar signal processing at 6 GBit/s the TX2 was very powerful as an embedded solution. I would take a look at the AGX Xavier, which could hold a candle to the RTX 2060.
How big is your training dataset? You need at least 2,000 labeled samples per class for good performance.

danimod92 commented 5 years ago

I watched the video and it runs so smoothly, with nice accuracy! Yes, the objects I want to detect are not really small, as you saw in the picture I posted, so I was thinking of just using the width/height Alexey suggested. For now I'm going to try training the network on Google Colab, as I have no other option at the moment.

1) I have the AGX Xavier 16GB set to the "highest" performance configuration. Inference with one instance is not that bad (although the script is fairly heavy, because it processes a lot of information depending on the objects it detects).
2) So far I have only used the standard COCO dataset. I'm going to use a subset of COCO by extracting only people and cars/trucks/motorbikes, so the training data is quite large.
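Whether a COCO subset meets the 2,000-per-class rule of thumb mentioned earlier can be checked by counting annotations per category. The sketch below works on a dict in the COCO annotation layout (`categories` with `id`/`name`, `annotations` with `category_id`). The helper names and the toy data are hypothetical, and it counts object annotations rather than images.

```python
from collections import Counter


def count_annotations(coco: dict, wanted_names: set) -> Counter:
    """Count annotations per category name, restricted to wanted_names."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]
                  if c["name"] in wanted_names}
    # Start every wanted class found in the file at zero so gaps stay visible.
    counts = Counter({name: 0 for name in id_to_name.values()})
    for ann in coco["annotations"]:
        name = id_to_name.get(ann["category_id"])
        if name is not None:
            counts[name] += 1
    return counts


def classes_below(counts: Counter, minimum: int = 2000) -> list:
    """Names of classes with fewer than `minimum` annotations, sorted."""
    return sorted(name for name, n in counts.items() if n < minimum)


# Toy data for illustration; a real file would be loaded with json.load().
toy = {
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"},
                   {"id": 18, "name": "dog"}],
    "annotations": [{"category_id": 1}, {"category_id": 1},
                    {"category_id": 3}, {"category_id": 18}],
}
counts = count_annotations(toy, {"person", "car"})
print(dict(counts), classes_below(counts, minimum=2))
```

Note that COCO's category names may differ from everyday terms (for example it uses "motorcycle"), so the names passed in should be checked against the `categories` list of the actual file.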