AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Titan V vs. GTX 1080Ti for real-time inference on HD video at 50fps #1558

Open endo123 opened 6 years ago

endo123 commented 6 years ago

We intend to do real-time inference on a single HD-resolution video stream at around 50 fps. I am trying to spec out GPUs for the on-site system that will run the inference and re-render the video with the detected objects.

Would GTX 1080Ti be able to keep up or would I need to pick Titan V? Degradation of the output frame rate to 30 fps or so might be acceptable as this is a prototype to be used for demonstrations.

On another thread, I noticed a comparison of these 2 GPUs, but for multiple live streams, so some parallelism was also coming into play in that case.

Any recommendations/pointers are appreciated.

Best, Vineet

P.S: Copying @kmsravindra as he is collaborating w/ me on this project.

AlexeyAB commented 6 years ago

I think GTX 1080Ti should be enough to process FullHD (1920x1080) video at 50 FPS using yolov3.cfg (416x416).

It can already process 2 x 1920x1080 streams at 25 FPS each: https://github.com/AlexeyAB/darknet/issues/1232#issuecomment-405565193

On another thread, I noticed a comparison of these 2 GPUs, but for multiple live streams, so some parallelism was also coming into play in that case.

A single Yolo model can occupy ~95% of the GPU if you use this repository, build with OPENCV=1 CUDNN=1, and have a modern CPU. A bottleneck can then only appear on the CPU side (video decompressing, resizing, saving), and only if you use another repo or a slow CPU.


Titan V is required if you want to achieve ~90 FPS on 1920x1080 video with a 416x416 network size.

endo123 commented 6 years ago

Thanks, @AlexeyAB !

How do the performance requirements vary w/ increase in network size? We may need a bigger network size to allow for better detection of small objects in our dataset.

P.S: @kmsravindra

AlexeyAB commented 6 years ago

How do the performance requirements vary w/ increase in network size?

Performance requirements are linearly proportional to the product network_width x network_height.
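
As a rough illustration of this scaling rule, here is a minimal sketch. The function name `relative_cost` and the 608x608 example size are assumptions made for illustration and do not come from the thread; only the proportionality to width x height does.

```python
def relative_cost(width, height, base_width=416, base_height=416):
    """Return the compute cost of a width x height network relative to the
    base network size, assuming cost is proportional to width * height."""
    return (width * height) / (base_width * base_height)


# 608x608 is an illustrative size, not one discussed in the thread.
ratio = relative_cost(608, 608)
print(f"608x608 costs about {ratio:.2f}x as much as 416x416")
```

Under this rule the expected frame rate drops by the same factor, provided the GPU rather than the CPU is the bottleneck.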

kmsravindra commented 6 years ago

@AlexeyAB, just to confirm: we plan to use an 832x480 network size, whose product is 2.3 times bigger than that of 416x416. So can we approx assume 50/2.3 = 21.7Fps for this network size for 1080Ti?

Also, from your other thread I am assuming yolov2-light - yolov3 would be 1.3 times faster @ 1% mAP trade-off. So using this lighter yolov3 should pump it up to 21.7 * 1.3 = approx 28 FPS?

AlexeyAB commented 6 years ago

@kmsravindra In general yes.

So can we approx assume 50/2.3 = 21.7Fps for this network size for 1080Ti?

Yes. But this rests on the assumption that GTX 1080Ti reaches about 50 FPS on yolov3 416x416, which I didn't test on a GTX 1080Ti.

On the other hand, I got only 32 FPS on Tesla V100 (~Titan V) without Tensor Cores, and 90 FPS on Tesla V100 (~Titan V) with Tensor Cores, so maybe there is a bottleneck somewhere on the GPU, and GPU usage can be less than 90% without Tensor Cores: https://github.com/AlexeyAB/darknet/issues/407


Also, from your other thread I am assuming yolov2-light - yolov3 would be 1.3 times faster @ 1% mAP trade-off. So using this lighter yolov3 should pump it up to 21.7 * 1.3 = approx 28 FPS?

To do this, add the -quantized flag at the end of the command and set the input_callibration= param in your cfg-file: https://github.com/AlexeyAB/yolo2_light/blob/29905072f194ee86fdeed6ff2d12fed818712411/bin/yolov3.cfg#L25
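
The back-of-the-envelope estimate discussed in this comment can be sketched as follows. The function name `estimate_fps` is hypothetical. The 50 FPS baseline is the untested assumption stated above, and the 1.3x speedup is the figure quoted from the other thread. Neither is a measurement on a GTX 1080Ti.

```python
def estimate_fps(baseline_fps, width, height,
                 base_width=416, base_height=416, speedup=1.0):
    """Scale a baseline FPS (measured or assumed at the base network size)
    to another network size, assuming cost grows with width * height.
    `speedup` is an optional multiplier, e.g. for a faster inference path."""
    area_ratio = (width * height) / (base_width * base_height)
    return baseline_fps / area_ratio * speedup


# Assumed inputs: 50 FPS at 416x416 (untested) and a 1.3x speedup.
plain = estimate_fps(50.0, 832, 480)
light = estimate_fps(50.0, 832, 480, speedup=1.3)
print(f"832x480: about {plain:.1f} FPS, with 1.3x speedup about {light:.1f} FPS")
```

The figures differ slightly from the ones in the thread because the area ratio is not rounded to 2.3 here. Treat the result as a planning estimate only, since a GPU-side or CPU-side bottleneck would lower it.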

kmsravindra commented 6 years ago

Thanks for the info @AlexeyAB