I am running inference with tiny-yolo-v3 on Google Colab using the GPU runtime.
The GPU is a Tesla P100-PCIE-16GB.
After running the darknet inference command, the predicted time shown was 0.91 seconds.
From the source code I can see that this timestamp covers only the network's forward pass on the GPU, excluding image pre- and post-processing.
The cells below contain the commands and their output.
I am a little confused by this. I know these GPUs are expensive and deliver good performance, but 0.91 seconds per inference works out to only about 1 frame per second, which seems far too slow.
Can anyone tell me whether I am doing something wrong here?
Or is this the actual performance of these GPUs?
I know inference time depends on many parameters such as network size, but roughly how many frames per second should a GPU achieve on a network like tiny-yolo-v3?
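For context, here is the back-of-the-envelope bound I tried. It is only a sketch: the per-frame FLOP count is the sum of the per-layer BFLOPs darknet prints (see the layer table below), and the peak-throughput figure is the P100's nominal FP32 spec, which real workloads never come close to.

# Rough upper bound, NOT a measurement.
bflops_per_frame = 5.57   # sum of the per-layer BFLOPs from the darknet table below
p100_peak_tflops = 9.3    # nominal FP32 peak of a Tesla P100-PCIE (spec sheet)

ideal_fps = p100_peak_tflops * 1e12 / (bflops_per_frame * 1e9)
print(f"compute-bound upper limit: {ideal_fps:.0f} FPS")       # ~1670 FPS
print(f"even at 10% efficiency: {ideal_fps * 0.1:.0f} FPS")    # ~167 FPS

Even with a very pessimistic efficiency assumption this is two orders of magnitude faster than the ~1 FPS I am measuring, which is why I suspect the reported time is not pure compute.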
from tensorflow.python.client import device_lib

# Confirm the Colab runtime actually exposes a GPU
# (the output lists '/device:GPU:0', so the P100 is visible).
device_lib.list_local_devices()
# Run a single-image detection with tiny-yolo-v3
!./darknet detector test cfg/coco.data cfg/yolov3-tiny.cfg /yolov3-tiny.weights data/dog.jpg
layer filters size input output
0 conv 16 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 16 0.150 BFLOPs
1 max 2 x 2 / 2 416 x 416 x 16 -> 208 x 208 x 16
2 conv 32 3 x 3 / 1 208 x 208 x 16 -> 208 x 208 x 32 0.399 BFLOPs
3 max 2 x 2 / 2 208 x 208 x 32 -> 104 x 104 x 32
4 conv 64 3 x 3 / 1 104 x 104 x 32 -> 104 x 104 x 64 0.399 BFLOPs
5 max 2 x 2 / 2 104 x 104 x 64 -> 52 x 52 x 64
6 conv 128 3 x 3 / 1 52 x 52 x 64 -> 52 x 52 x 128 0.399 BFLOPs
7 max 2 x 2 / 2 52 x 52 x 128 -> 26 x 26 x 128
8 conv 256 3 x 3 / 1 26 x 26 x 128 -> 26 x 26 x 256 0.399 BFLOPs
9 max 2 x 2 / 2 26 x 26 x 256 -> 13 x 13 x 256
10 conv 512 3 x 3 / 1 13 x 13 x 256 -> 13 x 13 x 512 0.399 BFLOPs
11 max 2 x 2 / 1 13 x 13 x 512 -> 13 x 13 x 512
12 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
13 conv 256 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 256 0.089 BFLOPs
14 conv 512 3 x 3 / 1 13 x 13 x 256 -> 13 x 13 x 512 0.399 BFLOPs
15 conv 255 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x 255 0.044 BFLOPs
16 yolo
17 route 13
18 conv 128 1 x 1 / 1 13 x 13 x 256 -> 13 x 13 x 128 0.011 BFLOPs
19 upsample 2x 13 x 13 x 128 -> 26 x 26 x 128
20 route 19 8
21 conv 256 3 x 3 / 1 26 x 26 x 384 -> 26 x 26 x 256 1.196 BFLOPs
22 conv 255 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x 255 0.088 BFLOPs
23 yolo
Loading weights from /content/gdrive/My Drive/Darknet/yolov3-tiny.weights...Done!
data/dog.jpg: Predicted in 0.917487 seconds.
dog: 57%
car: 52%
truck: 56%
car: 62%
bicycle: 59%
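To check whether the 0.91 s is dominated by one-time setup cost (CUDA context creation, weight loading) rather than steady-state inference, I am planning to push several images through a single darknet process and time each one. A rough sketch, assuming the stock darknet behaviour of prompting "Enter Image Path:" when no image argument is given and printing "Predicted in X seconds." as above:

import re
import subprocess

# Feed the same image several times to ONE darknet process, so that
# weight loading and CUDA initialization are paid only once.
cmd = ["./darknet", "detector", "test",
       "cfg/coco.data", "cfg/yolov3-tiny.cfg", "/yolov3-tiny.weights"]
images = ["data/dog.jpg"] * 10

proc = subprocess.run(cmd, input="\n".join(images) + "\n",
                      capture_output=True, text=True)

# Parse the "Predicted in X seconds." line printed for each image.
times = [float(t) for t in
         re.findall(r"Predicted in ([0-9.]+) seconds", proc.stdout)]
print("per-image times:", times)
if len(times) > 1:
    steady = times[1:]  # drop the first (warm-up) run
    avg = sum(steady) / len(steady)
    print(f"steady state: {avg:.4f} s/frame ({1/avg:.1f} FPS)")

If the first time is much larger than the rest, the 0.91 s figure reflects initialization rather than the network's actual throughput.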