stark-t / PAI

Pollination_Artificial_Intelligence

YOLOv5 - inference/detection time when using a single CPU for nano and small weights, 1280 img size #42

Closed valentinitnelav closed 1 year ago

valentinitnelav commented 2 years ago

I was actually curious to find out the inference/detection speed of the nano and small models when using a single CPU and 4 GB of RAM (which roughly simulates a powerful smartphone, I presume).
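For reference, a run like this can be reproduced with the stock YOLOv5 detect.py CLI; pinning the thread count via OMP_NUM_THREADS is my assumption for forcing single-CPU execution, and yolov5n.pt / yolov5s.pt are the standard nano/small checkpoints:

```shell
# Hedged sketch: single-CPU YOLOv5 inference (flags per the stock detect.py CLI).
# OMP_NUM_THREADS=1 pins PyTorch's CPU math libraries to one thread;
# swap yolov5n.pt for yolov5s.pt to test the small model.
OMP_NUM_THREADS=1 python detect.py \
    --weights yolov5n.pt \
    --source Diptera_Syrphidae_Blera_fallax_2802649646_1111831.jpeg \
    --img 1280 \
    --device cpu
```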

For example, for a test image of 2048 x 2048 pixels (Diptera_Syrphidae_Blera_fallax_2802649646_1111831.jpeg), which is resized to 1280 x 1280 at detection time, the speeds are:

I think we might need some nano GPU for the field image data collection.

I am now running a batch test on an unseen Syrphidae dataset of 1091 images (of varying resolutions) to get average results.

nano: job finished in 11:45 min = 705 sec => 705 sec / 1091 img => 0.646 sec/img on average (may be higher for higher-resolution images)

2022-07-19T09:56:18: Slurm Job_id=3211305 Name=yolov5_infer_cpu Ended, Run time 00:11:45, COMPLETED, ExitCode 0 Speed: 4.6ms pre-process, 593.7ms inference, 1.5ms NMS per image at shape (1, 3, 1280, 1280)

small: job finished in 32:04 min = 1924 sec => 1924 sec / 1091 img => 1.764 sec/img on average (may be higher for higher-resolution images)

2022-07-19T10:16:37: Slurm Job_id=3211306 Name=yolov5_infer_cpu Ended, Run time 00:32:04, COMPLETED, ExitCode 0 Speed: 3.9ms pre-process, 1714.9ms inference, 1.0ms NMS per image at shape (1, 3, 1280, 1280)
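The per-image averages above are just the Slurm wall time divided by the image count; a quick sanity check of the arithmetic (times and counts taken from the log lines above):

```python
def avg_sec_per_img(minutes: int, seconds: int, n_images: int) -> float:
    """Average wall-clock seconds per image for a batch job."""
    total_sec = minutes * 60 + seconds
    return total_sec / n_images

# 1091-image unseen Syrphidae test set
nano = avg_sec_per_img(11, 45, 1091)   # 705 sec total
small = avg_sec_per_img(32, 4, 1091)   # 1924 sec total
print(f"nano:  {nano:.3f} sec/img")    # → 0.646
print(f"small: {small:.3f} sec/img")   # → 1.764
```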

stark-t commented 2 years ago

Ok... do you think that for the field images, and especially for the insect detector (selecting only images with insects), a YOLOv5 nano at 640 pixels would be possible, or should we actually take a closer look at even smaller object detectors?

valentinitnelav commented 2 years ago

I think having models at a resolution of 640 x 640 would be more pragmatic. In the paper we can also compare their inference speed and accuracy against the 1280 results.
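As a rough back-of-the-envelope estimate (my assumption: CPU inference time scales approximately with pixel area), dropping from 1280 to 640 should cut inference time by about 4x, which would put the nano model well under 0.2 sec/img:

```python
def scaled_inference_ms(ms_at_ref: float, ref_px: int, target_px: int) -> float:
    """Rough estimate, assuming inference time scales with pixel area."""
    return ms_at_ref * (target_px / ref_px) ** 2

# measured: 593.7 ms/img for nano at 1280 on a single CPU (batch run above)
print(scaled_inference_ms(593.7, 1280, 640))   # ≈ 148.4 ms
```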

I am worried that looking into even smaller models like FOMO would stretch our time for now; this can also be done later in follow-up papers. Just setting up the environments on the cluster already ate a lot of time, and I am still waiting for support with detectron2.

The big bottleneck for the field images will rather be on the optics side: lenses not properly focused, a suboptimal distance from the flower so that insects occupy a small proportion of the image and become undetectable, etc. A lot of this will have to be solved through a collaboration with someone who can develop a custom camera system for us (which might involve a nano GPU as well).

valentinitnelav commented 1 year ago

See #61 now