How to detect smaller objects with custom object datasets

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

http://pjreddie.com/darknet/

How to detect smaller objects with custom object datasets #6426

Closed praneeth0609 closed 1 year ago

praneeth0609 commented 4 years ago

Hello @AlexeyAB, first of all, thank you for the great work. I need your advice. I'm currently working on custom object detection (birds, etc.) and I also need to detect small objects, but I am not able to. What do I need to do to detect small objects? Here are my questions:

How many images are required per class? (I am using 2 classes.)
What do I have to do for smaller object detection?
The object should be detected even when it is very small in the frame. Any suggestions for this?
Taking one bird as an example, how many images are required for each viewing angle of the bird (front, back, top, bottom)?
What is the minimum object size that I can detect using YOLO?
What changes do I have to make to my custom object dataset for smaller object detection?
Any suggestions for improving the mAP?

praneeth0609 commented 3 years ago

@stephanecharette thank you for your reply. If I go with an RTX 2080 Super 8 GiB, or with 12 GiB, can we expect good results?

Also, I am unable to detect smaller objects with YOLOv4. Please suggest some tips for the YOLOv4 configuration file parameters.

Thank you.

stephanecharette commented 3 years ago

Like I said, the amount of memory won't change the results. The only thing you can do with more memory is tweak the number of images loaded into memory during training, thus cutting down the time it takes to train. The results will be 100% the same.

If you cannot detect things, then you may want to consider switching to the 3-layer tiny model. And tell us:

the size of the network
the size of the original images
the size of the objects you are trying to detect

praneeth0609 commented 3 years ago

@stephanecharette thanks for your reply.

As you asked in the post above, here are my details.

size of the network : 608x608
size of the original image: 1920x1080
the size of the object: 20x20/30x30/20x30 in 1920x1080 image.

Here is another combination I am testing:

size of the network : 416x416
size of the original image: 1920x1080
the size of the object: 30x40/40x40 in 1920x1080 image.

In both cases I am using YOLOv4 object detection. Please let me know what is possible and what procedure I should follow for the scenarios above.

stephanecharette commented 3 years ago

When you resize your 1920x1080 image to 608x608 (did you choose to not maintain your aspect ratio?), a 20x20 object is resized to roughly 6x11 px. I think it is highly unlikely you'll get good results with that.

With your 416x416 network, a 30x40 object within a 1920x1080 image will measure ~6x15 px.
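The arithmetic behind those numbers can be checked with a short Python sketch. The function name `scaled_object_size` and the `keep_aspect` flag are made up for illustration and are not part of Darknet; the sketch only assumes the image is scaled independently on each axis to the network dimensions, or by one uniform factor when the aspect ratio is kept.

```python
def scaled_object_size(image_wh, network_wh, object_wh, keep_aspect=False):
    """Return the object's (width, height) in pixels once the image has
    been resized to the network dimensions.

    keep_aspect=False stretches each axis independently.
    keep_aspect=True uses one uniform scale so the whole image fits.
    """
    scale_x = network_wh[0] / image_wh[0]
    scale_y = network_wh[1] / image_wh[1]
    if keep_aspect:
        scale_x = scale_y = min(scale_x, scale_y)
    return object_wh[0] * scale_x, object_wh[1] * scale_y


if __name__ == "__main__":
    image = (1920, 1080)
    # The two combinations from the thread: (network size, object size).
    for network, obj in [((608, 608), (20, 20)), ((416, 416), (30, 40))]:
        w, h = scaled_object_size(image, network, obj)
        print(f"{obj[0]}x{obj[1]} at {network[0]}x{network[1]}: ~{w:.1f}x{h:.1f} px")
    # prints:
    # 20x20 at 608x608: ~6.3x11.3 px
    # 30x40 at 416x416: ~6.5x15.4 px
```

With `keep_aspect=True`, the 20x20 object at 608x608 comes out at about 6.3x6.3 px, so keeping the aspect ratio does not make the object any wider.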

praneeth0609 commented 3 years ago

@stephanecharette thanks a lot for your reply.

As you mentioned, we can detect objects even when they are 6x11 pixels in a 608x608 image, but I am not able to detect them. Can you help me further?

I am attaching my YOLOv4 config file here. If possible, please check the .cfg file and let me know what changes should be made. If you want to check my weights files, I will share those as well.

yolov4.zip

stephanecharette commented 3 years ago

> as you mentioned , we can detect the objects even it has 6x11 pixels in 608x608 image. but am not able to detect. can you help me further..?

In the same sentence you say you can but you cannot detect? What do you mean?

> can you help me further..?

I would try the configuration file AlexeyAB has made specifically to help people who want to detect small objects: yolov4-tiny-3l.cfg.

Other than that, another option I've used is to crop the image to keep only the area of interest. I have an example of that here: https://youtu.be/7yN044S4UZw

And lastly, I incorporated tiling large images for a project in the past where I couldn't crop or increase the network dimensions. Support for tiling was added to DarkHelp specifically to help with this. You can see how that works here: https://www.ccoderun.ca/darkhelp/api/Tiling.html
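To illustrate the idea behind tiling, here is a minimal Python sketch. It is not DarkHelp's implementation or API; the function `tile_rects` and its rounding rule are assumptions chosen for illustration. It splits a large image into a grid of tiles that are each close to the network size, so every tile is scaled far less than the full frame would be.

```python
import math


def tile_rects(image_w, image_h, network_w, network_h):
    """Split an image into a grid of similarly sized tiles, each roughly
    the size of the network, and return them as (x, y, w, h) rectangles."""
    # Hypothetical rule: pick the grid whose tiles are closest to network size.
    cols = max(1, round(image_w / network_w))
    rows = max(1, round(image_h / network_h))
    tile_w = math.ceil(image_w / cols)
    tile_h = math.ceil(image_h / rows)
    rects = []
    for r in range(rows):
        for c in range(cols):
            x, y = c * tile_w, r * tile_h
            # The last row/column is clipped so tiles never leave the image.
            rects.append((x, y, min(tile_w, image_w - x), min(tile_h, image_h - y)))
    return rects


if __name__ == "__main__":
    rects = tile_rects(1920, 1080, 608, 608)
    print(len(rects), "tiles, first tile:", rects[0])
    # prints: 6 tiles, first tile: (0, 0, 640, 540)
```

For the 1920x1080 image and 608x608 network discussed above, each 640x540 tile is scaled by about 0.95 horizontally and 1.13 vertically, so a 20x20 object stays close to 19x23 px instead of shrinking to about 6x11 px. This sketch does not handle an object that straddles a tile boundary and gets split across tiles.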