How to detect small objects from far away using YOLOv9

WongKinYiu / yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

GNU General Public License v3.0


How to detect small objects from far away using YOLOv9 #308

Open hammad2008 opened 8 months ago

hammad2008 commented 8 months ago

Question: I have a video in which I want to detect a person moving through greenery. Can anybody help me detect them?

I have circled the person I want to detect in the image.

[Image 7963d20e-0f13-456e-8a60-7b111065effc: frame with the person circled]

Here is my video

https://github.com/WongKinYiu/yolov9/assets/75312856/afae151f-d75a-445d-9a49-2972a812d367

levipereira commented 8 months ago

You can try https://github.com/obss/sahi

hammad2008 commented 8 months ago

@levipereira I have gone through this, but there is no existing code for running SAHI with YOLOv9 on video. Can you help me with that?

levipereira commented 8 months ago

Try increasing the network resolution (e.g. from 640 to 1280) and using a larger model such as YOLOv9-E.
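To see why the input resolution matters for small objects, here is a rough back-of-the-envelope sketch. It assumes the frame is resized with its aspect ratio preserved so that the longer side equals the network size; the 1920x1080 frame and the 20 px person are made-up numbers, and `apparent_size` is a hypothetical helper, not part of YOLOv9.

```python
def apparent_size(obj_px: float, frame_w: int, frame_h: int, imgsz: int) -> float:
    """Side length (px) of an object after an aspect-preserving resize
    that makes the frame's longer side equal to imgsz."""
    scale = min(imgsz / frame_w, imgsz / frame_h)
    return obj_px * scale


if __name__ == "__main__":
    # Hypothetical numbers: a 1920x1080 frame and a person about 20 px tall.
    for imgsz in (640, 1280):
        print(imgsz, round(apparent_size(20, 1920, 1080, imgsz), 1))
```

With these assumed numbers the person shrinks to roughly 7 px at 640 and roughly 13 px at 1280, so doubling the network size doubles the pixels the model gets to work with.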

hammad2008 commented 8 months ago

I have tried that, but the results are not promising. Can you please help with using SAHI with YOLOv9?

dsbyprateekg commented 8 months ago

@hammad2008 you should use Hank.ai's Darknet/YOLOv4 instead of other versions for small object detection. Check this video: https://www.youtube.com/watch?v=861LvUXvJmA&t=14s

The benefit of using Darknet/YOLOv4 is shorter training time and better results.

guillermo-gabrielli-fer commented 8 months ago

Consider fine-tuning on VisDrone. It has many small person annotations (it has both a pedestrian and a people class; I'm not sure what exactly the difference is).

SAHI should be adaptable to YOLOv9. There is a PR for that in their repository, but it doesn't pass all checks yet.

JFMeyer2k commented 8 months ago

Hey hammad08,

I had the same issue. The video I work with ranges between 2K and 4K, and the objects I want to detect are often below 10x10 px in the original image. I wanted to avoid the hassle of adapting SAHI myself, especially since YOLO is updated regularly and I did not want to keep patching SAHI (you probably only have to do it once, but I ended up doing my own thing).

I wrote a function that splits the input images into tiles (currently 640x640 px) so that the input image does not get rescaled. I train the YOLOv9-E model on those tiles, run predictions on tiles, and piece the results back together afterward. That is more or less what SAHI does, but since I only track a few objects, the code that does the transformations is minimal and maintainable. I also tested and compared this approach with YOLOv4, YOLOv7, and YOLOv9, and I saw significant improvements with each newer YOLO version.
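For anyone who wants to try the tiling idea, here is a minimal sketch of it. This is not JFMeyer2k's code: the function names, the 64 px overlap, and the NMS-based merge are all assumptions made for illustration. `detect_tile` is a hypothetical callback that you would write yourself, for example by cropping `frame[y0:y1, x0:x1]`, running your YOLOv9 model on the crop, and returning boxes as `(x1, y1, x2, y2, score)` in tile-local pixels.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score
Window = Tuple[int, int, int, int]              # x0, y0, x1, y1


def tile_windows(width: int, height: int, tile: int = 640, overlap: int = 64) -> List[Window]:
    """Windows of at most tile x tile pixels that together cover the frame."""
    def starts(size: int) -> List[int]:
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, tile - overlap))
        pos.append(size - tile)  # last window sits flush with the border
        return pos
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


def to_frame_coords(boxes: List[Box], x0: int, y0: int) -> List[Box]:
    """Shift tile-local boxes by the tile origin to get full-frame coordinates."""
    return [(a + x0, b + y0, c + x0, d + y0, s) for a, b, c, d, s in boxes]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_thres: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box among heavily overlapping ones."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_thres for k in kept):
            kept.append(box)
    return kept


def detect_tiled(width: int, height: int,
                 detect_tile: Callable[[int, int, int, int], List[Box]],
                 tile: int = 640, overlap: int = 64,
                 iou_thres: float = 0.5) -> List[Box]:
    """Run detect_tile on every window and merge the results in frame coordinates."""
    merged: List[Box] = []
    for x0, y0, x1, y1 in tile_windows(width, height, tile, overlap):
        merged += to_frame_coords(detect_tile(x0, y0, x1, y1), x0, y0)
    return nms(merged, iou_thres)
```

For video you would call `detect_tiled` once per frame. The overlap is there so that an object cut by one tile border is fully visible in a neighbouring tile, and the NMS step removes the duplicates that the overlap produces. This sketch handles a single class; with several classes you would run the NMS per class.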

hammad2008 commented 8 months ago

@JFMeyer2k That's great. Can you share the code with me so I can test it too?

icaroryan commented 7 months ago

@JFMeyer2k Great job! Could you share the code?