AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

How can I improve this detection? (Small fixed-size objects close to each other) #3763

Open Scolymus opened 5 years ago

Scolymus commented 5 years ago

My setup: I have images of about 1500-1600 x 800-900 px, and only 72 of them for training. They are stored as RGB but were recorded with a monochrome camera, so converting them back to grayscale shouldn't matter. Each image contains many objects (from 20 to 200) of roughly 9x9 or 10x10 px, belonging to two classes; to give an idea, if I labelled all the objects I would have around 1500 objects of one class and 3500 of the other.

To detect them I'm using yolov3-tiny with a network size of 960x960. Before training, I split each original image into 2 rows x 4 columns, so each tile is about 400x400 px (not necessarily square) but still contains many 9x9 objects. The training set therefore consists of 576 tiles, and training on it gives me a loss of about 2.

At detection time I feed the full 1500x800 images, using a larger size in the blob function. It detects many of my objects, though not all of them, at a threshold of 0.3. But when the objects touch each other the result is a disaster. I'm particularly interested in not losing objects after they collide, because I want to track them afterwards (they cannot hide behind each other; these are particles moving in 2D). How can I improve the detection? Should I play with the anchors, since the objects always have the same size?
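The tiling step described above (2 rows x 4 columns) is just coordinate arithmetic; a small overlap between tiles helps avoid cutting a 9x9 particle in half at a tile border. This is a minimal sketch, not the asker's actual script: the function name `tile_grid` and the `overlap` margin are my own assumptions.

```python
# Sketch: split a large frame into a rows x cols grid of crop boxes.
# overlap (px) is an assumption: a margin of at least one object
# diameter (~10 px) keeps border particles fully inside some tile.

def tile_grid(width, height, rows=2, cols=4, overlap=10):
    """Return (x0, y0, x1, y1) crop boxes covering the full image."""
    tiles = []
    for r in range(rows):
        for c in range(cols):
            x0 = max(c * width // cols - overlap, 0)
            y0 = max(r * height // rows - overlap, 0)
            x1 = min((c + 1) * width // cols + overlap, width)
            y1 = min((r + 1) * height // rows + overlap, height)
            tiles.append((x0, y0, x1, y1))
    return tiles

# Example: a 1600x800 frame yields 8 tiles of roughly 400x400 px.
boxes = tile_grid(1600, 800)
```

Label coordinates then have to be shifted into each tile's frame; at inference time, detections from overlapping tiles need de-duplication (e.g. NMS) before tracking.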

About my objects: these are particles that are half white and half black, and I want to distinguish them from totally white particles. Sometimes there are also totally black particles, which I haven't labelled. Should I? Since the classes differ only in color, I've disabled the random exposure, hue and saturation augmentations; only the angle augmentation is enabled, set to 5.
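For reference, those augmentation settings live in the `[net]` section of the darknet .cfg; the neutral values below are what "disabled" looks like (a sketch of the relevant fragment only, not a complete `[net]` section):

```ini
[net]
# neutral values disable the color augmentations
saturation = 1.0
exposure   = 1.0
hue        = 0.0
# small random rotation only, as mentioned above
angle      = 5
```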

Image examples: here is an example of these images. The first shows the two classes; in the second I have boxed the two classes in different colors; the third shows the contact effect.

[Images: not squared, squared, collapse1, collapse2]

Thank you!

Scolymus commented 5 years ago

Sorry to interrupt you @AlexeyAB, but I just calculated the anchors and I got this:

```
num_of_clusters = 9, width = 960, height = 960
read labels from 488 images
loaded image: 488  box: 6821
all loaded.

calculating k-means++ ...

iterations = 3

avg IoU = 99.88 %

Saving anchors to the file: anchors.txt
anchors = 25, 23, 25, 23, 25, 23, 25, 23, 25, 24, 25, 23, 24, 24, 25, 24, 25, 24
```

There are only 3 unique anchor pairs, and they are all smaller than 30x30. Should I keep only these 3 unique combinations? And what about the first and second yolo layers: should I remove everything before the third yolo layer in yolov3.cfg, or just replace the anchors of the third layer with these unique values? I was trying the full yolov3.cfg following https://github.com/AlexeyAB/darknet/issues/3198. Any other tips for objects that are close to each other?

Another doubt: I set a width w and height h (960x960), but during training darknet prints "resizing ww x hh", where ww and hh are bigger than w and h. Does this mean it is taking ww and hh as the real width and height? Why?
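The near-duplicate anchors are expected: when every labelled box has almost the same size, k-means has nothing to separate, so all 9 centroids collapse onto nearly the same (w, h). A toy illustration in plain Python (standard Euclidean k-means, not darknet's IoU-distance variant implemented in C):

```python
import random

# Toy k-means on (w, h) box sizes: with objects of an almost fixed
# size, all centroids converge to nearly the same point, which is
# why calc_anchors prints 9 near-duplicate anchors.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: (p[0] - centers[j][0]) ** 2
                                + (p[1] - centers[j][1]) ** 2)
            clusters[i].append(p)
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

# ~9x9 px boxes with sub-pixel jitter, as in the dataset above
rng = random.Random(1)
boxes = [(9 + rng.uniform(-0.5, 0.5), 9 + rng.uniform(-0.5, 0.5))
         for _ in range(500)]
centers = kmeans(boxes, k=9)
spread = max(max(c) for c in centers) - min(min(c) for c in centers)
```

So duplicating one anchor value across all masks (rather than deleting yolo layers) is a reasonable reading of that output, though which edit is correct for this .cfg is a question for the maintainer.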

fogx commented 5 years ago

72 images is not enough; you need 2000+. The anchors are not your problem. It does not look like you need YOLO at all for this task: use classical object detection and image-processing algorithms instead.