AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Classifying cylindric batteries #5997

Open hansvas opened 4 years ago

hansvas commented 4 years ago

I have tested different architectures to classify the company, brand, size*, and model of cylindrical batteries like the ones described here:

https://en.wikipedia.org/wiki/Electric_battery

My best results are with DenseNet: 50% to 65% correct classifications when size plays no role. It makes no real difference whether I train a model from scratch or use pretrained weights.*

Training always looks good to me. I'm sure the problem is that all objects differ only in texture but share the same form (a cylinder).

The images I have to classify are taken from a top view, then cropped and scaled, so objects of different sizes end up nearly the same size in the image.

This is the next problem: two or three different classes have nearly identical textures and differ only in size, and the size is not really detectable (the length/width ratio differs only slightly).
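One way to keep the length/width information that cropping and scaling discards is to record the aspect ratio of each crop before it is resized, and feed it to the classifier as an extra input. The following is only a minimal sketch under that assumption; the box format and the pixel values are hypothetical and not taken from the thread.

```python
def aspect_ratio(box):
    """Return long side / short side of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    return max(width, height) / min(width, height)


# Hypothetical pixel boxes for two cells with the same label design
# but different lengths (values are made up for illustration).
box_a = (0, 0, 100, 350)
box_b = (0, 0, 100, 450)

ratio_a = aspect_ratio(box_a)
ratio_b = aspect_ratio(box_b)
print(ratio_a, ratio_b)
```

Because the ratio is computed before resizing, it survives even when every crop is later scaled to the same network input size.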

How to do it better?

For each class I have a series of photos and a flat image (the texture of the cylinder) made with my own "line camera" for cylindrical objects (built from "fischertechnik").

[image: texture]

I use the flat image as a texture inside a small program, which renders a 3D model of a cylindrical battery at different angles, with different backgrounds.

[image: round]

(not the same as before)

So I can generate as many images as I want. Additionally, I have "natural" photos of the batteries available.

BTW: a net that could identify a texture (out of 500 different ones) when only part of the texture is given would also be fine, because the visible part of a battery is always a part of the flat image.

Any hints?
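For the partial-texture idea above, training samples could be cut directly from the flat images. The sketch below is a hedged illustration, not the renderer described in the post: it treats the flat texture as a list of pixel rows, wraps column indices around the cylinder seam, and ignores the foreshortening a real camera view would show. The function names and the `fraction=0.5` default (roughly the half of the cylinder facing the camera) are assumptions.

```python
import random


def visible_window(texture, start_col, fraction=0.5):
    """Cut a window of columns from a flat (unrolled) label texture.

    texture: list of rows, each row a list of pixel values.
    The label wraps around the cylinder, so column indices wrap
    modulo the texture width.
    """
    width = len(texture[0])
    n_cols = max(1, int(round(width * fraction)))
    cols = [(start_col + i) % width for i in range(n_cols)]
    return [[row[c] for c in cols] for row in texture]


def random_windows(texture, count, fraction=0.5, seed=0):
    """Return `count` windows starting at random columns."""
    rng = random.Random(seed)
    width = len(texture[0])
    return [
        visible_window(texture, rng.randrange(width), fraction)
        for _ in range(count)
    ]


# Tiny 2x8 stand-in for a real flat label image.
texture = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [10, 11, 12, 13, 14, 15, 16, 17],
]
window = visible_window(texture, start_col=6, fraction=0.5)
samples = random_windows(texture, count=5)
```

Each window keeps the label of the texture it was cut from, so a classifier trained on such windows sees every rotation of the battery, including views that straddle the seam.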

flowzen1337 commented 4 years ago

Hey @hansvas ,

I ran into a similar problem: I had 142 classes to train, and three of them (classes 0, 1, and 2) are very similar. Only very small details differ, but training couldn't pick up these details (I guess they are just too small), so it couldn't decide whether my object belongs to class 0, 1, or 2 and never learned the distinction. (The whole story is here: https://github.com/AlexeyAB/darknet/issues/4832 )

So, to keep it short: I guess you can use YOLO only to detect the battery itself, not to distinguish small details; for that you probably need some other technique (OCR maybe? Color segmentation?). (That's not a proven statement, just my experience so far and my personal conclusion from it, but maybe it helps before you keep chasing a ghost :-) )

AlexeyAB commented 4 years ago

@flowzen1337 You should just use a higher network resolution.

hansvas commented 4 years ago

@flowzen1337

I have no detection problem; it is a classification problem. I am trying DenseNet, ResNet, and similar architectures instead of YOLO.

OCRing the images would not help. Except for the manufacturer, the words on the batteries are very similar from class to class, and which part of the battery is visible always differs, depending on the side seen by the camera.*

I've also tried keypoint detection algorithms (SIFT/SURF), but only a kind of 3D SIFT works reasonably well, and it uses a huge amount of computing power.

But you pointed me in a direction that can help me a little. I can use YOLO to classify the size of the batteries (this works perfectly). Then I can group batteries of the same size just by texture, which reduces my classes from ~500 to ~100.

Thank you for this.

[image: segment100001]
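The two-stage idea described above (size first, then texture within one size group) can be sketched as a simple router. This is only an illustration under assumptions: `size_model` and the entries of `texture_models` are hypothetical callables standing in for a trained YOLO size classifier and per-size texture classifiers, and the size and class names are invented.

```python
def classify_two_stage(image, size_model, texture_models):
    """Predict the size group first, then pick the texture classifier
    trained only on the classes of that size group."""
    size = size_model(image)
    if size not in texture_models:
        raise KeyError(f"no texture classifier for size group {size!r}")
    return size, texture_models[size](image)


# Stub models so the sketch runs without any trained network.
def stub_size_model(image):
    return image["size"]


stub_texture_models = {
    "size_a": lambda image: "brand_x_size_a",
    "size_b": lambda image: "brand_x_size_b",
}

result = classify_two_stage(
    {"size": "size_b"}, stub_size_model, stub_texture_models
)
print(result)
```

Each texture classifier then only has to separate the roughly 100 designs within its group, and classes that differ only in size never compete with each other.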

flowzen1337 commented 4 years ago

> @flowzen1337 You should just use a higher network resolution.

I tried subdivisions=64 with network size width=608 height=608, but with no luck.

I'll try now with: width=864 height=864

I can't go higher; the next step, 896, already leads to an out-of-memory error.
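The sizes mentioned above follow from Darknet's requirement that `width` and `height` be multiples of 32. The helper below is a small sketch for stepping through valid sizes; the function names are made up for illustration. The pixel-count ratio is offered only as a rough proxy for how activation memory grows, which is an assumption rather than a measured figure.

```python
def is_valid_network_size(size, stride=32):
    """True if `size` is a positive multiple of the network stride."""
    return size > 0 and size % stride == 0


def next_network_size(size, stride=32):
    """Smallest multiple of `stride` strictly greater than `size`."""
    return (size // stride + 1) * stride


def relative_pixel_count(new_size, old_size):
    """Ratio of input pixels between two square network sizes."""
    return (new_size * new_size) / (old_size * old_size)


next_size = next_network_size(864)
growth = relative_pixel_count(864, 608)
print(next_size, round(growth, 2))
```

Going from 608 to 864 roughly doubles the number of input pixels, which is consistent with memory running out one step later.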

@hansvas

Ah OK, then I misunderstood, but it's good that I could still give you something to think about for a new approach :-)