Some questions - Githubissues

Deadmin1 commented 5 years ago

Hey Alexey and community,

I have some questions. I'm training a tiny-yolo for cone detection on a racetrack. Of course there are many cones, and the ones that are far away are small. I got it working fairly well. I changed my test set a little by marking more of the small cones so that they are recognized too, but this is increasing false detections.

The normal YOLOv3 is sadly too slow on the hardware we intend to use, which is why I am trying the tiny version.

[Image: cone-detection]

I changed the height and width to 706 and recalculated the anchors with 10 clusters. This is training at the moment and I hope it does better. I know this makes training slower and more resource-hungry.

- Does this also make the recognition slower?
- Which parameters should I tune to recognize many small items?

I also tried the XNOR one, but it gives me really bad results compared to training without XNOR, and I don't know why.

Also, any tips are appreciated.

AlexeyAB commented 5 years ago

@Deadmin1 Hi,

> I changed the height and width to 706 and recalculated the anchors with 10 clusters.

width and height must be a multiple of 32, so you can't use 706. Use width=704 height=704.
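The multiple-of-32 rule can be checked before editing the cfg. The following is only a minimal sketch: the helper name `nearest_valid_size` is made up for illustration and is not part of darknet.

```python
def nearest_valid_size(requested, multiple=32):
    """Round a requested width/height to the nearest multiple of `multiple`.

    Hypothetical helper, not a darknet function. Ties round down, and the
    result is never smaller than one full multiple.
    """
    lower = (requested // multiple) * multiple
    upper = lower + multiple
    nearest = lower if requested - lower <= upper - requested else upper
    return max(multiple, nearest)


# 706 is not divisible by 32, so the closest usable value is 704.
size = nearest_valid_size(706)
print(f"width={size} height={size}")
```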

> Does this also make the recognition slower?

Yes. Higher resolution gives higher accuracy for small objects, but slower training and detection.
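As a rough back-of-the-envelope estimate, the work in a fully convolutional network grows about linearly with the number of input pixels. The sketch below assumes a baseline of 416x416 (the size in the stock yolov3-tiny.cfg) and ignores memory bandwidth and other hardware effects, so treat the number as an approximation only.

```python
def relative_cost(new_size, base_size):
    """Approximate compute ratio between two square input sizes.

    Assumption: cost scales with pixel count, i.e. with the square of the side.
    """
    return (new_size / base_size) ** 2


# Going from an assumed 416x416 baseline to 704x704.
ratio = relative_cost(704, 416)
print(f"roughly {ratio:.2f}x the computation per image")
```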

> Which parameters should I tune to recognize many small items?

I just added several models for small objects. So try to use this Tiny model with 3 [yolo]-layers, and set width=704 height=704 before training: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny_3l.cfg

Also recalculate anchors:

darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 704 -height 704

Change anchors in all 3 [yolo] layers.
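Updating every `anchors` line by hand is easy to get wrong, so it can be scripted. This is a minimal sketch that works on the cfg text as a string. The helper name `replace_anchors` and the anchor values are placeholders made up for illustration; use the values that calc_anchors prints for your own dataset.

```python
import re


def replace_anchors(cfg_text, new_anchors):
    """Rewrite every `anchors = ...` line in a darknet cfg string.

    Hypothetical helper. Returns the updated text and the number of lines
    changed, which should be 3 for a model with 3 [yolo] layers.
    """
    anchors_value = ", ".join(f"{w},{h}" for w, h in new_anchors)
    pattern = re.compile(r"^anchors\s*=.*$", flags=re.MULTILINE)
    return pattern.subn(lambda _match: f"anchors = {anchors_value}", cfg_text)


# Tiny stand-in for a real cfg; the anchor numbers are made up.
sample_cfg = (
    "[yolo]\nmask = 6,7,8\nanchors = 1,2, 3,4\n"
    "[yolo]\nmask = 3,4,5\nanchors = 1,2, 3,4\n"
    "[yolo]\nmask = 0,1,2\nanchors = 1,2, 3,4\n"
)
updated_cfg, changed = replace_anchors(sample_cfg, [(5, 6), (7, 8)])
print(changed, "anchors lines updated")
```

The `mask` lines are left untouched, since they only select which of the anchors each [yolo] layer uses.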

And train it by using pre-trained weights file yolov3-tiny.conv.15: https://github.com/AlexeyAB/darknet#how-to-train-tiny-yolo-to-detect-your-custom-objects
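The linked README section describes two steps: extract the first 15 layers of the stock tiny weights, then train with that file. The sketch below only builds the two argument lists. The function name and the `data/obj.data` and `cfg/yolov3-tiny_3l.cfg` paths are assumptions for illustration, so adjust them to your own layout.

```python
def tiny_training_commands(data_file, train_cfg, darknet="darknet.exe"):
    """Build the command lines for extracting yolov3-tiny.conv.15 and training.

    Hypothetical helper; it only assembles arguments and runs nothing.
    """
    extract = [
        darknet, "partial",
        "cfg/yolov3-tiny.cfg", "yolov3-tiny.weights",
        "yolov3-tiny.conv.15", "15",
    ]
    train = [darknet, "detector", "train", data_file, train_cfg, "yolov3-tiny.conv.15"]
    return extract, train


# Assumed paths, shown only as an example.
extract_cmd, train_cmd = tiny_training_commands("data/obj.data", "cfg/yolov3-tiny_3l.cfg")
print(" ".join(extract_cmd))
print(" ".join(train_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` if you want to launch the steps from a script.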

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

for training for both small and large objects use modified models:

Full-model: 5 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3_5l.cfg

Tiny-model: 3 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny_3l.cfg

Spatial-full-model: 3 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-spp.cfg

Deadmin1 commented 5 years ago

Thank you for your help. This project is amazing, and your quick response and help are so impressive.

I'm training the tiny model with 3 yolo layers right now. If you want, I can give you a response on how this is working.

As for my question, the 706 height and width was a typo. I used 704 like you said. I also increased the hue, saturation and exposure a little bit.

AlexeyAB commented 5 years ago

> If you want, I can give you a response on how this is working.

Yes, it might be good.

Then try to check mAP on the trained models, the old yolov3-tiny and the new yolov3-tiny-3l, on the validation dataset (or on the training dataset if you don't have a validation dataset).
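For readers unfamiliar with the metric, here is a generic sketch of average precision for a single class, computed as the area under the precision-recall curve. It is not darknet's implementation, which may interpolate the curve differently, and it assumes the detections have already been matched to ground truth at some IoU threshold.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class.

    `is_true_positive` lists one bool per detection, sorted by descending
    confidence. Generic sketch, not darknet's exact procedure.
    """
    if num_ground_truth == 0:
        return 0.0
    true_pos = 0
    precisions, recalls = [], []
    for seen, hit in enumerate(is_true_positive, start=1):
        true_pos += 1 if hit else 0
        precisions.append(true_pos / seen)
        recalls.append(true_pos / num_ground_truth)
    # Make precision non-increasing as recall grows (the usual envelope step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Three detections (hit, miss, hit) against two ground-truth cones.
print(average_precision([True, False, True], num_ground_truth=2))
```

mAP is then the mean of this value over all classes, which makes it a single number for comparing the two trained models on the same dataset.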

kmsravindra commented 5 years ago

> https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
>
> for training for both small and large objects use modified models:
>
> Full-model: 5 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3_5l.cfg
>
> Tiny-model: 3 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny_3l.cfg
>
> Spatial-full-model: 3 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-spp.cfg

@AlexeyAB , Thanks for adding these models.

So far I have been using yolov3.cfg.

Could you please give me some sense of how yolov3_5l or yolov3-spp differs from yolov3.cfg, mainly in terms of:

- inference time on a live HD video stream
- mAP scores

How can I use yolov3_5l.cfg with the darknet53.conv.74 pre-trained file? Do I need to use the darknet partial command? Could you let me know the exact command, assuming I use the same network size for all three variants?

In addition to the above performance differences, I have some questions related to the differences between these model architectures:

- Is the difference between yolov3.cfg and yolov3_5l mainly that the output is taken at 3 places versus 5 places (and averaged), while both are architecturally the same network?
- Could you explain the difference between yolov3.cfg and yolov3-spp.cfg in similar terms?

Thanks a lot!

AlexeyAB commented 5 years ago

@kmsravindra https://pjreddie.com/darknet/yolo/

YOLOv3-608 - 57.9% mAP
YOLOv3-spp - 60.6% mAP

yolov3_5l.cfg is for both very small and very large objects, e.g. points in the sky and a full-screen face.

Use darknet53.conv.74 for yolov3_5l.cfg as usual.

yolov3_5l is slower than yolov3.cfg. The backbone architecture is the same. yolov3_5l takes results from 5 places and concatenates them (it does not average them), while yolov3 uses only 3 places.
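One way to picture "concatenates, not averages" is to count the candidate boxes each [yolo] layer contributes, since they are all pooled into one list. The sketch below assumes 3 anchors per layer, strides of 32, 16 and 8 for yolov3, and two extra layers at strides 64 and 4 for yolov3_5l. The 5-layer strides are an assumption here, so check the cfg before relying on them.

```python
def candidate_boxes(input_size, strides, anchors_per_head=3):
    """Return per-layer and total candidate box counts for a square input.

    Each [yolo] layer predicts `anchors_per_head` boxes per grid cell, and
    the layers' outputs are pooled together rather than averaged.
    """
    per_head = [(input_size // stride) ** 2 * anchors_per_head for stride in strides]
    return per_head, sum(per_head)


three_heads = candidate_boxes(704, strides=(32, 16, 8))
# Assumed strides for the 5-layer model, for illustration only.
five_heads = candidate_boxes(704, strides=(64, 32, 16, 8, 4))
print("3 layers:", three_heads[1], "boxes; 5 layers:", five_heads[1], "boxes")
```

Under these assumptions the finest layer dominates the count, which is consistent with the 5-layer model being better for very small objects and also slower.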

yolov2 - uses maxpool subsampling
yolov3 - uses convolutional stride=2 subsampling
yolov3-spp - uses both maxpool subsampling and convolutional stride=2 subsampling, and concatenates them
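Both kinds of stride-2 subsampling listed above halve the spatial resolution; they differ in whether the reduction is a fixed maximum or a learned filter. The sketch uses the generic output-size formula with typical kernel and padding values, which are assumptions and not darknet's exact padding convention.

```python
def output_size(input_size, kernel, stride, padding):
    """Generic output size of a pooling or convolution layer along one axis."""
    return (input_size + 2 * padding - kernel) // stride + 1


# 2x2 max pooling with stride 2 (typical values, assumed for illustration).
pooled = output_size(704, kernel=2, stride=2, padding=0)
# 3x3 convolution with stride 2 and padding 1 (typical values, assumed).
convolved = output_size(704, kernel=3, stride=2, padding=1)
print(pooled, convolved)
```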

AlexeyAB / darknet

Some questions #2144