AIWintermuteAI / aXeleRate

Keras-based framework for AI on the Edge
MIT License

Config Params (which architecture and backend?) #15

Closed THuffam closed 4 years ago

THuffam commented 4 years ago

Hi

I want to create a detector type model. I have a dataset with 1000 images and have created the annotation files for them all.

Within the dataset there are around 12 different labels. I intend to run this on a K210 board.

Which architecture should I use (I noticed in your example configs you use both MobileNet7_5 and TinyYolo)? Also do I need to use a 'backend' - I noticed when you use MobileNet7_5 you specify a backend of imagenet, but when you use TinyYolo the backend parameter is empty.

Also, I have been trying to understand anchors. I read the post you recommended (darknet..) but it doesn't make a lot of sense - surely it should use the boxes specified in the training annotation files? Also, in the article (and other articles I've read online) anchors are only mentioned as sets of x/y coordinates that define rectangles, yet in your config files they are just a series of single numbers - how do these relate to rectangles?

Many thanks Tim

AIWintermuteAI commented 4 years ago

Hello, Tim! 1000 samples for 12 labels might work - watch for overfitting though. If you see some of the labels have low mAP on evaluation, then you can try adding samples to these categories.

Also do I need to use a 'backend' - I noticed when you use MobileNet7_5 you specify a backend of imagenet, but when you use TinyYolo the backend parameter is empty.

You are probably referring to the "backend_weights" parameter - it determines which weights are loaded into the feature extractor (i.e. the "backend" of the network). There are no imagenet weights available for TinyYolo as of now, which is why the backend_weights parameter is empty with TinyYolo. One advanced technique that showed improvements in some of my projects is to train a classifier first, and then use that classifier's feature extractor as backend weights for the detector. For example, if your end goal is to train a dog detector, you could train a cat-dog classifier first (with the "save_bottleneck_weights" parameter enabled) and then use the bottleneck_weights.h5 file as backend_weights when training the dog detector. This is not necessary though, just an advanced training technique.
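To make the two-stage workflow above concrete, here is a hypothetical sketch of the relevant config fragments. The parameter names (backend_weights, save_bottleneck_weights) are taken from this thread; the exact key names and nesting in real aXeleRate config files may differ:

```python
# Hypothetical config fragments illustrating the two-stage training
# workflow described above. Exact aXeleRate config structure may differ;
# parameter names here come from this thread.

# Stage 1: train a cat-dog classifier and save its feature-extractor weights.
classifier_config = {
    "architecture": "MobileNet7_5",
    "labels": ["cat", "dog"],
    "backend_weights": "imagenet",
    "save_bottleneck_weights": True,   # produces bottleneck_weights.h5
}

# Stage 2: reuse those weights as the detector's backend.
detector_config = {
    "architecture": "MobileNet7_5",
    "labels": ["dog"],
    "backend_weights": "bottleneck_weights.h5",
}
```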

surely it should use the boxes specified in the training annotation files??

what would use boxes specified in the training annotation files?

only mention anchors as sets of x/y coordinates that define rectangles - yet in your config files they are just a series of single numbers - how do these relate to rectangles?

There is a lot of confusion about anchors for people starting out with object detectors, so let me compile a few explanations here.

Anchor boxes are the average aspect ratios (width/height) of objects in your dataset. There are 5 because different objects will have different common ratios: a person is tall and thin, while a train is much wider than it is tall. Think of anchor boxes as initial approximations for bounding box regression. It's nice to have better initial guesses, but the default ones provided should do just fine in most situations.

These are the default anchors used (obtained by running the K-means algorithm on bounding boxes from PASCAL-VOC):

[0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

There are 5 pairs of anchors; the first number is the width and the second is the height of the anchor, relative to the cell size:

0.57273, 0.677385 - anchor 1
1.87446, 2.06253 - anchor 2
and so on.

[image: a bounding box and its corresponding anchor overlaid on the cell grid]

Some more reading about anchors in the original keras-yolo2 implementation:

https://github.com/experiencor/keras-yolo2/issues/10
https://github.com/experiencor/keras-yolo2/issues/87
https://github.com/experiencor/keras-yolo2/issues/135

Cheers!
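The pairing described above can be sketched in a few lines - the flat list from the config folds into 5 (width, height) boxes, measured in grid-cell units:

```python
# The flat anchor list from the config pairs up into 5 (width, height)
# boxes, measured in grid-cell units.
anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

# Consecutive values pair into (width, height) tuples.
anchor_boxes = [(anchors[i], anchors[i + 1]) for i in range(0, len(anchors), 2)]

for w, h in anchor_boxes:
    print(f"width={w:<8} height={h:<8} aspect ratio={w / h:.2f}")
```

Note how the aspect ratios vary: the (7.88282, 3.52778) anchor is a wide box, while (3.33843, 5.47434) is a tall one.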

THuffam commented 4 years ago

Hi Thanks for your response.

Yeah, out of the 1000 images, I would say 90% are for the one class (shark) I am really interested in. I'm not sure if this will skew the results - should I remove all the others I am not interested in? My goal is to identify if a shark is in the water (I will use a drone to take a photo every second or so), but I do not want it to return a false positive for something like a swimmer, surfer or a dolphin etc.

Yes it was the backend weights I was referring to. I will test both MobileNet7_5 and TinyYolo.

Thanks for explaining the anchors parameter - I didn't realise it was a series of width/height pairs. I was asking why the software uses predefined anchors when surely it would be more accurate to look at the actual training data (the boxes specified in the annotation files) to determine the average/common box sizes and ratios? If not, would it be worth writing a program that does this and outputs the anchors so I could copy & paste them into the config file?

Thanks again Tim

AIWintermuteAI commented 4 years ago

Yes, if you're interested in detecting sharks, then the only class you need is "shark" :) To decrease the false positive rate you can add some pictures with swimmers, surfers and dolphins, but do not label these objects - e.g. a picture that has a dolphin in the water, but with an empty annotation file, because there is no shark in it.

TinyYolo most likely won't work unless you provide backend_weights for it - 1000 pictures is not enough to train from scratch. MobileNet7_5 is a better choice, since you can use the imagenet weights.

The anchors don't matter that much - they can speed up training, but won't make a difference if, for example, the data is bad. There is a script for automatic anchor generation here: https://github.com/experiencor/keras-yolo2/blob/master/gen_anchors.py
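For the intuition behind automatic anchor generation: the linked gen_anchors.py clusters the (width, height) pairs from the annotation boxes using K-means with a 1-IoU distance. A minimal sketch of the same idea, using plain Euclidean K-means for simplicity (the IoU distance in the real script usually gives better anchors), might look like this:

```python
import random

def kmeans_anchors(boxes, k=5, iters=50, seed=0):
    """Cluster (width, height) pairs from annotation boxes into k anchors.

    Minimal Euclidean K-means sketch; gen_anchors.py uses 1 - IoU as the
    distance metric instead, which is generally preferable for boxes.
    """
    random.seed(seed)
    centers = random.sample(boxes, k)
    for _ in range(iters):
        # Assign each box to its nearest center.
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            idx = min(range(k),
                      key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            clusters[idx].append((w, h))
        # Recompute each center as the mean of its cluster.
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)
```

The boxes would come from parsing your annotation files and scaling widths/heights to grid-cell units; the returned pairs can then be flattened into the anchors list in the config.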

THuffam commented 4 years ago

Fantastic! - That's very helpful - thank you!

MobileNet it is then.

Could I get away with just not specifying the extra labels in the config (while keeping my existing annotation files) - or do I need to go and change/recreate the annotation files?

Cheers Tim

AIWintermuteAI commented 4 years ago

As of now you need to specify all the labels present in the dataset - otherwise it will throw an error. Making the code ignore labels not present in the config is on my (long) TO-DO list - it would be nice to have, since it would let you train, for example, a dog detector from the PASCAL-VOC dataset easily by just leaving out all the other classes. Sadly I'm completely swamped with freelance work this summer and only have time for bug fixes and support :( Hopefully by the end of the summer I'll find time to add more features.
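In the meantime, one workaround is to rewrite the annotation files themselves so they only contain the classes you care about. Here is a hedged sketch (the "shark" label and the flat directory of PASCAL-VOC-style XML files are assumptions about your setup) - note it modifies files in place, so run it on a copy:

```python
# Sketch of a workaround: strip unwanted <object> entries from
# PASCAL-VOC-style annotation XML files, keeping only chosen labels.
# Modifies files in place - work on a copy of your annotation folder.
import xml.etree.ElementTree as ET
from pathlib import Path

def keep_labels(ann_dir, keep=frozenset({"shark"})):
    for xml_file in Path(ann_dir).glob("*.xml"):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        # findall() returns a list, so removing while iterating is safe.
        for obj in root.findall("object"):
            if obj.findtext("name") not in keep:
                root.remove(obj)
        # A file may end up with zero <object> tags - that doubles as a
        # negative sample (picture with no shark), as discussed above.
        tree.write(xml_file)
```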

THuffam commented 4 years ago

That's absolutely fine... just thought I should ask. Many thanks again for all your hard work. Cheers Tim