pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

Image training with fixed size #2522


beasnap commented 2 years ago

Hello everyone!

I have a question.

My images are fixed at 640x512. What is the best cfg configuration for training on these images? I am thinking about width, height, hue, jitter and random in the cfg.

Problem 1: width=640 height=640, or width=640 height=512?
Problem 2: random=1 or random=0?
Problem 3: hue=1.0, hue=0, or some other value?
Problem 4: jitter=.3 or some other value?

Please tell me what you think and why. Thank you.

Kjelldor commented 2 years ago

Usually all of these parameters are highly dependent on your particular use case. Model size, jitter, hue and randomization are all options that can improve accuracy if your dataset responds well to those augmentation steps.

So, in order to help you find the answers you need, I've formulated some questions you need to ask yourself.

Problem 1: I would recommend keeping the network dimensions square (e.g. 640x640, 512x512, etc.). In my personal testing this has usually given better results. However, depending on your hardware, running a 640x640 network might mean that you have to increase the subdivisions in order to keep it trainable (especially considering VRAM). If that's the case, I would err on the side of a 512x512 network with fewer subdivisions. A larger network is usually better, but so is having fewer subdivisions. In my tests, going below 64 (batch) / 12 (subdivisions) ≈ 5 images per mini-batch drastically decreased accuracy for object detection in general.
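
As a concrete sketch, the relevant lines sit in the `[net]` section of the cfg; the exact batch/subdivisions values below are placeholders you would tune to your own VRAM, not a recommendation from this thread:

```
[net]
# square input resolution; images are resized to this at load time
width=512
height=512
channels=3
batch=64
# effective mini-batch = batch / subdivisions;
# keep it at roughly 5 images or more per the note above
subdivisions=12
```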

Problem 2: Considering your dataset contains fixed-size images, the question you need to ask yourself here mostly has to do with the scale of the objects in the image. Is the object you want to detect always the same size as well? Then don't bother with random. It will just slow down training without giving you anything tangible to show for it. If the object does change size, using random can help with detecting outliers in scale.
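
Concretely, random is a per-layer flag in the `[yolo]` sections of the cfg; for fixed object sizes you would leave it off (a sketch, not the only valid choice):

```
[yolo]
# random=1 rescales the network to a random resolution every few batches;
# with fixed image and object sizes, random=0 skips that overhead
random=0
```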

Problem 3: I haven't had much experience with hue settings myself, but considering that the images are thermal, some hue randomization seems wise. How much is something you'll have to find out yourself.
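
If you do experiment with it, hue (alongside saturation and exposure) lives in the `[net]` section; the values below are just the defaults from the stock yolov3 cfg, not tuned recommendations:

```
[net]
# hue is a fraction of the color wheel (0 disables it);
# .1 is the stock yolov3 default - tune it for your thermal palette
hue=.1
saturation=1.5
exposure=1.5
```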

Problem 4: Considering that your aspect ratio is always fixed, I wouldn't bother with jitter at all. It will slow training down, because it randomizes the aspect ratios of the inputs, but that will never help find outliers in aspect ratio, since your ratios are always fixed.
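
In cfg terms, jitter is also set per `[yolo]` layer; with a fixed aspect ratio you could simply zero it out (again a sketch, assuming you want no aspect-ratio augmentation at all):

```
[yolo]
# jitter randomly crops/pads the input, changing its aspect ratio;
# 0 disables it, .3 is the usual default in the stock cfgs
jitter=0
```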

If you really want the best settings for your model, you'll still have to experiment quite a lot, but I hope this helps a bit.