pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

Can someone explain parameters in [net] of yolov3.cfg #1900

Open minyoungk99 opened 4 years ago

minyoungk99 commented 4 years ago

I put comments next to the ones I understand.

`[net]
# Testing
batch=64 # batch size
subdivisions=8 # ?
width=416 # input image width
height=416 # input image height
channels=3 # input channels, RGB
momentum=0.9 # gradient descent momentum?
decay=0.0005 # decay factor in gradient over iterations?
angle=0 # ?
# These 3 below are alternate ways to describe an image instead of RGB, but how does YOLO know whether the input image channels are RGB or these?
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001 # gradient descent learning rate
burn_in=1000 # ? do we skip weight updates for maybe the first 1000 epochs of training?
max_batches = 6000 # what the name says
policy=steps # ?
steps=4800,5400 # these are 80% and 90% of max_batches, but what does steps do?
scales=.1,.1 # ?`

Much thanks.

blackwool commented 4 years ago

subdivisions=8 # if the batch is too big to load into GPU memory, split the data into 8 parts and load them onto the GPU 8 times
angle=0 # image rotation angle
policy=steps # learning rate change mode: after 4800, learning_rate = learning_rate * scales(0); after 5400, learning_rate = learning_rate * scales(1)
scales=.1,.1 # learning rate change scale
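The `steps` policy described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment, not darknet's actual code: the function name `steps_learning_rate` and its parameter names are hypothetical, and it assumes each scale is applied cumulatively once the iteration count reaches the corresponding step.

```python
def steps_learning_rate(iteration, base_lr=0.001,
                        steps=(4800, 5400), scales=(0.1, 0.1)):
    """Sketch of the 'steps' policy (hypothetical helper, not a darknet API).

    Assumption: every scale whose step has been reached is multiplied
    into the learning rate, so the scales accumulate.
    """
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale  # apply this step's scale on top of earlier ones
    return lr


if __name__ == "__main__":
    # Print the learning rate before, between, and after the two steps.
    for it in (0, 4799, 4800, 5400, 5999):
        print(it, steps_learning_rate(it))
```

With the values from the cfg above, the rate stays at 0.001 until iteration 4800, drops by a factor of 10 there, and drops by another factor of 10 at iteration 5400.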

AlexeyAB commented 4 years ago

CFG Parameters in the [net] section: https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section

volgachen commented 4 years ago

subdivisions=8 # if the batch is too big to load into GPU memory, split the data into 8 parts and load them onto the GPU 8 times
angle=0 # image rotation angle
policy=steps # learning rate change mode: after 4800, learning_rate = learning_rate * scales(0); after 5400, learning_rate = learning_rate * scales(1)
scales=.1,.1 # learning rate change scale

Can I assume that 'subdivisions' doesn't affect training results and only influences GPU memory usage?

AlexeyAB commented 4 years ago

Lower subdivisions - higher accuracy and speed

declspec commented 4 years ago

@AlexeyAB What would be a better configuration when GPU memory is constrained, higher batch and higher subdivisions or lower batch and lower subdivisions?

i.e. If I can get the training running with batch=48 and subdivisions=16 or batch=24 and subdivisions=8 without my GPU running out of memory, which would be preferable?

CdAB63 commented 4 years ago

@AlexeyAB : angle=0: If I set a value for angle, is the image rotated randomly up to that value, or rotated by exactly that value?

@AlexeyAB : now the -map option creates a plot where I have 3 lines: the loss value, the mAP value, and then there's a line C (green) that is stuck at zero. What is this line?

@AlexeyAB : if I use adversarial_lr, what's the tip to set a reasonable value for it given a value of learning rate (let's say 0.001)?

AlexeyAB commented 4 years ago

@CdAB63

angle=0 - it can currently be applied only for ./darknet detector classifier, not for yolo

C - is a contrastive-learning accuracy for object tracking (it isn't used by default)

adversarial_lr=1.0 - use a higher value for a larger model and a smaller dataset (don't use it for tiny models)

joanna28-web commented 4 years ago

@AlexeyAB Why do lower subdivisions give higher accuracy? I don't understand what the subdivisions mean for the training. What I understood is that "batch" in the configuration files is the mini-batch size, that is, the number of pictures after which the weights are updated. However, you say here that subdivisions influence the accuracy, and in another issue you say that "minibatch=batch/subdivisions", which confuses me. Is there any chance you could explain? If in the end the mini-batch is batch/subdivisions, why do we have "batch" and "subdivisions" at all, and not just "mini-batch"?

AlexeyAB commented 4 years ago

mini_batch = batch / subdivisions
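As a worked example of the formula above, here is a small Python sketch applied to the configurations mentioned in this thread. The helper name `mini_batch_size` is hypothetical, and rejecting a `batch` that is not divisible by `subdivisions` is a choice made for this sketch, not documented darknet behavior.

```python
def mini_batch_size(batch, subdivisions):
    """Images processed on the GPU at once: batch / subdivisions.

    Hypothetical helper. Raising on a non-divisible batch is an
    assumption of this sketch, chosen to avoid silent rounding.
    """
    if subdivisions <= 0:
        raise ValueError("subdivisions must be positive")
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


if __name__ == "__main__":
    # (batch, subdivisions) pairs taken from the comments in this thread.
    for batch, subdivisions in [(64, 8), (32, 4), (48, 16), (24, 8)]:
        print(f"batch={batch} subdivisions={subdivisions} "
              f"mini_batch={mini_batch_size(batch, subdivisions)}")
```

Under this formula, batch=48 with subdivisions=16 and batch=24 with subdivisions=8 both put 3 images on the GPU at a time; the two settings differ only in the value of `batch` itself.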

joanna28-web commented 4 years ago

Thank you for your quick answer, @AlexeyAB. I am probably lacking some knowledge here, but could you explain the point of having both the "batch" and "subdivisions" parameters? Why is it recommended to set batch to 64? (I tried setting batch to 32 and subdivisions to 4, assuming that it would be the same as setting batch to 64 and subdivisions to 8, but this message appeared when training: " You set batch=32 lower than 64! It is recommended to set batch=64 subdivision=64").

ayazwani commented 3 years ago

Can I train with channels = 1, and if so, what values should I keep for hue, saturation, and exposure?

Chitti21 commented 3 years ago

@AlexeyAB Can you please suggest when to use random=0 or random=1? Is it necessary to use random=1 even when using mosaic augmentation?

RajathAV14 commented 3 years ago

@joanna28-web , earlier you asked why both batch and subdivision parameters are set (and not just one of them). Do you know the reason now?

rishishounak commented 3 years ago

@AlexeyAB @RajathAV14 @daokouer I am unable to start training with just 100 images. Is a minimum of 1000 images required for training? Is there a file that I need to change if I want to train on only 100 images?

rishishounak commented 3 years ago

@joanna28-web Can you explain how to train with only 100 images? I am running into an issue, and you seem to have tried the training yourself.