Available loss functions in this repo

spaul13 commented 4 years ago

Can anyone please tell me which loss functions are available for training with the darknet repo, and how to change the loss function (specifically from cross-entropy to MSE loss)?

WongKinYiu commented 4 years ago

smooth l1
l1
l2
softmax with cross entropy
logistic with cross entropy
giou
diou
ciou ...

spaul13 commented 4 years ago

@WongKinYiu thanks a lot for the reply. How can I choose L2 loss during training with this repo? Can you please tell me what I have to change?

WongKinYiu commented 4 years ago

For example, if you want to change the softmax with cross-entropy loss of the softmax layer, do the following:

Replace softmax_x_ent_cpu at https://github.com/AlexeyAB/darknet/blob/master/src/softmax_layer.c#L72 with l2_cpu.

Replace softmax_x_ent_gpu at https://github.com/AlexeyAB/darknet/blob/master/src/softmax_layer.c#L110 with l2_gpu.

If you do not want your features to pass through the softmax layer, modify your cfg file and replace:

[softmax]
groups=1

to

[cost]
type=sse

spaul13 commented 4 years ago

@WongKinYiu thanks a lot for resolving my problem. I have one question about the difference between these two approaches. In the first, I change the loss function inside the softmax layer; in the second, the features never go through a softmax layer. Can you please explain how the loss calculation differs between the two? Thanks in advance.

WongKinYiu commented 4 years ago

First case: prediction = softmax(wx)
Second case: prediction = wx

In both cases: loss = l2(prediction, truth)
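To make the two cases concrete, here is a minimal Python sketch. It is not darknet code: `softmax` and `l2` are illustrative helpers, and the element-wise form used for `l2` (squared difference, with delta = truth - prediction) is an assumption about what an L2 loss typically computes, not a copy of `l2_cpu`.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def l2(pred, truth):
    # Assumed element-wise L2: error = (truth - pred)^2, delta = truth - pred.
    delta = [t - p for p, t in zip(pred, truth)]
    error = [d * d for d in delta]
    return error, delta

wx = [2.0, 0.5, -1.0]      # raw layer outputs (illustrative values)
truth = [1.0, 0.0, 0.0]    # one-hot target

# First case: the prediction passes through softmax before the L2 loss.
err_softmax, _ = l2(softmax(wx), truth)
# Second case: the L2 loss is applied to the raw outputs directly.
err_raw, _ = l2(wx, truth)

print(sum(err_softmax), sum(err_raw))
```

In the first case the predictions are confined to [0, 1] and sum to 1, so each squared error is at most 1. In the second case the raw outputs are unbounded and the network has to learn to output the target values directly.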

spaul13 commented 4 years ago

@WongKinYiu thanks a lot for resolving these doubts. One last question: in the output layer of my cfg file I have five 1x1 convolution filters for classifying 5 classes. For a single image, it outputs an array of size 5, with each entry giving a score (between 0 and 1) for the corresponding class.

For cross-entropy loss, it uses −(y log(p) + (1−y) log(1−p)) for each class, where p is the value taken from the output array.

I just want to know how the MSE loss is computed for 5-class classification. Is it (y−y')^2 for a single image? How is y' computed in a multi-class scenario when using MSE loss?

WongKinYiu commented 4 years ago

For MSE loss, single-class, multi-class, single-label, and multi-label are all handled the same way.
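A small sketch of what "the same" means here, assuming the loss is just an element-wise squared error between the 5 outputs and a 5-element truth vector. The helper name `squared_errors` and the numbers are made up for illustration. Whether an implementation sums or averages the terms is a detail this sketch does not settle; it sums them.

```python
def squared_errors(pred, truth):
    # One (y - y')^2 term per class; y' is simply the network output for that class.
    return [(t - p) ** 2 for p, t in zip(pred, truth)]

pred = [0.7, 0.1, 0.1, 0.05, 0.05]   # network outputs for 5 classes (illustrative)

single_label = [1, 0, 0, 0, 0]       # exactly one class is positive
multi_label = [1, 0, 1, 0, 0]        # two classes are positive

loss_single = sum(squared_errors(pred, single_label))
loss_multi = sum(squared_errors(pred, multi_label))

print(loss_single, loss_multi)
```

The same function handles both targets. Only the truth vector changes, and no special handling is needed for the number of classes or labels.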

spaul13 commented 4 years ago

@WongKinYiu can you please tell me the equation for the softmax with cross-entropy loss (as implemented in darknet) for multi-class classification?

WongKinYiu commented 4 years ago

CPU: https://github.com/AlexeyAB/darknet/blob/master/src/blas.c#L414-L423
GPU: https://github.com/AlexeyAB/darknet/blob/master/src/blas_kernels.cu#L1203-L1212
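For readers who want the equation spelled out, the textbook form of softmax with cross-entropy is p_i = exp(z_i) / sum_j exp(z_j) and loss = -sum_i t_i log(p_i). The Python sketch below implements that general form. It is not a transcription of the linked lines, which remain the reference for what darknet actually does, and the function names are illustrative.

```python
import math

def softmax(logits):
    # p_i = exp(z_i) / sum_j exp(z_j), computed stably.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, truth):
    # loss = -sum_i t_i * log(p_i); only classes with t_i > 0 contribute.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(truth, probs) if t > 0)

logits = [0.0, 0.0, 0.0, 0.0, 0.0]   # uniform scores over 5 classes
truth = [1, 0, 0, 0, 0]              # one-hot target

loss = softmax_cross_entropy(logits, truth)
print(loss)  # uniform prediction over 5 classes gives log(5)
```

With a one-hot target the sum reduces to -log(p) for the true class, so a confident correct prediction gives a loss near 0.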

spaul13 commented 4 years ago

Thanks a lot, @WongKinYiu. Your replies are helping me a lot. I have one more question about the loss. As my dataset is imbalanced, I want to use focal loss when training my classification model. Can you please tell me how to enable focal loss in the .cfg file?

WongKinYiu commented 4 years ago

Currently this function only supports detector training. For focal loss, take a look at https://github.com/AlexeyAB/darknet/blob/master/src/yolo_layer.c#L274-L292. For the per-class counters, take a look at https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L372-L393, and add the functions you need to classifier.c.
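As a rough guide to what would need porting, here is the general form of focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), as a Python sketch. This is the standard formulation, not darknet's code: the function name, the default `alpha` and `gamma` values, and the sample probabilities are illustrative assumptions.

```python
import math

def focal_loss(p_true, alpha=0.25, gamma=2.0):
    # p_true: predicted probability of the correct class, in (0, 1].
    # (1 - p_true)^gamma shrinks the loss of well-classified examples.
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

def cross_entropy(p_true):
    return -math.log(p_true)

easy, hard = 0.9, 0.1   # confident-correct vs. badly misclassified example

# Ratio of focal loss to plain cross-entropy for each example.
easy_ratio = focal_loss(easy) / cross_entropy(easy)
hard_ratio = focal_loss(hard) / cross_entropy(hard)

print(easy_ratio, hard_ratio)
```

With gamma = 0 and alpha = 1 the expression reduces to plain cross-entropy. Increasing gamma pushes the loss toward hard examples, which is why it is often used on imbalanced datasets.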

spaul13 commented 4 years ago

@WongKinYiu thanks a lot for the reply. Can you please tell me how and where to add counters_per_class in the .cfg file?

My last layer is [softmax] with groups=1.

I am attaching my cfg file. If you could tell me where I can add counters_per_class in it, that would be a great help.

WongKinYiu commented 4 years ago

You would need to modify the code of the loss function and the softmax layer to support it.

I also think the parser and the classifier need some modification.

spaul13 commented 4 years ago

@WongKinYiu I have been training a MobileNet model on my custom dataset and saved the model after 5K iterations. Since then, I have added some new labeled images to the dataset. Can I resume training from the last saved point?

darknet.exe classifier train data\mobilenetv2.data cfg\mobilenetv2.cfg backup_exp_mobilenetv2\mobilenetv2_last.weights -dontshow

@WongKinYiu can you please tell me whether training will still work if I add images to my dataset?

WongKinYiu commented 4 years ago

Yes, you can. But you should modify the load_weights_upto function to make sure the biases and weights are loaded into the correct classes. https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L1984

spaul13 commented 4 years ago

@WongKinYiu thanks a lot for the reply. Is the problem I just asked about similar to transfer learning?

And one last question: if I increase the batch size after some training iterations and then restart training from the last saved point, will training still work? Since training tries to reach the minimum of the loss function, can increasing the batch size affect that adversely?

WongKinYiu commented 4 years ago

Yes. Just to make sure I understand your problem: the original dataset contains classes a, b, c, your custom dataset contains classes a, b, c, d, e, and you want to continue training from the checkpoint.

Darknet does save_bias, then save_weights, so the original weights file format is: 3 biases, 3 weights. If you want to train 5 classes from the checkpoint, you should modify the load weights function to: load 3 biases, skip 2 biases, load 3 weights, and skip 2 weights.
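The load-and-skip order can be sketched in Python as follows. This is only an illustration of the idea, not darknet's loader: `load_partial` is a hypothetical helper, the checkpoint is modeled as a flat list of biases followed by weights, and it assumes the weights for each output class are stored contiguously. The skipped entries are left at 0.0 here as a stand-in for whatever initial values the new layer already holds.

```python
def load_partial(checkpoint, old_classes, new_classes, inputs):
    # Buffers for the new layer; 0.0 stands in for its existing initial values.
    biases = [0.0] * new_classes
    weights = [0.0] * (new_classes * inputs)

    # Load the old biases; the remaining (new_classes - old_classes) are skipped.
    biases[:old_classes] = checkpoint[:old_classes]

    # Load the old weights, which follow the biases in the file; skip the rest.
    n = old_classes * inputs
    weights[:n] = checkpoint[old_classes:old_classes + n]
    return biases, weights

# Checkpoint saved with 3 classes and 2 inputs per class: 3 biases, then 3x2 weights.
checkpoint = [0.1, 0.2, 0.3,  1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

biases, weights = load_partial(checkpoint, old_classes=3, new_classes=5, inputs=2)
print(biases)
print(weights)
```

Classes a, b, c keep their trained parameters, while d and e start from their initial values and are learned as training continues.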

No. The effect of increasing the batch size is similar to decreasing the learning rate.

akashAD98 commented 3 years ago

@WongKinYiu @AlexeyAB where is the code for focal loss, and how can I change its alpha value? I know we can enable focal loss in the cfg with focal_loss=1, but how can we set the alpha value, e.g. alpha=0,0.1,0.5,1,2,5?

Hope you can help me. Thanks in advance.

akashAD98 commented 3 years ago

https://github.com/AlexeyAB/darknet/blob/53160fa6662ea7e8c4095588db56cac7b339b917/src/yolo_layer.c#L192

Is this correct? Do I need to change this line to try different alpha values? @WongKinYiu

WongKinYiu commented 3 years ago

yes.

Available loss functions in this repo #5184