Mazin-Hnewa / MS-DAYOLO

Multiscale Domain Adaptive YOLO for Cross-Domain Object Detection
MIT License

Loss function for domain classifier different in the implementation #9

Open supersodic opened 1 year ago

supersodic commented 1 year ago

Hello,

I have a question regarding the loss function in dc_layer.c. Why do you use l.delta[i]=(l.d_truth[i]-l.output[i])/size; instead of l.delta[i]=(l.d_truth[i]*log(l.output[i]))+((1-l.d_truth[i])*log(1-l.output[i]))? Is this an approximation, or what is the reasoning behind this choice of l.delta for the domain classifier?

Thank you in advance for your answer!

Mazin-Hnewa commented 1 year ago

l.delta is used to compute the gradient of the loss that is passed back to the previous layers; it is not the loss itself. Please see these notes, which explain how to calculate the gradient of the binary cross-entropy loss with a logistic activation: https://www.ics.uci.edu/~pjsadows/notes.pdf Also, please note that we flip the sign to minimize the loss.