yaringal / multi-task-learning-example

A multi-task learning example for the paper https://arxiv.org/abs/1705.07115
MIT License
838 stars 205 forks

Merged loss regularization term is $\log\sigma$ in the paper but $\log\sigma^2$ in the code #4

Open songzeballboy opened 5 years ago

JadTawil-theonly commented 4 years ago

I see that as well; shouldn't it be `(precision**2) / 2` instead of just `precision`?

antgr commented 4 years ago

So, which one is correct? Did you try them?

knighthappy commented 4 years ago

If `precision = K.exp(-log_var[0])`, then the network learns $\log\sigma^2$, and `precision * (y_true - y_pred)**2. + log_var[0]` is $\frac{1}{\sigma^2} L(w) + 2\log\sigma$.

If `precision = K.exp(-2 * log_var[0]) / 2`, then the network learns $\log\sigma$, and `precision * (y_true - y_pred)**2. + log_var[0]` is $\frac{1}{2\sigma^2} L(w) + \log\sigma$.

The two formulations differ only by a constant factor of 2 in the loss, so for network training they are equivalent: they have the same minimizer.
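A small numerical sketch of the equivalence claimed above, using numpy instead of the Keras backend (function names here are illustrative, not from the repo). It evaluates both loss formulations at the same underlying $\sigma$ and checks that the "learn $\log\sigma^2$" loss is exactly twice the "learn $\log\sigma$" loss:

```python
import numpy as np

def loss_logvar(log_var, sq_err):
    # Repo formulation: the network predicts s = log(sigma^2).
    # exp(-s) * L + s  =  L / sigma^2 + 2*log(sigma)
    return np.exp(-log_var) * sq_err + log_var

def loss_logsigma(log_sigma, sq_err):
    # Paper formulation: the network predicts s = log(sigma).
    # exp(-2s)/2 * L + s  =  L / (2*sigma^2) + log(sigma)
    return 0.5 * np.exp(-2.0 * log_sigma) * sq_err + log_sigma

sigma = 1.7          # arbitrary noise scale
sq_err = 0.9         # arbitrary squared error L(w)

a = loss_logvar(np.log(sigma ** 2), sq_err)
b = loss_logsigma(np.log(sigma), sq_err)

# Same sigma, loss differs by exactly the constant factor 2,
# so gradients (and the optimum) agree up to overall scale.
assert np.isclose(a, 2.0 * b)
```

Since scaling a loss by a constant only rescales its gradients, both versions drive $\sigma^2$ toward the same optimum ($\sigma^2 = L(w)$ for a fixed squared error), which is why training behaves the same either way.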