How to train the self-teaching framework, an extended question

kishore-greddy commented 3 years ago

I was reading your paper "On the uncertainty of self-supervised monocular depth estimation". Firstly, I would like to congratulate you on writing such a useful paper. Not much research has been done on the topic of uncertainty, and you have done it. Thank you for that. I have a question regarding the loss function in the self-teaching approach. Since the training code is not provided, I wanted to implement it myself in order to reproduce your results for my project. If my understanding is correct, I first use the pretrained model to generate pseudo ground truths, and then use them to train the student network. The student network has two output channels: one is the disparity and the other is the uncertainty. So, in the loss function,

1) Does \mu mean the disparity of the student network, and does \sigma mean the uncertainty generated at the second output channel?
2) Does the L1 difference operator take the pixel-wise difference between the teacher and the student, with the sum of all the pixel-wise losses then being minimized?

Please let me know. I am stuck at this point and it would be really helpful if you could shed some light on it.

Edit: I have already seen issue #2, but there was no real mention of the loss there. This question might be a simple one, but I am no expert in this field and have only just started working on this. Excuse any ignorance. Thanks.

mattpoggi commented 3 years ago

Hi,

1) \mu and \sigma are respectively the two output channels. They take the meaning of the estimated mean and the estimated variance of the depth distribution d_s predicted by the student.
2) Yes, the L1 loss is computed pixel-wise. Then you divide (still pixel-wise) by the variance and add the log term, which prevents the network from predicting an infinite variance. Finally, you compute the average over all pixels.

Hope this helps
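
For anyone who wants to try this, the steps above can be sketched roughly as follows. This is a minimal NumPy sketch and not the training code of the repository: the function name `self_teaching_loss`, the argument names, the symbol `d_t` for the teacher's pseudo ground truth, and the `eps` clamp are all assumptions made for illustration. Per pixel it computes |\mu - d_t| / \sigma + log(\sigma) and then averages over all pixels.

```python
import numpy as np


def self_teaching_loss(mu_s, sigma_s, d_t, eps=1e-6):
    """Hypothetical helper: uncertainty-weighted L1 loss between student and teacher.

    mu_s    -- student output channel 1 (estimated mean), any array shape
    sigma_s -- student output channel 2 (estimated variance/uncertainty), same shape
    d_t     -- teacher prediction used as pseudo ground truth, same shape
    eps     -- assumption: small clamp so we never divide by zero or take log(0)
    """
    mu_s = np.asarray(mu_s, dtype=np.float64)
    d_t = np.asarray(d_t, dtype=np.float64)
    sigma_s = np.maximum(np.asarray(sigma_s, dtype=np.float64), eps)

    # Step 1: pixel-wise L1 difference between student mean and teacher output.
    l1 = np.abs(mu_s - d_t)

    # Step 2: divide pixel-wise by the predicted uncertainty, then add the log
    # term, which penalizes the trivial solution of a very large sigma.
    per_pixel = l1 / sigma_s + np.log(sigma_s)

    # Step 3: average over all pixels.
    return per_pixel.mean()


# Toy 2x2 example: the student differs from the teacher in one pixel only.
mu = np.array([[1.0, 2.0], [3.0, 4.0]])
teacher = np.array([[1.0, 2.0], [3.0, 5.0]])

# With sigma = 1 everywhere the log term vanishes and the loss is the mean L1 error.
loss_unit_sigma = self_teaching_loss(mu, np.ones_like(mu), teacher)

# With sigma = 2 everywhere the L1 term is halved but log(2) is added per pixel.
loss_double_sigma = self_teaching_loss(mu, np.full_like(mu, 2.0), teacher)
```

In an actual training loop the same expression would be written with the tensor operations of your deep learning framework so that gradients flow to both output channels, while the teacher output is treated as a constant.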

kishore-greddy commented 3 years ago

Hi @mattpoggi, thanks for the explanation. It sure helps 👍
