Closed leyuan-sun closed 2 years ago
I looked into your variance further: it is a constant value after training. What I want to talk about is the variance of each pose, which is not a constant value, just as in the reference paper I mentioned above.
@rginjapan I think you are mixing up two types of variance. The one in the paper, the so-called "homoscedastic loss", is explained in more detail here. The one you want is the variance that describes the uncertainty of the inferences (network predictions); more info about this topic can be found here and here.
Regards, Arash
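To make the distinction concrete, here is a rough sketch in my own notation, assuming the paper's loss takes the usual learnable-weighting form for pose regression (the symbols below are assumptions, not copied from the paper). In the homoscedastic case the two scalars $s_x$ and $s_q$ are shared by every sample, so they settle to constants after training:

$$
\mathcal{L} = \mathcal{L}_x \, e^{-s_x} + s_x + \mathcal{L}_q \, e^{-s_q} + s_q
$$

Here $\mathcal{L}_x$ and $\mathcal{L}_q$ are the translation and rotation errors. In the heteroscedastic (per-prediction) case, the variance is instead a function of the input $x_i$, which is why it needs its own network output:

$$
\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\lVert y_i - f(x_i) \rVert^2}{2\,\sigma(x_i)^2} + \frac{1}{2} \log \sigma(x_i)^2 \right)
$$

The first form only balances the two loss terms; the second gives a separate uncertainty estimate for each pose.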
self.sx = torch.nn.Parameter(torch.tensor(sx, device=device, requires_grad=learn_hyper_params))
self.sq = torch.nn.Parameter(torch.tensor(sq, device=device, requires_grad=learn_hyper_params))
In the paper, you say these two parameters do not need to be set manually, but I found initial values for them in the code, and I do not understand how the network learns these variances. Commonly, people use a separate decoder to learn the variance, as in the paper "Unsupervised Balanced Covariance Learning for Visual-Inertial Sensor Fusion". Put simply, the network predicts not only the pose but also the variance, so it needs two decoders: one for pose regression and one for the variance. The same idea appears in "What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?" (similar issue: https://github.com/hmi88/what/issues/1).
So how can your network learn these two variances without another decoder?
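One way such parameters can be learned without a second decoder is sketched below. This is a minimal illustration, not the repository's actual code: the class name `HomoscedasticPoseLoss`, the 7-value pose layout, the L1 error terms, and the starting values are all assumptions. The point is that `sx` and `sq` are scalars owned by the loss module, and they receive gradients simply because they are handed to the optimizer together with the network weights.

```python
import torch


class HomoscedasticPoseLoss(torch.nn.Module):
    """Hypothetical sketch: two learnable scalars weight translation and rotation errors."""

    def __init__(self, sx=0.0, sq=0.0, learn_hyper_params=True):
        super().__init__()
        # sx and sq belong to the loss, not to the network; the initial
        # values here are arbitrary starting points for optimization.
        self.sx = torch.nn.Parameter(torch.tensor(sx), requires_grad=learn_hyper_params)
        self.sq = torch.nn.Parameter(torch.tensor(sq), requires_grad=learn_hyper_params)

    def forward(self, pred, target):
        # Assumed layout: 3 translation values followed by 4 quaternion values.
        loss_x = torch.nn.functional.l1_loss(pred[:, :3], target[:, :3])
        loss_q = torch.nn.functional.l1_loss(pred[:, 3:], target[:, 3:])
        return (torch.exp(-self.sx) * loss_x + self.sx
                + torch.exp(-self.sq) * loss_q + self.sq)


torch.manual_seed(0)
model = torch.nn.Linear(16, 7)  # stand-in for the pose regressor
criterion = HomoscedasticPoseLoss()

# Key step: the loss parameters are optimized alongside the network weights.
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(criterion.parameters()), lr=1e-2
)

features = torch.randn(32, 16)  # random placeholder inputs
poses = torch.randn(32, 7)      # random placeholder targets
sx_before = criterion.sx.item()
for _ in range(20):
    optimizer.zero_grad()
    loss = criterion(model(features), poses)
    loss.backward()
    optimizer.step()
sx_after = criterion.sx.item()
```

Because the same two scalars are used for every sample, they end up as constants after training. A per-pose variance would instead have to be a function of the input, which is where the extra decoder in the papers cited above comes in.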