Closed jrubin01 closed 5 years ago
Yes, it is due to a different sampling technique being applied. Please check back in 2-3 days; updated code will be pushed.
FWIW, this PR on the mirror repo fixes the NaN issue (on my local machine at least) in main_Bayes.py: https://github.com/felix-laumann/Bayesian_CNN/pull/7 (except replace math.log with torch.log).
The argument expected by logpdf and sample in the Normal distribution class is logvar, but the actual value passed is std_dev.
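To make the logvar vs. std_dev mismatch concrete, here is a minimal sketch in plain Python. It is not the repo's code: sample_from_logvar and logpdf_from_logvar are hypothetical stand-ins for the Normal class methods, assuming they treat their argument as a log-variance.

```python
import math
import random


def sample_from_logvar(mu, logvar, rng):
    """Reparameterised draw: mu + exp(logvar / 2) * eps, with eps ~ N(0, 1)."""
    std = math.exp(0.5 * logvar)
    return mu + std * rng.gauss(0.0, 1.0)


def logpdf_from_logvar(x, mu, logvar):
    """Log density of N(mu, exp(logvar)) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi) + logvar + (x - mu) ** 2 / math.exp(logvar))


intended_std = 0.1                             # the spread we actually want
correct_logvar = math.log(intended_std ** 2)   # what the functions expect

# Effective standard deviation used inside sample_from_logvar:
std_when_correct = math.exp(0.5 * correct_logvar)  # recovers the intended 0.1
std_when_mixed_up = math.exp(0.5 * intended_std)   # std passed as if it were logvar

rng = random.Random(0)
draw = sample_from_logvar(0.0, correct_logvar, rng)  # one draw with the right argument
```

With the right argument the effective std is the intended 0.1; passing the std itself as logvar gives an effective std slightly above 1, so the samples are roughly ten times wider than intended.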
Is the issue already solved? I also get Loss=nan when executing the main_Bayes.py script. I also run into CUDA error: out of memory. Does anyone else have the same issue? Thanks!
I believe I am also having the issue with nan loss. What versions of Torch and the other dependencies did you use in the development? Would it be possible to get something like a pip freeze or a conda list listing the versions that are guaranteed to work? Thank you in advance.
Conference proceedings and my being away for a couple of days have delayed things a bit. I am sorry, but I will update the code with requirements files in 2 weeks, along with new updates on uncertainty measures. Thank you.
Got the same nans. I think the problem is in the weight sampling: weight = self.weight.sample() in fcprobforward. I found that logvar in the sample() method of the Normal class in BBBdistributions.py becomes too big to store in memory. I managed to get rid of the nans by setting q_logvar_init in the BBBLinearFactorial class to a negative value (-5). This reduces the variance in fc_qw_std, which is set in self.fc_qw_std.data.fill_(self.q_logvar_init). Not sure yet whether this is the correct thing to do.
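A rough illustration of why a negative initial log-variance can help, in plain Python rather than the repo's code. It assumes the standard deviation is obtained as exp(logvar / 2); std_from_logvar is a hypothetical helper, and whether overflow is the actual mechanism in BBBdistributions.py is an assumption.

```python
import math


def std_from_logvar(logvar):
    """Standard deviation implied by a log-variance, assuming std = exp(logvar / 2)."""
    try:
        return math.exp(0.5 * logvar)
    except OverflowError:
        # Plain Python raises on overflow; float tensors would yield inf instead.
        return math.inf


small_std = std_from_logvar(-5.0)    # exp(-2.5), a tight posterior
unit_std = std_from_logvar(0.0)      # exactly 1.0
huge_std = std_from_logvar(2000.0)   # overflows to inf

# Once an inf appears, ordinary arithmetic can turn it into nan.
difference = huge_std - huge_std     # inf - inf
product = 0.0 * huge_std             # 0 * inf
```

Here -5 gives a std of about 0.08, while a very large logvar overflows to inf, and both inf - inf and 0 * inf evaluate to nan. That would be consistent with the nan loss disappearing once q_logvar_init is negative, but it does not show that -5 is the right value.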
I think I have hit the same problem: the KL divergence returned by main_Bayes.py is nan.
I think the nan problem is fixed. I have not fully verified it, though.
Hey, is the nan loss problem fixed? Saw the note in the readme, so just wondering if the current code is correct.
Code is fixed and up and running. Sorry for the delay.
Running either main_Bayes.py or Bayesian_CNN_Detailed.ipynb in the 'Image Recognition' folder gives nan losses, caused by the kl returned from net.probforward(x).
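For anyone trying to confirm which term goes bad, a small diagnostic sketch in plain Python. The helper first_non_finite and the example values are hypothetical; in the real scripts the terms would be converted to floats (for example with .item()) before being checked.

```python
import math


def first_non_finite(terms):
    """Return the name of the first term that is nan or inf, or None if all are finite."""
    for name, value in terms.items():
        if not math.isfinite(value):
            return name
    return None


# Hypothetical values for a single training step.
terms = {"likelihood": 2.31, "kl": float("nan")}
culprit = first_non_finite(terms)

if culprit is not None:
    print(f"non-finite term: {culprit}")
```

Checking the terms separately, before they are summed into the loss, shows whether the kl alone is responsible or whether the likelihood term is also affected.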