Closed: poolio closed this issue 8 years ago
Thank you for your interest in the code. I believe this is due to batch normalisation not being implemented correctly at this stage (it is not covered in the article either). Have you tried running without batch normalisation? Batch normalisation needs to be weighted correctly in the unlabelled case; it's on my todo list.
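For readers hitting the same problem, here is a rough illustration of the two points above in plain NumPy: adding a small epsilon inside the square root avoids the division-by-zero that commonly produces NaNs, and weighting the labelled and unlabelled batch statistics is one way the normalisation could be balanced. This is a hedged sketch, not the repository's actual implementation; the function names, the alpha weighting, and the eps value are illustrative assumptions only.

import numpy as np

def batch_norm(x, eps=1e-6):
    """Normalise a batch to zero mean and unit variance.
    The small eps guards against division by zero, a common source of NaNs."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def weighted_batch_norm(x_lab, x_unlab, alpha=0.5, eps=1e-6):
    """Illustrative only: combine labelled and unlabelled batch statistics with a
    weighting factor alpha, so the unlabelled examples do not dominate the
    normalisation statistics."""
    mean = alpha * x_lab.mean(axis=0) + (1 - alpha) * x_unlab.mean(axis=0)
    var = alpha * x_lab.var(axis=0) + (1 - alpha) * x_unlab.var(axis=0)
    normalise = lambda x: (x - mean) / np.sqrt(var + eps)
    return normalise(x_lab), normalise(x_unlab)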
Convergence is much slower without batch norm, but that fixed the NaN issue. Thanks!
Thanks for putting this code up! Running the run_mnist.py script without any modifications, I get NaNs at epoch 133. Here's the relevant output:

Is this a known issue? Do I need to tweak the learning rates? Thanks!