team-approx-bayes / dl-with-bayes

Contains code for the NeurIPS 2019 paper "Practical Deep Learning with Bayesian Principles"

AttributeError: 'NoneType' object has no attribute 'data' #2


jrodriguezpuigvert commented 4 years ago

Hello, I am experimenting with VOGN and I am getting this error:

  File "/home/thom/anaconda3/envs/TUT_3/lib/python3.7/site-packages/torchsso-0.1.1-py3.7.egg/torchsso/optim/vi.py", line 215, in step
    grads = [p.grad.data for p in params]
  File "/home/thom/anaconda3/envs/TUT_3/lib/python3.7/site-packages/torchsso-0.1.1-py3.7.egg/torchsso/optim/vi.py", line 215, in <listcomp>
    grads = [p.grad.data for p in params]
AttributeError: 'NoneType' object has no attribute 'data'

While debugging, I found that sometimes p.grad is None at this line: grads = [p.grad.data for p in params]

I didn't freeze any layer, I am learning from scratch.

There is probably an error in my configuration. Do you have any hints about what the problem could be?

Here is the initialization of the optimizer:

model_optimizer = torchsso.optim.VIOptimizer(
    self.decoder,
    dataset_size=300000,
    num_mc_samples=10,
    val_num_mc_samples=0,
    lr=1e-4,
    curv_type="Cov",
    curv_shapes={
        'Linear': 'Diag',
        'Conv2d': 'Diag',
        'BatchNorm1d': 'Diag',
        'BatchNorm2d': 'Diag'
    },
    grad_ema_decay=0.1,
    grad_ema_type="raw",
    kl_weighting=1,
    init_precision=8e-3,
    prior_variance=1,
    acc_steps=1,
    curv_kwargs={
        "damping": 0,
        "ema_decay": 0.001
    })
kazukiosawa commented 4 years ago

Thanks for sharing the error, and sorry for the late response. Could you share (the part of) your training script so that I can reproduce your error?

Since p.grad is None according to your error message, I suspect something is wrong with the definition of the closure.
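For reference, here is the usual closure pattern that populates p.grad before the optimizer reads it. This is only a runnable sketch: the model, data, and the stand-in SGD optimizer below are illustrative (torchsso's VIOptimizer is driven the same way via step(closure); its examples additionally have the closure return the model output alongside the loss, so check the repo's examples for the exact signature).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative model and batch; in the real script these come from your setup.
model = nn.Linear(4, 2)
data = torch.randn(8, 4)
target = torch.randint(0, 2, (8,))

# SGD stands in here so the sketch runs; VIOptimizer uses the same pattern.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

def closure():
    optimizer.zero_grad()
    output = model(data)                    # forward pass should touch every parameter
    loss = F.cross_entropy(output, target)
    loss.backward()                         # fills p.grad for all parameters used above
    return loss

loss = optimizer.step(closure)
print(loss.item())
```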

jrodriguezpuigvert commented 4 years ago

I already found the bug. The forward function must use all parameters of the model, otherwise some p.grad will be None after calling backward. That could be a problem for anyone who wants to use it for fine-tuning. Other optimizers like Adam are more flexible in that case.
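To make the failure mode concrete, here is a minimal, self-contained sketch (the model and layer names are made up) showing that a parameter never touched in forward() keeps p.grad is None after backward, which is exactly what crashes the grads = [p.grad.data for p in params] line in torchsso's VIOptimizer.step:

```python
import torch
import torch.nn as nn

# Model with an extra layer that the forward pass never touches.
class PartialModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # never called in forward()

    def forward(self, x):
        return self.used(x)  # self.unused receives no gradient

model = PartialModel()
model(torch.randn(3, 4)).sum().backward()

params = list(model.parameters())
# The weight and bias of `unused` still have p.grad is None, so
# [p.grad.data for p in params] would raise AttributeError here.
none_grads = [p for p in params if p.grad is None]
print(len(none_grads))  # → 2
```

Adam tolerates this because it simply skips parameters whose grad is None, whereas VIOptimizer assumes every parameter it manages has a gradient.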

emtiyaz commented 4 years ago

Wonderful! Many thanks. Kazuki can fix this issue then and close it.