paraschopra / bayesian-neural-network-mnist

Bayesian neural network using Pyro and PyTorch on MNIST dataset
https://towardsdatascience.com/making-your-neural-network-say-i-dont-know-bayesian-nns-using-pyro-and-pytorch-b1c24e6ab8cd

Critical Error with Updated Dependencies #6

Open jbhatch opened 3 years ago

jbhatch commented 3 years ago

Using updated versions of the torch, torchvision, and pyro dependencies, an error (below) occurs in the SVI step: the event_dims of the model and guide disagree at site 'module$$$out.weight' (0 vs 1). Additionally, .independent() is deprecated, and it is recommended that it be replaced with .to_event() in the following line of the guide: "outw_prior = Normal(loc=outw_mu_param, scale=outw_sigma_param).independent(1)". Unfortunately, changing .independent(1) to .to_event(1) in the guide does not resolve the event_dims mismatch error.
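
For reference, the rename on its own is just the following (a minimal sketch of the guide line in question; it silences the deprecation warning but leaves the model's priors without declared event dimensions, so the mismatch remains):

    # deprecated form currently in the notebook:
    outw_prior = Normal(loc=outw_mu_param, scale=outw_sigma_param).independent(1)
    # drop-in replacement with identical semantics:
    outw_prior = Normal(loc=outw_mu_param, scale=outw_sigma_param).to_event(1)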

Please help. I would very much like to be able to use your Bayesian neural network script, since it can reject untrained classes in the test data based on predicted probability.

P.S. On a side note regarding a previously discussed issue, the MNIST data is inaccessible using torchvision's datasets.MNIST as used in bnn.ipynb. I checked other image datasets such as Fashion-MNIST, and that was readily available. Currently, the MNIST data can be obtained with this command: "!wget www.di.ens.fr/~lelarge/MNIST.tar.gz".
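
A rough sketch of how the downloaded archive could be wired into the existing loader (the extraction step and root path are my assumptions, not something verified in the notebook):

    from torchvision import datasets, transforms

    !wget www.di.ens.fr/~lelarge/MNIST.tar.gz
    !tar -zxvf MNIST.tar.gz
    # the archive unpacks to a MNIST/ folder containing raw/ and processed/,
    # so datasets.MNIST can read it from the current directory without downloading
    train_data = datasets.MNIST(root='.', train=True, download=False, transform=transforms.ToTensor())
    test_data = datasets.MNIST(root='.', train=False, download=False, transform=transforms.ToTensor())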

Error Messages:

/usr/local/lib/python3.7/dist-packages/pyro/primitives.py:451: FutureWarning: The random_module primitive is deprecated, and will be removed in a future release. Use pyro.nn.Module to create Bayesian modules from torch.nn.Module instances.
  "modules from torch.nn.Module instances.", FutureWarning)

ValueError                                Traceback (most recent call last)

in ()
      6 for batch_id, data in enumerate(train_loader):
      7     # calculate the loss and take a gradient step
----> 8     loss += svi.step(data[0].view(-1,28*28), data[1])
      9 normalizer_train = len(train_loader.dataset)
     10 total_epoch_loss_train = loss / normalizer_train

/usr/local/lib/python3.7/dist-packages/pyro/infer/svi.py in step(self, *args, **kwargs)
    126         # get loss and compute gradients
    127         with poutine.trace(param_only=True) as param_capture:
--> 128             loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
    129
    130         params = set(site["value"].unconstrained()

/usr/local/lib/python3.7/dist-packages/pyro/infer/trace_elbo.py in loss_and_grads(self, model, guide, *args, **kwargs)
    129         loss = 0.0
    130         # grab a trace from the generator
--> 131         for model_trace, guide_trace in self._get_traces(model, guide, args, kwargs):
    132             loss_particle, surrogate_loss_particle = self._differentiable_loss_particle(model_trace, guide_trace)
    133             loss += loss_particle / self.num_particles

/usr/local/lib/python3.7/dist-packages/pyro/infer/elbo.py in _get_traces(self, model, guide, args, kwargs)
    168         else:
    169             for i in range(self.num_particles):
--> 170                 yield self._get_trace(model, guide, args, kwargs)

/usr/local/lib/python3.7/dist-packages/pyro/infer/trace_elbo.py in _get_trace(self, model, guide, args, kwargs)
     56         """
     57         model_trace, guide_trace = get_importance_trace(
---> 58             "flat", self.max_plate_nesting, model, guide, args, kwargs)
     59         if is_validation_enabled():
     60             check_if_enumerated(guide_trace)

/usr/local/lib/python3.7/dist-packages/pyro/infer/enum.py in get_importance_trace(graph_type, max_plate_nesting, model, guide, args, kwargs, detach)
     48         graph_type=graph_type).get_trace(*args, **kwargs)
     49     if is_validation_enabled():
---> 50         check_model_guide_match(model_trace, guide_trace, max_plate_nesting)
     51
     52     guide_trace = prune_subsample_sites(guide_trace)

/usr/local/lib/python3.7/dist-packages/pyro/util.py in check_model_guide_match(model_trace, guide_trace, max_plate_nesting)
    252     if model_site["fn"].event_dim != guide_site["fn"].event_dim:
    253         raise ValueError("Model and guide event_dims disagree at site '{}': {} vs {}".format(
--> 254             name, model_site["fn"].event_dim, guide_site["fn"].event_dim))
    255
    256     if hasattr(model_site["fn"], "shape") and hasattr(guide_site["fn"], "shape"):

ValueError: Model and guide event_dims disagree at site 'module$$$out.weight': 0 vs 1
paraschopra commented 3 years ago

Sorry, I'm no longer actively maintaining this. If you have a fix, please send.

tgeller08 commented 3 years ago

I encountered a similar issue and am working on a fix. For now, a temporary solution is to install an earlier version of pyro in the notebook or virtual environment. For example, run !pip3 install pyro-ppl==1.4.0 instead.
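
If the older pyro version fails to import on its own, it may also be necessary to pin torch and torchvision to releases from the same period; the exact versions below are my guess and would need checking against pyro 1.4.0's requirements:

    !pip3 install torch==1.5.1 torchvision==0.6.1 pyro-ppl==1.4.0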

mehmetavnicelik commented 1 year ago

> I encountered a similar issue and am working on a fix. For now, a temporary solution is to install an earlier version of pyro in the notebook or virtual environment. For example, run !pip3 install pyro-ppl==1.4.0 instead.

I tried this but got an import error, probably because that version is deprecated. Did you find any other solution? Thanks in advance.

Sitheral commented 11 months ago

I've figured out a fix for this. You might try modifying the model and guide functions so that both declare the same event dimensions: .to_event(2) for the 2-D weight matrices and .to_event(1) for the 1-D bias vectors.

def model(x_data, y_data):
    fc1w_prior = Normal(loc=torch.zeros_like(net.fc1.weight), scale=torch.ones_like(net.fc1.weight)).to_event(2)
    fc1b_prior = Normal(loc=torch.zeros_like(net.fc1.bias), scale=torch.ones_like(net.fc1.bias)).to_event(1)

    outw_prior = Normal(loc=torch.zeros_like(net.out.weight), scale=torch.ones_like(net.out.weight)).to_event(2)
    outb_prior = Normal(loc=torch.zeros_like(net.out.bias), scale=torch.ones_like(net.out.bias)).to_event(1)

    priors = {'fc1.weight': fc1w_prior, 'fc1.bias': fc1b_prior, 'out.weight': outw_prior, 'out.bias': outb_prior}
    # lift module parameters to random variables sampled from the priors
    lifted_module = pyro.random_module("module", net, priors)
    # sample a regressor (which also samples w and b)
    lifted_reg_model = lifted_module()

    lhat = log_softmax(lifted_reg_model(x_data))

    pyro.sample("obs", Categorical(logits=lhat).to_event(1), obs=y_data)

def guide(x_data, y_data):
    # First layer weight distribution priors
    fc1w_mu = torch.randn_like(net.fc1.weight)
    fc1w_sigma = torch.randn_like(net.fc1.weight)
    fc1w_mu_param = pyro.param("fc1w_mu", fc1w_mu)
    fc1w_sigma_param = softplus(pyro.param("fc1w_sigma", fc1w_sigma))
    fc1w_prior = Normal(loc=fc1w_mu_param, scale=fc1w_sigma_param)
    # First layer bias distribution priors
    fc1b_mu = torch.randn_like(net.fc1.bias)
    fc1b_sigma = torch.randn_like(net.fc1.bias)
    fc1b_mu_param = pyro.param("fc1b_mu", fc1b_mu)
    fc1b_sigma_param = softplus(pyro.param("fc1b_sigma", fc1b_sigma))
    fc1b_prior = Normal(loc=fc1b_mu_param, scale=fc1b_sigma_param)
    # Output layer weight distribution priors
    outw_mu = torch.randn_like(net.out.weight)
    outw_sigma = torch.randn_like(net.out.weight)
    outw_mu_param = pyro.param("outw_mu", outw_mu)
    outw_sigma_param = softplus(pyro.param("outw_sigma", outw_sigma))
    outw_prior = Normal(loc=outw_mu_param, scale=outw_sigma_param)
    # Output layer bias distribution priors
    outb_mu = torch.randn_like(net.out.bias)
    outb_sigma = torch.randn_like(net.out.bias)
    outb_mu_param = pyro.param("outb_mu", outb_mu)
    outb_sigma_param = softplus(pyro.param("outb_sigma", outb_sigma))
    outb_prior = Normal(loc=outb_mu_param, scale=outb_sigma_param)
    # declare the same event dimensions as in the model so the shapes match
    priors = {'fc1.weight': fc1w_prior.to_event(2), 'fc1.bias': fc1b_prior.to_event(1), 'out.weight': outw_prior.to_event(2), 'out.bias': outb_prior.to_event(1)}

    lifted_module = pyro.random_module("module", net, priors)

    return lifted_module()
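
For completeness, here is a rough sketch, not taken from the notebook, of how the same network could be written with the pyro.nn API that the random_module deprecation warning points to. The hidden size, activation, and class name are my assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.optim import Adam

class BayesianMNIST(PyroModule):
    def __init__(self, in_dim=28 * 28, hidden=1024, out_dim=10):
        super().__init__()
        # each weight and bias becomes a latent variable with a standard normal prior
        self.fc1 = PyroModule[nn.Linear](in_dim, hidden)
        self.fc1.weight = PyroSample(dist.Normal(0., 1.).expand([hidden, in_dim]).to_event(2))
        self.fc1.bias = PyroSample(dist.Normal(0., 1.).expand([hidden]).to_event(1))
        self.out = PyroModule[nn.Linear](hidden, out_dim)
        self.out.weight = PyroSample(dist.Normal(0., 1.).expand([out_dim, hidden]).to_event(2))
        self.out.bias = PyroSample(dist.Normal(0., 1.).expand([out_dim]).to_event(1))

    def forward(self, x, y=None):
        logits = self.out(F.relu(self.fc1(x)))
        # one plate over the batch; observations are conditioned on y during training
        with pyro.plate("data", x.shape[0]):
            pyro.sample("obs", dist.Categorical(logits=logits), obs=y)
        return logits

model = BayesianMNIST()
guide = AutoDiagonalNormal(model)  # stands in for the hand-written guide
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
# inside the training loop: svi.step(images.view(-1, 28 * 28), labels)

With PyroModule the priors live on the module itself and an autoguide handles the variational parameters, so the event_dim bookkeeping stays in one place instead of being duplicated between model and guide.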