Open rezaBarzgar opened 7 months ago
I highly recommend setting a debug breakpoint on loss = svi.step(x_train, y_train) and stepping into svi.step to observe how the loss value is calculated and how the learning procedure works.
import time
import json
import re

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import optim
from torch.distributions import Normal
from torch.nn.functional import leaky_relu
from torch.optim.lr_scheduler import StepLR, ReduceLROnPlateau
from torch.utils.data import DataLoader

import pyro
import pyro.distributions as dist
from pyro.infer import MCMC, NUTS, Predictive, SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.nn import PyroModule, PyroSample
from tqdm.auto import trange


class BNN(PyroModule):
    def __init__(self, in_dim=1, out_dim=1, hid_dim=10, n_hid_layers=5, prior_scale=5.):
        super().__init__()

        self.activation = nn.Tanh()  # could also be ReLU or LeakyReLU
        assert in_dim > 0 and out_dim > 0 and hid_dim > 0 and n_hid_layers > 0  # make sure the dimensions are valid

        # Define the layer sizes and the PyroModule layer list
        self.layer_sizes = [in_dim] + n_hid_layers * [hid_dim] + [out_dim]
        layer_list = [PyroModule[nn.Linear](self.layer_sizes[idx - 1], self.layer_sizes[idx]) for idx in
                      range(1, len(self.layer_sizes))]
        self.layers = PyroModule[torch.nn.ModuleList](layer_list)

        for layer_idx, layer in enumerate(self.layers):
            layer.weight = PyroSample(dist.Normal(0., prior_scale * np.sqrt(2 / self.layer_sizes[layer_idx])).expand(
                [self.layer_sizes[layer_idx + 1], self.layer_sizes[layer_idx]]).to_event(2))
            layer.bias = PyroSample(dist.Normal(0., prior_scale).expand([self.layer_sizes[layer_idx + 1]]).to_event(1))

    def forward(self, x, y=None):
        x = x.reshape(-1, 1)
        x = self.activation(self.layers[0](x))  # input --> hidden
        for layer in self.layers[1:-1]:
            x = self.activation(layer(x))  # hidden --> hidden
        mu = self.layers[-1](x).squeeze()  # hidden --> output
        sigma = pyro.sample("sigma", dist.Gamma(.5, 1))  # infer the response noise

        with pyro.plate("data", x.shape[0]):
            obs = pyro.sample("obs", dist.Normal(mu, sigma * sigma), obs=y)
        return mu


if __name__ == '__main__':
    # Set random seed for reproducibility
    np.random.seed(42)

    # Generate data
    x_obs = np.hstack([np.linspace(-0.2, 0.2, 500), np.linspace(0.6, 1, 500)])
    noise = 0.02 * np.random.randn(x_obs.shape[0])
    y_obs = x_obs + 0.3 * np.sin(2 * np.pi * (x_obs + noise)) + 0.3 * np.sin(4 * np.pi * (x_obs + noise)) + noise

    x_true = np.linspace(-0.5, 1.5, 1000)
    y_true = x_true + 0.3 * np.sin(2 * np.pi * x_true) + 0.3 * np.sin(4 * np.pi * x_true)

    # Convert data to PyTorch tensors
    x_train = torch.from_numpy(x_obs).float()
    y_train = torch.from_numpy(y_obs).float()

    xlims = [-0.5, 1.5]
    ylims = [-1.5, 2.5]

    # Reset Pyro's global parameter store once, before building the model and guide
    pyro.clear_param_store()
    model = BNN(hid_dim=10, n_hid_layers=1, prior_scale=5.)
    mean_field_guide = AutoDiagonalNormal(model)
    optimizer = pyro.optim.Adam({"lr": 0.01})
    svi = SVI(model, mean_field_guide, optimizer, loss=Trace_ELBO())

    num_epochs = 250
    progress_bar = trange(num_epochs)
    for epoch in progress_bar:
        # One gradient step on the negative ELBO; returns the loss estimate
        loss = svi.step(x_train, y_train)
        progress_bar.set_postfix(loss=f"{loss / x_train.shape[0]:.3f}")

    # Draw posterior predictive samples from the trained guide
    predictive = Predictive(model, guide=mean_field_guide, num_samples=500)
    x_test = torch.linspace(xlims[0], xlims[1], 3000)
    preds = predictive(x_test)
The following snippets are part of a larger piece of Pyro code for variational inference; they appear to come from the Trace_ELBO loss used above. The code computes the Evidence Lower Bound (ELBO) and the surrogate ELBO for a given model and guide.
Here's a breakdown:
Loop over model nodes:
for name, site in model_trace.nodes.items():
    if site["type"] == "sample":
        # Accumulate log probabilities for each sample site in the model
        elbo_particle = elbo_particle + torch_item(site["log_prob_sum"])
        surrogate_elbo_particle = surrogate_elbo_particle + site["log_prob_sum"]
This loop iterates over the nodes in the trace (model_trace) of the probabilistic model. For each node that corresponds to a sample statement in the model (latent or observed), it adds that site's log probability (log_prob_sum) to both elbo_particle and surrogate_elbo_particle. The former accumulates a plain Python number via torch_item, which is what gets reported as the loss; the latter keeps the tensor so that gradients can flow through it.
Loop over guide nodes:
for name, site in guide_trace.nodes.items():
    if site["type"] == "sample":
        # Unpack the per-site terms used below (this line is missing from the
        # snippet as pasted; it is present in Pyro's Trace_ELBO source)
        log_prob, score_function_term, entropy_term = site["score_parts"]
        # Subtract the guide's log probability at each sample site
        elbo_particle = elbo_particle - torch_item(site["log_prob_sum"])
        # If entropy term is non-zero, subtract it from surrogate ELBO
        if not is_identically_zero(entropy_term):
            surrogate_elbo_particle = surrogate_elbo_particle - entropy_term.sum()
        # If score function term is non-zero, add it to surrogate ELBO
        if not is_identically_zero(score_function_term):
            if log_r is None:
                log_r = _compute_log_r(model_trace, guide_trace)
            site = log_r.sum_to(site["cond_indep_stack"])
            surrogate_elbo_particle = surrogate_elbo_particle + (site * score_function_term).sum()
This loop is similar to the previous one but iterates over the nodes in the trace (guide_trace) of the guide (the variational distribution). It adjusts elbo_particle and surrogate_elbo_particle based on the log probabilities, entropy terms, and score function terms associated with each sample site in the guide.
Return the negative ELBO and surrogate ELBO:
return -elbo_particle, -surrogate_elbo_particle
The final result is the negative of the accumulated ELBO and surrogate ELBO for the given model and guide. This form is common in variational inference, where the goal is often to maximize the ELBO, but optimization algorithms typically minimize the objective function.
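To make the two loops concrete, here is a minimal pure-Python sketch of the same bookkeeping on a toy problem. It is not Pyro code: the model (z ~ Normal(0, 1), x ~ Normal(z, 1) with x observed), the guide (z ~ Normal(guide_loc, guide_scale)), and the function names normal_log_prob, elbo_particle, and negative_elbo are all assumptions chosen for illustration. The model side sums the log probability of the latent site and of the observed site; the guide side is subtracted.

```python
import math
import random


def normal_log_prob(value, loc, scale):
    """Log-density of Normal(loc, scale) evaluated at value."""
    var = scale * scale
    return -0.5 * math.log(2.0 * math.pi * var) - (value - loc) ** 2 / (2.0 * var)


def elbo_particle(x_obs, guide_loc, guide_scale, rng):
    """One-sample ELBO estimate for the toy model (illustrative, not Pyro)."""
    # Sample the latent from the guide, as the guide trace would record it.
    z = rng.gauss(guide_loc, guide_scale)

    # Model sites: the latent prior AND the observed likelihood both contribute.
    model_log_prob = normal_log_prob(z, 0.0, 1.0) + normal_log_prob(x_obs, z, 1.0)
    # Guide sites: only the latent.
    guide_log_prob = normal_log_prob(z, guide_loc, guide_scale)

    return model_log_prob - guide_log_prob


def negative_elbo(x_obs, guide_loc, guide_scale, num_particles=1000, seed=0):
    """Average several particles and negate, giving a loss to minimize."""
    rng = random.Random(seed)
    total = sum(
        elbo_particle(x_obs, guide_loc, guide_scale, rng)
        for _ in range(num_particles)
    )
    return -total / num_particles
```

For this conjugate toy model the exact posterior is Normal(x/2, sqrt(1/2)), so calling negative_elbo(1.0, 0.5, math.sqrt(0.5)) recovers the negative log evidence, while a poorly matched guide such as negative_elbo(1.0, 3.0, 1.0) gives a larger loss.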
I'm going to log my activities regarding probabilistic programming languages on this issue page. This will lead to a new BNN model using a more powerful Python library, i.e. Pyro. I'll also put some useful links here:
https://pyro.ai/examples/
https://youtu.be/tw0cSm7TElE?si=6get0ucfuV0mAN2y
Agenda
[x] Variational Inference in BNNs
[x] Finding probabilistic libraries for Python (Pyro)
[x] What are pyro.sample, pyro.distributions, pyro.param, and pyro.plate?
[x] Implementing a Bayesian regression classifier
[x] Implement a simple BNN.
[ ] Stochastic Variational Inference (SVI) {in progress}
[ ] SVI in Pyro {in progress}: all of the learning process happens in svi.step(*args, **kwargs), which forwards its arguments to both the model and the guide.
[ ] Learning how ELBO works {in progress}: By debugging the code, I have found that in practice the loss value is calculated by subtracting the summed log probabilities of the guide's sample sites from the summed log probabilities of the model's sample sites (the BNN weights and biases, sigma, and the observed obs site). The prediction of the model is not ignored: mu enters the loss through the log probability of the obs site.
[ ] Pair Programming with Dr. Fani
[ ] Implement it on OpeNTF
[ ] Find a way to add CL (needs a unique idea but can be a great research milestone)
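As a rough picture of what repeated svi.step calls accomplish, the sketch below hand-rolls gradient ascent on a reparameterized ELBO estimate for a toy problem. It is only an illustration under stated assumptions, not how Pyro implements SVI: the model (z ~ Normal(0, 1), x ~ Normal(z, 1) with x observed), the Normal guide with learnable loc and log_scale, the manual gradients, and the name fit_guide are all hypothetical. Plain SGD is used in place of Adam.

```python
import math
import random


def fit_guide(x_obs, num_steps=2000, num_particles=64, lr=0.05, seed=0):
    """Fit a Normal guide to the toy posterior by stochastic gradient ascent.

    Model: z ~ Normal(0, 1), x ~ Normal(z, 1) with x observed.
    Guide: z ~ Normal(loc, exp(log_scale)); loc and log_scale are learned.
    """
    rng = random.Random(seed)
    loc, log_scale = 0.0, 0.0
    for _ in range(num_steps):
        scale = math.exp(log_scale)
        grad_loc, grad_log_scale = 0.0, 0.0
        for _ in range(num_particles):
            eps = rng.gauss(0.0, 1.0)
            z = loc + scale * eps  # reparameterized sample from the guide
            # Derivative of the model log-joint, log p(z) + log p(x_obs | z),
            # with respect to z. The observation x_obs appears here.
            dlogp_dz = -z + (x_obs - z)
            grad_loc += dlogp_dz
            # Chain rule through z, plus 1.0 from the guide's entropy term.
            grad_log_scale += dlogp_dz * scale * eps + 1.0
        # Ascending the ELBO is the same as descending the loss (-ELBO).
        loc += lr * grad_loc / num_particles
        log_scale += lr * grad_log_scale / num_particles
    return loc, math.exp(log_scale)
```

With x_obs = 1.0 the fitted values should land close to the exact posterior, loc near 0.5 and scale near sqrt(0.5), which shows the observed data steering the guide through the likelihood term.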