Closed: ZohrehAdabi closed this issue 3 years ago.
Hi, so the MLL shouldn't be returning a scalar -- the trick with the Dirichlet likelihood is to model the outputs as `num_classes` outputs, so the MLL should return a vector of size `num_classes`.
What version of PyTorch and GPyTorch are you using?
I wasn't able to reproduce your shape error once I passed in the inputs correctly, although I did find a small bug (#1728) while trying to reproduce it.
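To make the shape convention concrete, here is a minimal sketch with hypothetical toy data (only the shapes matter; `num_classes` and `transformed_targets` are attributes the likelihood exposes):

```python
import torch
import gpytorch

# Hypothetical toy data: 100 points with binary labels in {0, 1}.
train_x = torch.randn(100, 2)
train_y = (train_x.sum(dim=-1) > 0).long()

likelihood = gpytorch.likelihoods.DirichletClassificationLikelihood(train_y)
print(likelihood.num_classes)                # 2
print(likelihood.transformed_targets.shape)  # expected: torch.Size([2, 100]), one row per class
```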
Thank you for your reply. I use GPyTorch 1.5.0 and PyTorch 1.8.1. It is right to sum the loss, isn't it? Here is the error, which occurs only at test time:
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\ADABI\anaconda\anaconda3\lib\site-packages\gpytorch\models\exact_gp.py", line 322, in __call__
    predictive_mean = predictive_mean.view(*batch_shape, *test_shape).contiguous()
RuntimeError: shape '[100]' is invalid for input of size 200
```
Yes, you should sum the loss as shown in the tutorial.
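In shape terms, the fix is just a reduction over the class dimension before backprop. A trivial, self-contained illustration (the tensor below stands in for the `[num_classes]`-shaped output of `-mll(...)`):

```python
import torch

# Stand-in for the per-class negative MLL values of a 2-class problem.
per_class_loss = torch.tensor([0.73, 0.91], requires_grad=True)
loss = per_class_loss.sum()  # scalar
loss.backward()              # backward() now works as usual
```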
> I passed in the inputs correctly
What do you mean by "I passed in the inputs correctly"? I tried to reshape the input, but I am still getting a shape error.
Yes, so the transformation in the `DirichletClassificationLikelihood` produces pseudo-targets, so to speak (in the tutorial, these are `likelihood.transformed_targets`) -- you will want to pass them to the model as its training targets and also train against them in the MLL. Additionally, you end up needing to pass the number of classes into at least the mean module. So code like this runs fine for me:
```python
import torch
import gpytorch

# train_x and train_y are assumed to be defined as in the original post.


class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, kernel='rbf', inducing_points=None):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        # Batched mean: one constant mean per class (2 classes here).
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size((2,)))
        # RBF kernel with inducing points
        if kernel == 'rbf' or kernel == 'RBF':
            self.base_covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
            self.covar_module = gpytorch.kernels.InducingPointKernel(
                self.base_covar_module, inducing_points=inducing_points, likelihood=likelihood
            )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


# Initialize model and likelihood
inducing_point = train_x[:10]
train_y = torch.round(train_y).long()
likelihood = gpytorch.likelihoods.DirichletClassificationLikelihood(targets=train_y, learn_additional_noise=False)
# NOTE THE TRANSFORM HERE: the model is constructed on the transformed targets.
model = ExactGPModel(train_x, likelihood.transformed_targets, likelihood, 'rbf', inducing_point)

training_iterations = 200

# Find optimal model hyperparameters
model.train()
likelihood.train()

# Use the Adam optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(training_iterations):
    # Zero backpropped gradients from the previous iteration
    optimizer.zero_grad()
    # Get predictive output
    output = model(train_x)
    # Calc loss and backprop gradients.
    # The targets we train against are the transformed_targets, summed over classes.
    loss = -mll(output, likelihood.transformed_targets).sum()
    loss.backward()
    if (i + 1) % 50 == 0:
        print(f'Iter {i + 1:02}/{training_iterations} - Loss: {loss.item():.4f}')
    optimizer.step()
```
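At test time the same batch dimension shows up in the predictions. As a usage sketch continuing from the code above (`test_x` is a hypothetical tensor of test inputs; taking the argmax of the per-class means is one simple decision rule, and the tutorial also demonstrates sampling-based class probabilities):

```python
model.eval()
likelihood.eval()

with torch.no_grad(), gpytorch.settings.fast_pred_var():
    test_dist = model(test_x)    # batched MultivariateNormal: one GP per class
    pred_means = test_dist.mean  # shape: [num_classes, num_test], e.g. [2, 100]

pred_class = pred_means.argmax(dim=0)  # shape: [num_test] -- predicted class labels
```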
Thank you very much for your clear explanation and the code corrections; all errors have been resolved.
Hi,
I want to do binary classification with `ExactGP` and `DirichletClassificationLikelihood`. I have two problems:
1. `mll = ExactMarginalLogLikelihood(...)` is not returning a scalar loss; its shape is `[2]`.
2. I tried `mll(output, train_y).sum()`, and training then runs. But at test time I get an error when using the model to predict `test_x`: `RuntimeError: shape '[100]' is invalid for input of size 200`. Here is the code to reproduce the errors.
What is wrong with my usage of `DirichletClassificationLikelihood`? Thanks in advance for your help.
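For anyone hitting the same RuntimeError: the 200 in the message is `num_classes * num_test` (2 classes times 100 test points), i.e. the predictive mean carries a class batch dimension that a flat `[100]` view cannot absorb. A purely illustrative sketch of that arithmetic:

```python
import torch

num_classes, num_test = 2, 100
predictive_mean = torch.zeros(num_classes * num_test)  # 200 elements, as in the traceback

predictive_mean.view(num_classes, num_test)  # fine once the class dimension is kept
# predictive_mean.view(num_test)             # RuntimeError: shape '[100]' is invalid for input of size 200
```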