cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

[Docs] Regarding using additive kernels in SV Deep Kernel learning for classification #1202

Open riddikkulus96 opened 4 years ago

riddikkulus96 commented 4 years ago

Hello,

I've been following the SVDKL example provided in the documentation. If I remember the source paper correctly, the outputs of the neural network are each fed to a set of independent GPs (in other words, each GP gets a subset of the features), and these GPs are then combined additively. The current example instead applies a single RBF kernel to all of the features. I'm unsure how to implement this "independent GPs on subsets of the features" idea. Any recommendations?

KeAWang commented 4 years ago

You can add kernels, with each kernel applied to its own slice of the features:

import gpytorch
from gpytorch.kernels import RBFKernel, ScaleKernel
from gpytorch.models import ApproximateGP
from gpytorch.variational import CholeskyVariationalDistribution, VariationalStrategy


class GPModel(ApproximateGP):
    def __init__(self, inducing_points):
        variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
        variational_strategy = VariationalStrategy(
            self, inducing_points, variational_distribution, learn_inducing_locations=True
        )
        super().__init__(variational_strategy)
        self.prior_mean = gpytorch.means.ConstantMean()
        # One kernel per feature subset; each kernel only ever sees its own slice of the input
        self.kernel1 = ScaleKernel(RBFKernel())
        self.kernel2 = ScaleKernel(RBFKernel())

    def forward(self, x):
        mean = self.prior_mean(x)
        # Summing kernels over disjoint feature slices gives an additive GP
        covar = self.kernel1(x[:, :2]) + self.kernel2(x[:, 2:])
        return gpytorch.distributions.MultivariateNormal(mean, covar)
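For example, a minimal usage sketch (the 4-dimensional inputs, the inducing-point count, and the random data below are placeholder assumptions, not part of the original example):

import torch

# Hypothetical setup: 4-dimensional features, so kernel1 sees x[:, :2] and kernel2 sees x[:, 2:]
inducing_points = torch.randn(50, 4)
model = GPModel(inducing_points)

x = torch.randn(100, 4)
output = model(x)  # MultivariateNormal whose covariance is the sum of the two kernels
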
gpleiss commented 4 years ago

@riddikkulus96 the output of the GPModel in the SV-DKL classification example is an independent MultitaskMultivariateNormal distribution (i.e. a batch of independent Gaussians). The likelihood then passes these independent Gaussians through a linear mixing layer to produce the softmax outputs. That linear mixing layer is what performs the additive composition of the independent GPs.
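
A minimal sketch of that structure, loosely following the SV-DKL classification example (the dimensions are placeholders, and a plain VariationalStrategy is used here instead of the grid-interpolation strategy from the docs, so treat this as an assumption rather than the exact example code):

import torch
import gpytorch

num_features, num_classes, num_inducing = 10, 4, 64


class BatchGPLayer(gpytorch.models.ApproximateGP):
    def __init__(self):
        # One set of inducing points / variational parameters per GP (leading batch dimension)
        inducing_points = torch.randn(num_features, num_inducing, 1)
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            num_inducing, batch_shape=torch.Size([num_features])
        )
        # Wrapping the batched strategy yields a MultitaskMultivariateNormal,
        # i.e. a batch of independent Gaussians, one per NN feature
        variational_strategy = gpytorch.variational.IndependentMultitaskVariationalStrategy(
            gpytorch.variational.VariationalStrategy(
                self, inducing_points, variational_distribution, learn_inducing_locations=True
            ),
            num_tasks=num_features,
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([num_features]))
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([num_features])),
            batch_shape=torch.Size([num_features]),
        )

    def forward(self, x):
        mean = self.mean_module(x)
        covar = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean, covar)


# The likelihood's mixing weights are the linear layer that additively combines the GPs
likelihood = gpytorch.likelihoods.SoftmaxLikelihood(num_features=num_features, num_classes=num_classes)

# Hypothetical NN output of shape (n, num_features); reshape so GP i sees only feature i
features = torch.randn(32, num_features)
model = BatchGPLayer()
output = model(features.transpose(-1, -2).unsqueeze(-1))  # MultitaskMultivariateNormal
pred = likelihood(output)  # softmax over the linearly mixed GP samples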