cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

Hi, I am [Aryan](https://aryandeshwal.github.io/). I am a PhD student at Washington State University. I like GPyTorch a lot and regularly use it in my own research on combinatorial Bayesian optimization. Thanks for the great library! I think adding a [diffusion kernel](https://www.ml.cmu.edu/research/dap-papers/kondor-diffusion-kernels.pdf) for discrete/categorical input spaces would be a nice addition to the library. It is a very useful kernel (an extension of RBF to discrete spaces) for Bayesian optimization over [discrete/combinatorial](https://arxiv.org/abs/2012.07762) [spaces](https://arxiv.org/abs/1902.00448) (potentially good for [BoTorch](https://github.com/pytorch/botorch)). #1424

Closed ghost closed 3 years ago

ghost commented 3 years ago


I am providing my implementation of the diffusion kernel in GPyTorch format below, hoping it will be useful. It is a batch-compatible implementation of the ARD version of the kernel that supports an arbitrary number of categories in each dimension.
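For reference, here is the closed form that the code below appears to evaluate, read off from the implementation itself. The symbols are my own shorthand and not part of the library: $\beta_i$ stands for the $i$-th entry of `self.lengthscale`, $c_i$ for the $i$-th entry of `categories`, and $[\cdot]$ is 1 when the condition holds and 0 otherwise.

$$
k(x, x') = \prod_{i=1}^{n} \left( \frac{1 - e^{-c_i \beta_i}}{1 + (c_i - 1)\, e^{-c_i \beta_i}} \right)^{[x_i \neq x'_i]}
$$

Each factor is 1 when the two inputs agree in dimension $i$, so $k(x, x) = 1$. In the binary case ($c_i = 2$) the factor for a mismatch reduces to $\tanh(\beta_i)$.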

from typing import Optional
import torch
from gpytorch.kernels import Kernel

class DiffusionKernel(Kernel):
    r"""
    Computes the diffusion kernel over discrete spaces with an arbitrary number of categories.

    Input type: n-dimensional discrete input with c_i possible categories/choices for each dimension i.
    As an example, the binary {0, 1} combinatorial space corresponds to c_i = 2 for each dimension i.

    References:
        - https://www.ml.cmu.edu/research/dap-papers/kondor-diffusion-kernels.pdf (Section 4.4)
        - https://arxiv.org/abs/1902.00448
        - https://arxiv.org/abs/2012.07762

    Args:
        :attr:`categories` (tensor, list):
            Number of possible categories in each dimension.
    """
    has_lengthscale = True
    def __init__(self, categories, **kwargs):
        if categories is None:
            raise RuntimeError("Can't create a diffusion kernel without number of categories. Please define them!")
        super().__init__(**kwargs)
        self.cats = categories

    def forward(self, x1, x2, diag: Optional[bool] = False, last_dim_is_batch: Optional[bool] = False, **params):
        if last_dim_is_batch:
            # After this reshape the trailing dimension has size 1, so the loop below
            # only uses categories[0] and the first lengthscale entry.
            x1 = x1.transpose(-1, -2).unsqueeze(-1)
            x2 = x2.transpose(-1, -2).unsqueeze(-1)

        res = 1.0
        for i in range(x1.shape[-1]):
            c = self.cats[i]
            # Per-dimension similarity for mismatched categories:
            # (1 - exp(-beta_i * c_i)) / (1 + (c_i - 1) * exp(-beta_i * c_i))
            decay = torch.exp(-self.lengthscale[..., i] * c)
            ratio = (1 - decay) / (1 + (c - 1) * decay)
            if diag:
                # Compare matching rows only: result has shape (..., n).
                differs = x1[..., i] != x2[..., i]
            else:
                # Compare every row of x1 with every row of x2: shape (..., n, m).
                ratio = ratio.unsqueeze(-1)
                differs = x1[..., i].unsqueeze(-1) != x2[..., i].unsqueeze(-2)
            # A dimension contributes `ratio` where the categories differ and 1 otherwise.
            res = res * torch.where(differs, ratio, torch.ones_like(ratio))
        return res
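
To sanity-check the numbers the kernel should produce, a dependency-free reference can help. The following is only a minimal sketch in plain Python, not part of GPyTorch: the function name `diffusion_kernel_value` and the argument `betas` (standing in for the entries of `self.lengthscale`) are hypothetical names chosen for illustration.

```python
import math
from itertools import product


def diffusion_kernel_value(x, y, categories, betas):
    """Reference value of the normalized diffusion kernel between two categorical vectors.

    x, y        -- sequences of category indices, one per dimension
    categories  -- number of categories c_i in each dimension
    betas       -- per-dimension diffusion parameters (assumed to play the role of lengthscale)
    """
    value = 1.0
    for xi, yi, c, beta in zip(x, y, categories, betas):
        if xi != yi:
            decay = math.exp(-beta * c)
            # Mismatched dimension: multiply by the off-diagonal / diagonal ratio.
            value *= (1.0 - decay) / (1.0 + (c - 1) * decay)
        # Matching dimension: factor is 1, nothing to do.
    return value


# Gram matrix over every point of a small space with 2 and 3 categories.
categories = [2, 3]
betas = [0.5, 1.0]
points = list(product(*(range(c) for c in categories)))
gram = [
    [diffusion_kernel_value(p, q, categories, betas) for q in points]
    for p in points
]
```

Comparing `gram` against the dense output of the GPyTorch kernel on the same points (with matching parameters) is one way to check the tensor implementation; the diagonal should be all ones and the matrix symmetric.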

Thanks and Happy new year!

Originally posted by @aryandeshwal in https://github.com/cornellius-gp/gpytorch/discussions/1411

gpleiss commented 3 years ago

Can you open up a pull request rather than an issue?