soldasim / BOSS.jl

BOSS (Bayesian Optimization with Semiparametric Surrogate)
MIT License

AutoDiff fails with specific domains #4

Closed soldasim closed 1 year ago

soldasim commented 1 year ago

Autodiff returns NaNs for certain domain settings.

To reproduce, use the following domain in the /scripts/example.jl example:

domain = BOSS.Domain(;
    bounds = ([0.,0.], [20.,20.]),
    discrete = [true,false],
)

The issue is with differentiating the model posterior in the acquisition function.

soldasim commented 1 year ago
julia> k = BOSS.AbstractGPs.Matern52Kernel()
Matern 5/2 Kernel (metric = Distances.Euclidean(0.0))

julia> k(0., 0.)
1.0

julia> BOSS.ForwardDiff.derivative(x->k(x,0.), 0.)
NaN

ForwardDiff cannot differentiate the kernel's covariance function at k(x,x) (i.e. when both arguments are equal) and returns NaN there. This would make the GP posterior non-differentiable under autodiff.
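One plausible mechanism for the NaN above is the square root inside the Euclidean distance: at zero distance the chain rule multiplies an infinite outer derivative by a zero inner derivative. The following is only a sketch with a hand-rolled dual number; the names `Dual`, `square`, `root` and `dist` are made up for illustration and are not ForwardDiff's or Distances.jl's actual implementation.

```julia
# Minimal forward-mode dual number: a value and its derivative.
# (Hypothetical type for illustration, not ForwardDiff.Dual.)
struct Dual
    val::Float64
    der::Float64
end

# Chain rule for the two operations a 1-D Euclidean distance needs.
square(a::Dual) = Dual(a.val^2, 2 * a.val * a.der)
root(a::Dual) = Dual(sqrt(a.val), a.der / (2 * sqrt(a.val)))

# Distance between x and 0, seeded to differentiate with respect to x.
dist(x::Float64) = root(square(Dual(x, 1.0)))

at_one = dist(1.0)    # well-defined derivative away from zero
at_zero = dist(0.0)   # derivative evaluates 0.0 / 0.0

println(at_one.der)
println(isnan(at_zero.der))
```

The distance itself is evaluated correctly at zero; only the derivative is lost, which matches k(0., 0.) returning 1.0 while its derivative is NaN.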

If all dimensions are discrete, the posterior is differentiable, probably thanks to the round function in DiscreteKernel.

EDIT: This does not seem to be the issue.

soldasim commented 1 year ago
julia> k = BOSS.DiscreteKernel(BOSS.AbstractGPs.Matern52Kernel(), [true, false])
BOSS.DiscreteKernel{Vector{Bool}}(Matern 5/2 Kernel (metric = Distances.Euclidean(0.0)), Bool[1, 0])

julia> BOSS.ForwardDiff.gradient(x->k(x,[0.,0.]), [1.,1.])
2-element Vector{Float64}:
  0.0
 -0.29364327535004525

Differentiation through DiscreteKernel with one discrete and one continuous dimension works fine.

soldasim commented 1 year ago

The problem seems to be differentiating the variance of the nonparametric posterior when at least one dimension is discrete and at least one dimension is continuous:

AbstractGPs.var(::FiniteGP)

soldasim commented 1 year ago

Differentiation of the nonparametric posterior works with kernels of type SimpleKernel. (It has something to do with them using the kappa and metric functions instead of simply calling (::SimpleKernel)(x, y).)

The solution for making BOSS.DiscreteKernel work (when the inner kernel is a SimpleKernel) is to implement the method KernelFunctions.kernelmatrix_diag(::DiscreteKernel, ::AbstractVector) and have it call the KernelFunctions.kernelmatrix_diag method of the inner kernel.
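A rough sketch of what such a method could look like is given below. `DiscreteKernelSketch` and `discretize` are hypothetical stand-ins, since the real BOSS.DiscreteKernel definition is not shown in this thread; only the field layout (an inner kernel plus a Bool vector) is guessed from the REPL output above. The sketch forwards the inputs unrounded, which assumes the inner kernel depends only on the distance between its arguments, so that rounding both copies of x identically would not change the diagonal.

```julia
using KernelFunctions

# Hypothetical stand-in for BOSS.DiscreteKernel (assumed layout, not the real type).
struct DiscreteKernelSketch{K<:Kernel,D<:AbstractVector{Bool}} <: Kernel
    kernel::K
    dims::D
end

# Round only the coordinates flagged as discrete (assumed behaviour).
discretize(dims, x::AbstractVector{<:Real}) = ifelse.(dims, round.(x), x)

# Pointwise evaluation: discretize both inputs, then call the inner kernel.
function (k::DiscreteKernelSketch)(x, y)
    return k.kernel(discretize(k.dims, x), discretize(k.dims, y))
end

# Forward the diagonal computation to the inner kernel's own method
# instead of falling back to pointwise evaluation of k(x, x).
function KernelFunctions.kernelmatrix_diag(k::DiscreteKernelSketch, x::AbstractVector)
    return kernelmatrix_diag(k.kernel, x)
end

# Usage: one discrete and one continuous dimension, two input points as columns.
k = DiscreteKernelSketch(Matern52Kernel(), [true, false])
X = ColVecs([0.4 1.6; 0.3 2.7])
println(kernelmatrix_diag(k, X))
```

For an inner kernel whose value at k(x,x) does depend on x, the inputs would presumably need to be discretized before forwarding.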