JuliaGaussianProcesses / KernelFunctions.jl

Julia package for kernel functions for machine learning
https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/
MIT License

Quick question #555

Open maxasauruswall opened 5 months ago

maxasauruswall commented 5 months ago

Hi All,

Thanks for the wonderful package.

I'm just starting to explore the source code. I was curious about a very small decision.

If I do:

k = SqExponentialKernel()

then k.metric is an instance of Distances.Euclidean, but if I do metric(k), it returns an instance of Distances.SqEuclidean. See: https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/blob/master/src/basekernels/exponential.jl#L21-L27

Just wondering why the struct stores a different metric than the one it invokes when I call, say, k(x1, x2), which calls metric(k)(x1, x2). Why not just make the default metric SqEuclidean?

Thanks for the help, and the package.

Cheers, Max

willtebbutt commented 5 months ago

Hi Max, thanks for opening the issue.

There are two different notions of metric in this kernel:

  1. a computational device -- the Distances.SqEuclidean metric is used in the computation of kernelmatrix et al, e.g. here
  2. a semantic thing to specify the definition of the kernel.

Concretely, KernelFunctions.jl's definition of the exponentiated quadratic kernel is k(x, y) = exp(-k.metric(x, y)^2 / 2). However, when k.metric is the Euclidean distance, the best way to compute the kernel is not to compute the Euclidean distance between x and y and then square the result; it's to compute the squared Euclidean distance directly. This is where metric(k) comes in -- if you take a look at the link above you'll see how it works.
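A minimal sketch of that dispatch pattern in plain Julia. All the `Toy*` and `toy_*` names are hypothetical stand-ins made up for illustration; this is not the package's source, and the real metric types live in Distances.jl:

```julia
# Stand-in metric types (hypothetical; not the Distances.jl types).
struct ToyEuclidean end
struct ToySqEuclidean end

# Make the metric objects callable on two vectors.
(::ToyEuclidean)(x, y) = sqrt(sum(abs2, x .- y))
(::ToySqEuclidean)(x, y) = sum(abs2, x .- y)

# Toy kernel that stores the "semantic" metric in a field.
struct ToySqExpKernel{M}
    metric::M
end
ToySqExpKernel() = ToySqExpKernel(ToyEuclidean())

# "Computational" metric: the stored one by default, but the squared
# distance when the stored metric is the Euclidean one.
toy_metric(k::ToySqExpKernel) = k.metric
toy_metric(::ToySqExpKernel{ToyEuclidean}) = ToySqEuclidean()

# Map the computed distance to a kernel value. The Euclidean method
# receives an already-squared distance, so it must not square again.
toy_kappa(::ToySqExpKernel, d::Real) = exp(-d^2 / 2)
toy_kappa(::ToySqExpKernel{ToyEuclidean}, d2::Real) = exp(-d2 / 2)

# Evaluating the kernel goes through toy_metric(k), not k.metric.
(k::ToySqExpKernel)(x, y) = toy_kappa(k, toy_metric(k)(x, y))
```

With this sketch, `ToySqExpKernel().metric` is a `ToyEuclidean` while `toy_metric(ToySqExpKernel())` is a `ToySqEuclidean`, yet the kernel value still matches exp(-d^2 / 2) with d the Euclidean distance.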

I'm still on the fence about this design choice because a) you can't just change k.metric to a different metric and still be guaranteed to have a valid kernel, and b) it's confusing. I think we're going to stick with it for now, but it might get revised in some future (breaking) release of KernelFunctions.jl.

Does this explain what's going on?

maxasauruswall commented 5 months ago

Hi Will,

I think I get it. The efficiency comes here, where, if you were to take the Euclidean distance, you'd have to take the square root, then square it again, correct? And by using SqEuclidean, you just skip both operations?
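As a quick numeric check of that reading, here is a sketch in plain Julia. The vectors are made up for illustration, and it uses only Base functions rather than Distances.jl:

```julia
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

# Route 1: Euclidean distance (needs a sqrt), then square it again.
d = sqrt(sum(abs2, x .- y))
via_euclidean = d^2

# Route 2: squared Euclidean distance directly, no sqrt and no squaring.
via_sqeuclidean = sum(abs2, x .- y)

# Both routes feed the same value into exp(-d^2 / 2),
# up to floating-point rounding.
same = isapprox(via_euclidean, via_sqeuclidean)
```

The second route skips the sqrt and the squaring while producing the same kernel value.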

It is a bit confusing, but the complication is hidden behind the convention of not directly accessing struct fields, so it makes sense.

Thanks! Max