JuliaGaussianProcesses / KernelFunctions.jl

Julia package for kernel functions for machine learning
https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/
MIT License

Does it allow the user to add a new custom kernel? #558

Closed WuSiren closed 3 months ago

WuSiren commented 3 months ago

First of all, thanks for this useful package!

I would like to add a customized new kernel. Is that possible, and how? Is there a guide for it?

Thank you in advance!

willtebbutt commented 3 months ago

Yes. Please take a look at the "Custom Kernels" section of the docs for info on how to go about this. Please do ask for more info if something is not clear.
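
For reference, a minimal sketch along the lines of that docs section, defining a stationary kernel as a SimpleKernel via kappa and metric (MyKernel and the exponentiated squared-distance form are placeholders, not anything shipped with the package):

```julia
using KernelFunctions
using Distances: SqEuclidean

# Hypothetical stationary kernel k(x, y) = exp(-‖x - y‖²), for illustration only.
struct MyKernel <: KernelFunctions.SimpleKernel end

# kappa maps the distance value to the kernel value ...
KernelFunctions.kappa(::MyKernel, d2::Real) = exp(-d2)

# ... and metric declares which distance gets fed into kappa.
KernelFunctions.metric(::MyKernel) = SqEuclidean()

k = MyKernel()
k([1.0, 2.0], [1.5, 2.5])  # SimpleKernel makes the kernel callable for free
```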

WuSiren commented 3 months ago

Oh, sorry. I mistook this page for the entire documentation. I'll read it. Thanks!

willtebbutt commented 3 months ago

Ahh I see. We should figure out why that is happening -- I don't know why juliahub is grabbing an old version of the docs.

WuSiren commented 3 months ago

Hi, @willtebbutt ! May I ask you some questions on the custom kernel?

Q1. I'm planning to define a custom kernel function that depends on a metric which does not seem to be listed here. Should I define the metric first and then define the kernel as a SimpleKernel, or should I define the kernel directly as a more complex kernel?

Q2. Is there more information on trainable kernels? Some parameters of the kernel I plan to construct are best specified based on information from the training data. Will this make it a trainable kernel? Do I need to use Functors.jl? What will I lose if I don't use it?

Q3. How should I define an argument check within my kernel's struct?

Looking forward to your reply at your convenience.

willtebbutt commented 3 months ago

Q1. I'm planning to define a custom kernel function that depends on a metric which does not seem to be listed here. Should I define the metric first and then define the kernel as a SimpleKernel, or should I define the kernel directly as a more complex kernel?

Either of these should be fine -- I would just go with whichever is most convenient for you. Do note that you may need to define methods of kernelmatrix directly (see bullet point 4 of https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/create_kernel/#Additional-Options) in order to get good performance.
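
If the metric route turns out to be awkward, a sketch of the direct route could look like the following (MyDirectKernel and the L1-distance form are hypothetical; the callable definition is the only part KernelFunctions.jl requires):

```julia
using KernelFunctions
using LinearAlgebra: norm

# Hypothetical kernel built on a distance not shipped with Distances.jl.
struct MyDirectKernel <: KernelFunctions.Kernel end

# Making the kernel callable on a pair of inputs is all the generic
# fallbacks need; kernelmatrix then defaults to elementwise evaluation.
(k::MyDirectKernel)(x, y) = exp(-norm(x - y, 1))
```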

Q2. Is there more information on trainable kernels? Some parameters of the kernel I plan to construct are best specified based on information from the training data. Will this make it a trainable kernel? Do I need to use Functors.jl? What will I lose if I don't use it?

If the parameters in your kernels are a fixed function of your training data (i.e. you're not optimising them), then there is really nothing to worry about here. If you do have parameters that you're planning to optimise (say, using Optim.jl or Optimisers.jl), my advice is to make use of ParameterHandling.jl to make it easy to create your kernel. See e.g. https://github.com/JuliaGaussianProcesses/ParameterHandling.jl for more info. Note that when using ParameterHandling.jl, you don't need to modify your kernel at all to accommodate its use.
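
A hedged sketch of that workflow (the parameter names and the SEKernel-based model are illustrative, not prescriptive):

```julia
using KernelFunctions
using ParameterHandling

# Hypothetical parameters; `positive` enforces positivity through an
# unconstrained reparametrisation that optimisers can work with.
raw = (variance=positive(1.0), lengthscale=positive(2.5))

# flatten returns a plain Vector{Float64} plus a function that rebuilds `raw`.
θ, unflatten = ParameterHandling.flatten(raw)

# Rebuild the kernel from the flat vector, e.g. inside an objective function.
function build_kernel(θ)
    p = ParameterHandling.value(unflatten(θ))
    return p.variance * with_lengthscale(SEKernel(), p.lengthscale)
end
```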

Q3. How should I define an argument check within my kernel's struct?

I would advise using our @check_args macro. See e.g. https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/blob/c38b36639075e58a3f3add988c52282fc0564035/src/kernels/scaledkernel.jl#L20. It's known to work well with our existing AD tools.
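
A sketch mirroring the linked ScaledKernel constructor (MyCheckedKernel and α are placeholders; note that @check_args is internal, so it needs an explicit import):

```julia
using KernelFunctions
using KernelFunctions: @check_args  # internal macro, hence the explicit import

# Hypothetical kernel whose parameter α must be strictly positive.
struct MyCheckedKernel{T<:Real} <: KernelFunctions.Kernel
    α::T
    function MyCheckedKernel(α::T) where {T<:Real}
        @check_args(MyCheckedKernel, α, α > zero(T), "α > 0")
        return new{T}(α)
    end
end

(k::MyCheckedKernel)(x, y) = exp(-k.α * sum(abs2, x - y))
```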

Please do let me know if you have any more questions, or if any of this is unclear!

WuSiren commented 3 months ago

I can't thank you enough for your kind help!!! 🤝🤝🤝

WuSiren commented 3 months ago

Do note that you may need to define methods of kernelmatrix directly (see bullet point 4 of https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/create_kernel/#Additional-Options) in order to get good performance.

I just have a small question here. In what situation should I define additional methods for kernelmatrix? I found it worked well using the built-in kernelmatrix function. Is there any (better) method other than computing the kernel matrix element-wise, i.e., [mykernel(x, y) for x in eachcol(data), y in eachcol(data)]?

Do you mean parallel computing or something related?

willtebbutt commented 3 months ago

I just have a small question here. In what situation should I define additional methods for kernelmatrix? I found it worked well using the built-in kernelmatrix function. Is there any (better) method other than computing the kernel matrix element-wise, i.e., [mykernel(x, y) for x in eachcol(data), y in eachcol(data)]?

Do you mean parallel computing or something related?

If the default elementwise approach is working well for you, then there is no need to implement anything else.

It is often the case, however, that it is possible to take advantage of e.g. matrix-matrix multiplications to perform much of the work involved in constructing a kernel matrix. Since matrix-matrix multiplications are highly optimised, this can offer much improved performance in some situations. For example, if you look at our implementation of the SEKernel, you'll see it makes use of the SqEuclidean metric to compute squared pairwise distances between all pairs of data points -- under the hood this uses matrix-matrix multiplications, which yields substantial speedups over elementwise computation.

Whether or not these kinds of optimisations would be useful in your case will of course depend on your particular kernel!
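
To make that concrete, here is a hedged sketch of what such a specialisation could look like for a hypothetical exponentiated-squared-distance kernel, batching the distance computations through Distances.pairwise:

```julia
using KernelFunctions
using Distances

# Hypothetical kernel k(x, y) = exp(-‖x - y‖²), defined elementwise.
struct MyExpKernel <: KernelFunctions.Kernel end
(k::MyExpKernel)(x, y) = exp(-sum(abs2, x - y))

# Specialised kernelmatrix: one batched pairwise-distance call (which is
# BLAS-backed for SqEuclidean) instead of N² separate kernel evaluations.
function KernelFunctions.kernelmatrix(::MyExpKernel, x::ColVecs)
    D2 = Distances.pairwise(SqEuclidean(), x.X; dims=2)
    return exp.(-D2)
end

data = randn(3, 100)  # 100 inputs, each of dimension 3
K = kernelmatrix(MyExpKernel(), ColVecs(data))
```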

WuSiren commented 3 months ago

I see! Thank you very much!