Clarification on the difference between an input vs. output kernel

Hi,

In addition, is it necessary that y_kernel and x_kernel are the same? My intuition is that they should be, but from what I can see in the code, that is not enforced. What is the rationale for projecting y and X into different spaces?

We can use different kernels for the input and the output. For classification, we generally need to use the delta kernel for the output; otherwise, performance can be poor.

For instance, suppose the label y takes the values 1, 2, 3 (i.e., 3-class classification). With the Gaussian kernel, we compute exp(-||y_i - y_j||^2/(2s^2)), whereas the delta kernel is 1 if y_i and y_j are the same and 0 otherwise. With the Gaussian kernel, the similarity between class 1 and class 3 is lower than the similarity between class 1 and class 2, because the label values are treated as numbers. This is not a good property for classification, where the class labels have no inherent order. (If we know that class 1 and class 2 should be closer than class 1 and class 3, it may be a good idea to use the Gaussian kernel.)
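
To make the comparison concrete, here is a minimal sketch that evaluates both kernels on the labels 1, 2, 3. It is only an illustration of the two formulas above and not the library's own implementation: the function names gaussian_kernel and delta_kernel and the bandwidth s = 1.0 are assumptions chosen for this example.

```python
import math


def gaussian_kernel(y, s=1.0):
    """Gram matrix with entries exp(-(y_i - y_j)^2 / (2 s^2)) for scalar labels."""
    return [[math.exp(-((yi - yj) ** 2) / (2.0 * s ** 2)) for yj in y] for yi in y]


def delta_kernel(y):
    """Gram matrix with entries 1 if y_i == y_j, 0 otherwise."""
    return [[1.0 if yi == yj else 0.0 for yj in y] for yi in y]


# One sample per class, labels encoded as 1, 2, 3 (hypothetical toy data).
labels = [1, 2, 3]
K_gauss = gaussian_kernel(labels, s=1.0)  # s = 1.0 is an arbitrary choice
K_delta = delta_kernel(labels)

for name, K in (("Gaussian", K_gauss), ("delta", K_delta)):
    print(name)
    for row in K:
        print("  ", [round(v, 3) for v in row])
```

In the Gaussian matrix, the entry for classes 1 and 2 is exp(-1/2) while the entry for classes 1 and 3 is exp(-2), so the kernel implies an ordering of the classes. The delta matrix gives 0 for every pair of different classes, so all classes are treated as equally dissimilar.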

riken-aip / pyHSICLasso

Clarification on the difference between an input vs. output kernel #36