shogun-toolbox / shogun

Shōgun
http://shogun-toolbox.org
BSD 3-Clause "New" or "Revised" License

Random features for KRR #3024

Open karlnapf opened 8 years ago

karlnapf commented 8 years ago

This entrance task is to use Shogun's random Fourier feature framework to perform approximate kernel ridge regression. Most likely this involves coding up a LinearRegression class (or using an existing one) and passing the random Fourier features into it. A demo in a notebook is also desirable.

If you have any questions, please ask, as this is a medium-scale task.

sanuj commented 8 years ago

@karlnapf I'll start working on this. CKernelRidgeRegression already exists. Shall I just create a notebook by using CRandomFourierDotFeatures or is there more to it?

karlnapf commented 8 years ago

Hi @sanuj, so CKernelRidgeRegression is the dual form of KRR, where an NxN matrix (the kernel matrix) is inverted. With random features, this becomes an MxM matrix, where M is the number of random Fourier features. Therefore, what we need to do here is linear ridge regression in the random Fourier feature space.
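To make the dual/primal distinction concrete, here is a minimal NumPy sketch (not Shogun code; N, M, and tau are illustrative values) showing that ridge regression solved via the NxN kernel matrix and via the MxM covariance of the explicit features yields identical predictions:

```python
import numpy as np

rng = np.random.RandomState(0)
N, M = 50, 10                       # N samples, M explicit features (M << N)
Z = rng.randn(N, M)                 # feature matrix, e.g. random Fourier features
y = rng.randn(N)
tau = 1e-2                          # ridge regularizer

# Dual form (what CKernelRidgeRegression does): invert an N x N matrix
K = Z @ Z.T                         # linear kernel in the explicit feature space
alpha = np.linalg.solve(K + tau * np.eye(N), y)
pred_dual = K @ alpha

# Primal form (linear ridge regression): invert an M x M matrix instead
w = np.linalg.solve(Z.T @ Z + tau * np.eye(M), Z.T @ y)
pred_primal = Z @ w

print(np.allclose(pred_dual, pred_primal))  # same predictions, smaller solve
```

The primal solve costs O(M^3) instead of O(N^3), which is the whole point of the approximation when M << N.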

If it is already possible to do this, then you can create a notebook on approximate kernel methods. But I have the feeling some changes in the code might be needed. Looking forward to the outcome.

sanuj commented 8 years ago

@karlnapf You are correct. I will have to change some C++ code. I thought I could get the features from CRandomFourierDotFeatures and pass them to CLinearRidgeRegression, but that's not possible because CLinearRidgeRegression expects CDenseFeatures. Either I change something in CLinearRidgeRegression or make a new regression class that works with CRandomFourierDotFeatures. Maybe I can make a new class called CRandomFourierRidgeRegression; that would be easier than changing CLinearRidgeRegression (assuming changing it is even possible).

I read this paper, as mentioned in the Fourier features docs. So it maps the data into a lower dimension, and then we do linear ridge regression (which is just linear regression with L2 regularization). Since the training data will be high-dimensional, I won't be able to plot the regression curve in the notebook. Shall I plot a mean squared error curve against tau and compare this with some other regression method?
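For reference, the feature map from that paper (Rahimi and Recht's random Fourier features) can be sketched in a few lines of NumPy; the bandwidth sigma and feature count D below are illustrative choices, and the inner product of two mapped points approximates the Gaussian kernel:

```python
import numpy as np

rng = np.random.RandomState(0)
d, D = 5, 20000          # input dimension, number of random features
sigma = 1.0              # Gaussian kernel bandwidth

# Random Fourier features for k(x, y) = exp(-||x - y||^2 / (2 sigma^2)):
# z(x) = sqrt(2/D) * cos(W x + b), with rows of W ~ N(0, 1/sigma^2 I)
W = rng.randn(D, d) / sigma            # random frequencies
b = rng.uniform(0, 2 * np.pi, D)       # random phases

def z(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.randn(d), rng.randn(d)
exact = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
approx = z(x) @ z(y)
print(exact, approx)     # close for large D; error shrinks like O(1/sqrt(D))
```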

karlnapf commented 8 years ago

Mmh, I mean KRR with random features is exactly linear ridge regression in a different feature space. So the transformed features should just be passed there -- better to re-use existing code. But a wrapper class might help; call it CApproximateKRR -- it is not limited to random Fourier features, as there are other ways to do finite feature space approximations. Also, looking at scikit-learn might be a good idea to see how they do it. The paper is exactly the one you should read :)
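For comparison, scikit-learn composes exactly this pipeline from a kernel approximation step plus plain linear ridge regression. A rough sketch (RBFSampler and Ridge are scikit-learn classes; the toy data and hyperparameters are made up for illustration):

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(200)   # noisy 1-D toy regression

# Map into a random Fourier feature space, then do linear ridge regression
model = make_pipeline(
    RBFSampler(gamma=0.5, n_components=300, random_state=0),
    Ridge(alpha=1e-3),
)
model.fit(X, y)
print(model.score(X, y))   # R^2 on the training data; close to 1 here
```

This mirrors the proposed CApproximateKRR design: the feature approximation is a swappable front-end, and the regression back-end stays a plain linear model.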

The examples you add should reproduce some of the results from the relevant papers.