shogun-toolbox / shogun

Shōgun
http://shogun-toolbox.org
BSD 3-Clause "New" or "Revised" License

create unit test for MKLRegression #2638

Open karlnapf opened 9 years ago

karlnapf commented 9 years ago

With sensible data that makes sure things work as expected

sanuj commented 9 years ago

@karlnapf I read the MKL notebook, but it only covers classification. I looked at the unit tests for other regression algorithms: Gaussian Process Regression, LibSVR and Least Angle Regression. The first two use a 1D noisy sine wave for training and testing, and LARS uses some raw data, I guess. How was the comparison data decided? I think the gpml toolbox was used to generate the data for GPR, and some easy.py script was used for SVR (I found easysvm.py when I searched for it; not sure if this is what you used). So I am not able to figure out how to go about this. Please suggest some reading resources or anything else that would help me write unit tests for this.

sanuj commented 9 years ago

I assume I am not supposed to use the files you added in shogun-data, because that's not how it's done in other regression unit tests and those files are supposed to be for examples.

karlnapf commented 9 years ago

Unit tests compare on super simple cases, and we try to keep them independent of shogun-data. That is why most examples are hardcoded.

For MKL regression, what you would do is to find a reference implementation (there might be none), or design a simple example that produces results that you check by hand and verify to make a lot of sense, and then use that.
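A minimal sketch of that test shape, in plain Python rather than Shogun's C++ test code: hardcoded inputs, a reference result, and a tolerance comparison. Everything here is an assumption made for illustration. The model is an ordinary least-squares line standing in for MKL regression, and `fit_line`, `predict` and `all_close` are hypothetical helpers, not Shogun API. The reference values come from working out y = 2x + 1 by hand.

```python
import math

# Hardcoded data so the test does not depend on external files.
# The points lie exactly on the line y = 2x + 1 (assumed toy problem).
X_TRAIN = [0.0, 1.0, 2.0, 3.0]
Y_TRAIN = [1.0, 3.0, 5.0, 7.0]
X_TEST = [4.0, 5.0]

# Reference predictions, derived by hand from y = 2x + 1.
REFERENCE = [9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares for a 1D line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, xs):
    """Evaluate the fitted line at each point in xs."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def all_close(predicted, reference, tol=1e-9):
    """True if both lists have equal length and agree element-wise within tol."""
    return len(predicted) == len(reference) and all(
        math.isclose(p, r, abs_tol=tol) for p, r in zip(predicted, reference)
    )


predictions = predict(fit_line(X_TRAIN, Y_TRAIN), X_TEST)
assert all_close(predictions, REFERENCE)
```

If a reference implementation exists, the same structure applies: only the origin of `REFERENCE` changes, and the tolerance may need loosening to absorb solver differences.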

sanuj commented 9 years ago

So, I should use some external implementation of MKL regression (from some other machine learning toolbox or library) and get results for some set of data. Then I can use the results obtained with the same data in our unit tests to compare. Right?

And if I am not able to find any implementation, then I should use some known function, like the 1D noisy sine wave used in other unit tests, and compare the results?

iglesias commented 9 years ago

Yes, @sanuj. In the second case (the sine wave example you mentioned), it should be something as easy as possible so that the same results given by the unit test can be obtained using pen and paper.
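To give an idea of what "pen and paper" could look like with more than one kernel, here is a rough Python sketch. It is not MKL and not Shogun code: the kernel weights are fixed at 1/2 each instead of being learned, and kernel ridge regression replaces the SVR solver, purely so that the whole thing reduces to a 2x2 linear system. The data, weights, ridge value and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

# Two training points and exact rational arithmetic, so every number
# below can be reproduced by hand.
X_TRAIN = [F(1), F(2)]
Y_TRAIN = [F(1), F(2)]
WEIGHTS = (F(1, 2), F(1, 2))  # fixed kernel weights (assumption, not learned)
RIDGE = F(1, 2)               # regularisation added to the diagonal


def linear_kernel(a, b):
    return a * b


def constant_kernel(a, b):
    return F(1)


def combined_kernel(a, b):
    """Weighted sum of the two base kernels."""
    return WEIGHTS[0] * linear_kernel(a, b) + WEIGHTS[1] * constant_kernel(a, b)


def solve_2x2(m, rhs):
    """Solve a 2x2 linear system with Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


def fit(xs, ys):
    """Kernel ridge regression: solve (K + RIDGE * I) alpha = y."""
    gram = [
        [combined_kernel(xi, xj) + (RIDGE if i == j else 0)
         for j, xj in enumerate(xs)]
        for i, xi in enumerate(xs)
    ]
    return solve_2x2(gram, ys)


def predict(alphas, xs, x_new):
    return sum(al * combined_kernel(x_new, xi) for al, xi in zip(alphas, xs))


# By hand: K + RIDGE*I = [[3/2, 3/2], [3/2, 3]], determinant 9/4,
# so alpha = [0, 2/3] and f(3) = (2/3) * k(3, 2) = (2/3) * (7/2) = 7/3.
alphas = fit(X_TRAIN, Y_TRAIN)
prediction = predict(alphas, X_TRAIN, F(3))
assert alphas == [F(0), F(2, 3)]
assert prediction == F(7, 3)
```

A real MKL regression test would also need to check the learned kernel weights, which this sketch deliberately leaves out.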

sanuj commented 9 years ago

@karlnapf In LibSVR_unittest, you used a one-dimensional quadratic function (features_train, features_test, labels_train, labels_test). I don't think you used labels_test anywhere in the code. What exactly does the easy.py script that you used to compare data do (is it the same as easysvm.py)? What was the purpose of comparing against easy.py instead of labels_test? Moreover, should I remove the 1D noisy sine wave comment in this file? A sine wave is not used in this unit test, and it confused me at first.

sanuj commented 9 years ago

http://asi.insa-rouen.fr/enseignants/~arakoto/code/mklindex.html I was planning to use this but it doesn't seem to work.

sanuj commented 9 years ago

SimpleMKL (the previous one) is a Matlab Toolbox. I will try SMO-MKL now. http://research.microsoft.com/en-us/um/people/manik/code/smo-mkl/download.html

iglesias commented 9 years ago

Hey, @sanuj. I just had a look at the unit test and you're definitely right; labels_test is not used. If I am not missing anything, the lines filling in the lab_test vector and creating the features from it can be removed. This test was probably created from another example where the test labels were used.

@karlnapf will have to tell you about the easy script, I don't know about that one. As I understand it, it has nothing to do with easysvm but the results used in the test are the output of LibSVM.