yinlou / mltk

Machine Learning Tool Kit
BSD 3-Clause "New" or "Revised" License

Residuals not saved when building GAM #4

Closed sds-dubois closed 8 years ago

sds-dubois commented 8 years ago

On the Intelligible Models webpage, you say that we should pass the option -R cal_housing.residual when building the GAM model (step 3), so that the residuals can be used to detect the interactions in step 4.

However, this version of the code does not handle that option and does not save the residuals. Can you confirm that the residuals you mention are those stored in rTrain in your code? I've fixed it and can submit a pull request if you want.

By the way, the -T option is not handled either, so the score on the test set is not computed.

Thanks a lot for sharing your work!

yinlou commented 8 years ago

That webpage is outdated and should be updated (but after graduating I lost access to edit it). To get residuals, you can use the following command: java mltk.predictor.evaluation.Predictor -R cal_housing.residual -r cal_housing_binned.attr -d cal_housing_binned.data

Take a look at the source code at: https://github.com/yinlou/mltk/blob/master/src/mltk/predictor/evaluation/Predictor.java
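For readers following along, here is a minimal sketch of what a residual file conceptually holds for a regression task. It assumes the usual definition, residual = target minus prediction; the class and method names (ResidualSketch, residuals) are hypothetical and not part of MLTK, so check Predictor.java for what the toolkit actually writes.

```java
import java.util.Arrays;

public class ResidualSketch {

    // Hypothetical helper, NOT MLTK code.
    // Assumes the standard regression definition: residual_i = target_i - prediction_i.
    public static double[] residuals(double[] targets, double[] predictions) {
        if (targets.length != predictions.length) {
            throw new IllegalArgumentException("targets and predictions must have the same length");
        }
        double[] r = new double[targets.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = targets[i] - predictions[i];
        }
        return r;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        double[] targets = {3.0, 1.5, 2.0};
        double[] predictions = {2.5, 1.5, 3.0};
        System.out.println(Arrays.toString(residuals(targets, predictions)));
        // prints [0.5, 0.0, -1.0]
    }
}
```

Under this assumption, the interaction detection step would then be fit against these per-instance residuals rather than the original targets.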

sds-dubois commented 8 years ago

Thanks! That's really helpful. Is there any other documentation (updated or not)?

yinlou commented 8 years ago

I'm planning to set up another page for documentation, but I haven't had a chance to do so yet :)

sds-dubois commented 8 years ago

OK, looking forward to seeing that!

I was also wondering why you chose to write it in Java? In particular, regarding speed, wouldn't it be faster in Python (with Cython)? (And many of the other models you implemented are already in sklearn, right?)