JakeColtman / bartpy

Bayesian Additive Regression Trees For Python
https://jakecoltman.github.io/bartpy/
MIT License
219 stars 44 forks source link

fit_predict missing parameter on the call to predict #25

Closed mattharrison closed 5 years ago

mattharrison commented 5 years ago

Shouldn't it be:

self.predict(X)

https://github.com/JakeColtman/bartpy/blob/a1a799b15e5d3c5b115c0eca7efd6f5cc4b14280/bartpy/sklearnmodel.py#L134

JakeColtman commented 5 years ago

I agree that it's a little unintuitive, but I think they net into the same thing. If no covariate matrix is passed to SklearnModel.predict, it defaults to returning the predictions of the original training set, i.e. X in this case.

The gain of this for BartPy is that it's much faster to calculate the prediction using the already computed samples than it is to generate out of sample estimates for all the trees in all of the samples.

I'll confirm this is the case later on though and at least add a comment to the line to make what's happening clear.

P.S. Thanks for raising an issue! It's a nice, if slightly odd, feeling to know people are looking at the code :)

mattharrison commented 5 years ago

Thanks for the great explanation. Looking forward to playing around with this library!

JakeColtman commented 5 years ago

Great! The library as a whole is pretty optimized for my own use cases, because I didn't really expect anyone else to stumble upon it, so please let me know if you have feedback / criticism. Likewise, if you're thinking of applying it to a problem, I'd be happy to help as a real world test case :)