kapelner / bartMachine

An R-Java Bayesian Additive Regression Trees implementation
MIT License

interaction constraints #25

Closed noamross closed 2 years ago

noamross commented 6 years ago

I was wondering if it would be possible to add interaction constraints to bartMachine. These have recently been added to xgboost (https://xgboost.readthedocs.io/en/latest/tutorials/feature_interaction_constraint.html). Having interaction constraints would be very handy for fitting models where some variables are held out from the whole response surface, e.g., y ~ f(x1) + f(x2, x3, x4, x5, ...).

kapelner commented 6 years ago

Hey Noam,

I think the easiest way to do this is to fit two separate BART models: (1) one on x_1 alone and (2) one on x_2, x_3, ... To fit the first, regress y on x_1 using bartMachine or bartMachineCV. To fit the second, first compute y' := y - yhat_1, where yhat_1 are the predicted values from the first model. Then use bartMachine or bartMachineCV to fit y' on x_2, x_3, ...
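For illustration only (this is not code posted in the thread), the two-stage residual fit described above might look roughly like this in R. The simulated data, the column names x1 to x3, and all variable names are assumptions made for the sketch, and it assumes bartMachine and a working Java setup are installed. Note that kapelner reports later in this thread that this iterative idea did not work.

```r
library(bartMachine)

# Simulated data (an assumption for this sketch): y ~ f(x1) + f(x2, x3)
set.seed(1)
n <- 200
X <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
y <- sin(2 * pi * X$x1) + X$x2 * X$x3 + rnorm(n, sd = 0.1)

# Stage 1: fit y on x1 alone (drop = FALSE keeps X1 a data frame)
X1 <- X[, "x1", drop = FALSE]
bm1 <- bartMachine(X1, y)
yhat_1 <- predict(bm1, X1)

# Stage 2: fit the residuals y' = y - yhat_1 on the remaining features
y_prime <- y - yhat_1
X2 <- X[, c("x2", "x3"), drop = FALSE]
bm2 <- bartMachine(X2, y_prime)

# Combined in-sample prediction: sum of the two stages
yhat <- yhat_1 + predict(bm2, X2)
```

The sum of the two stage predictions plays the role of f(x1) + f(x2, x3). Since stage 1 is fit without knowledge of the other features, it may absorb signal that belongs to stage 2 when the features are correlated.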

If you want, I can write a simple wrapper for this.

Adam

noamross commented 6 years ago

Thanks! No need, I'll give this a try.

realkrantz commented 5 years ago

Hi @kapelner. Would it be possible to share this wrapper? Thanks for this very helpful package.

kapelner commented 2 years ago

It turns out the iterative GAM idea I had didn't work, so I enforced the constraints in the individual trees instead. There is now an interaction_constraints argument passed to the model constructor. You pass in a list of vectors, where each vector is a set of features that are allowed to interact with one another. The elements of each vector refer to features in the data frame X, specified either by column number as a numeric value or by column name as a string, e.g. list(c(1, 2), c("nox", "rm")). A feature can appear in more than one vector, which allows any level of interaction complexity you wish. The default is NULL, which corresponds to the vanilla modeling procedure where all interactions are legal. For a pure generalized additive model, use as.list(seq(1 : p)), where p is the number of columns in the design matrix X.
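As a hedged sketch of how this argument might be used for the model from the original question, y ~ f(x1) + f(x2, x3): the simulated data and variable names below are assumptions for illustration, and the sketch assumes bartMachine v1.2.7 or later with a working Java setup. Only the interaction_constraints argument and the list-of-vectors format come from the description above.

```r
library(bartMachine)

# Simulated data (an assumption for this sketch): y ~ f(x1) + f(x2, x3)
set.seed(1)
n <- 200
X <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
y <- sin(2 * pi * X$x1) + X$x2 * X$x3 + rnorm(n, sd = 0.1)
p <- ncol(X)

# x1 stands alone; x2 and x3 may interact with each other.
# Features are given by column number here; per the description above,
# column names such as c("x2", "x3") should also be accepted.
constraints <- list(1, c(2, 3))
bm <- bartMachine(X, y, interaction_constraints = constraints)

# Pure additive (GAM-like) model: every feature in its own group.
# as.list(seq_len(p)) yields the same list as as.list(seq(1 : p)).
gam_constraints <- as.list(seq_len(p))
bm_gam <- bartMachine(X, y, interaction_constraints = gam_constraints)

yhat <- predict(bm, X)
```

Leaving interaction_constraints at its default of NULL fits the unconstrained model, which gives a baseline for comparing the constrained fits.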

I released this as v1.2.7 and it's submitted to CRAN as well.