Open snowde opened 6 years ago
Thanks Stefan,
I have seen some development here, but I haven't tested it out. If you feel the need to comment about it please do.
https://github.com/kjung/scikit-learn (a version of scikit-learn that includes implementations of Wager & Athey and Scott Powers causal forests)
Regards, Derek
From: Stefan Wager notifications@github.com Sent: 01 July 2018 07:09 To: swager/grf Cc: snowde; Author Subject: Re: [swager/grf] Any python implementations ? (#257)
We don't have one currently available. We hope to have one eventually, but our first priority is getting the R implementation to 1.0.
If anyone is interested in contributing a Python implementation, please reach out to us.
Interesting -- it looks like that package uses the "old" causal forest. The grf package has a substantially different implementation of causal forests than we had in the first JASA paper on this (both statistically and computationally). See section 6.2 of https://arxiv.org/pdf/1610.01271.pdf for a discussion + simulation examples.
The upshot is that any Python implementation should just be an interface to the C++ core of grf. Hopefully this will take a reasonable amount of effort, since most of the complexity of this package is in the C++, not in the R.
Even if a Python implementation doesn't exist, you can still use grf in Python via the mostly excellent rpy2 package.
# Calling R
from rpy2.robjects.packages import importr
from rpy2.robjects import numpy2ri
numpy2ri.activate()
grf = importr("grf")
# Fitting a regression forest
grf.regression_forest(X=X, Y=Y) # X, Y are numpy arrays; a treatment vector W is only used by grf.causal_forest(X=X, Y=Y, W=W)
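Since the thread is mostly about causal forests, here is a minimal sketch of the same rpy2 route for `causal_forest`. It assumes R, the grf package, rpy2 3.x and numpy are installed. The names `fit_causal_forest`, `demo` and the small inline R helper are made up for illustration and are not part of grf or rpy2. It uses a scoped `localconverter` rather than the global `numpy2ri.activate()` shown above; that is just one way to do the conversion, and I have not checked it against every rpy2 version.

```python
def fit_causal_forest(X, Y, W, num_trees=2000):
    """Fit grf::causal_forest in R and return out-of-bag effect estimates."""
    # Lazy imports: rpy2 needs a working R installation with grf available.
    import numpy as np
    import rpy2.robjects as ro
    from rpy2.robjects import numpy2ri
    from rpy2.robjects.conversion import localconverter

    # Hypothetical R helper (not part of grf): fit the forest, then pull
    # the "predictions" column out of the data frame returned by predict().
    r_fit = ro.r(
        """
        function(X, Y, W, num_trees) {
          forest <- grf::causal_forest(X, Y, W, num.trees = num_trees)
          predict(forest)$predictions
        }
        """
    )

    # Convert numpy arrays to R objects (and back) only inside this block.
    with localconverter(ro.default_converter + numpy2ri.converter):
        tau_hat = r_fit(X, Y, W, num_trees)
    return np.asarray(tau_hat)


def demo():
    """Simulated data, purely illustrative: the effect is max(X[:, 0], 0)."""
    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 500, 5
    X = rng.normal(size=(n, p))
    W = rng.binomial(1, 0.5, size=n).astype(float)  # binary treatment
    tau = np.maximum(X[:, 0], 0.0)
    Y = tau * W + rng.normal(size=n)

    tau_hat = fit_causal_forest(X, Y, W, num_trees=500)
    print(tau_hat.shape)
```

Calling `demo()` fits a small forest on simulated data and prints the shape of the estimate vector. Keeping the fit-and-predict step inside one R function means only plain matrices and vectors cross the Python/R boundary, which avoids having to convert the forest object itself.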
For reference here is an old Cython wrapping of some basic functionality in grf v 0.10.2, but as @halflearned mentions it is easier to just use rpy2.
I've been working on one here: https://github.com/crflynn/skgrf
It's a bit hastily done and might require some review, particularly on the boosted regressor, but it provides sklearn python classes for all the estimators available in 1.2.0 except for the custom forest.