Open YuelingMa0 opened 1 month ago
You can use the predict_per_tree function from the Forest Inference Library (FIL). Note that this feature is only available in the experimental version of FIL.
from cuml.experimental import ForestInference
# ...
fm = ForestInference.load_from_sklearn(skl_model)
pred_per_tree = fm.predict_per_tree(X) # Returns array of size (num_row, num_tree, leaf_size)
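As a rough illustration of working with an array of that shape, here is a minimal NumPy sketch. It does not call FIL: the array below is a random placeholder standing in for the output of predict_per_tree, and the sizes, the assumption that leaf_size is 1 for scalar regression, and the names tree_outputs, ensemble_mean and tree_spread are all made up for the example.

```python
import numpy as np

# Hypothetical sizes; a real run would take these from the model and the data.
num_row, num_tree, leaf_size = 5, 3, 1

# Placeholder standing in for fm.predict_per_tree(X); NOT produced by FIL.
rng = np.random.default_rng(0)
pred_per_tree = rng.normal(size=(num_row, num_tree, leaf_size))

# Assuming scalar regression (leaf_size == 1), drop the trailing axis
# to get one output per (row, tree) pair.
tree_outputs = pred_per_tree[:, :, 0]        # shape (num_row, num_tree)

# Averaging over the tree axis gives one value per row; the spread across
# trees is a simple per-row measure of disagreement between trees.
ensemble_mean = tree_outputs.mean(axis=1)    # shape (num_row,)
tree_spread = tree_outputs.std(axis=1)       # shape (num_row,)
```

Whether the mean over trees matches the forest's own prediction exactly depends on how the model aggregates its trees, so treat that as something to verify on your own model.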
Thank you!
I got the error "Negative size passed to PyBytes_FromStringAndSize" when I loaded the sklearn model. I am also curious whether the "predict_per_tree" attribute works for a model trained by cuML.
> "Negative size passed to PyBytes_FromStringAndSize" when I loaded the sklearn model.
Can you share the model with us so that we can troubleshoot?
> I am also curious whether the "predict_per_tree" attribute works for a model trained by cuML.
Yes, it should work with a cuML model.
Here are my random forest models, one trained using sklearn and the other trained using cuML. I converted the cuML-trained random forest model to ForestInference and tried to call "predict_per_tree" on it, but got the error "AttributeError: predict_per_tree". I am using version 24.10.00.
Is your feature request related to a problem? Please describe. I wish I could use cuml to obtain individual tree results of a random forest. However, this function is not supported in the current cuml package. Using the random forest regression function in the current cuml package, I can only obtain the average of tree results.
Describe the solution you'd like An attribute in the existing random forest regression function to provide results from each tree.
Describe alternatives you've considered I have been using the "estimators_" attribute of scikit-learn's RandomForestRegressor to obtain individual tree outputs, but scikit-learn only runs on CPUs.
Additional context https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
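For reference, the CPU-only alternative described above can be sketched as follows. This is a minimal example on synthetic data; the dataset, the hyperparameters and the variable name per_tree are illustrative assumptions, not taken from the models attached to this issue.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, used only to make the example self-contained.
X, y = make_regression(n_samples=200, n_features=8, random_state=0)

skl_model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# estimators_ holds the fitted decision trees; stack their predictions
# into an array of shape (num_row, num_tree).
per_tree = np.stack([tree.predict(X) for tree in skl_model.estimators_], axis=1)

# The forest's regression prediction is the mean over its trees.
forest_pred = skl_model.predict(X)
matches = np.allclose(per_tree.mean(axis=1), forest_pred)
```

The per_tree array here plays the same role as the per-tree output requested for cuML, computed on the CPU.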