Closed dalekube closed 3 years ago
I believe sample fraction is only used on fitting, so we can just pass sample_fraction=[1]
on predict. I'll take a look at this.
I believe sample fraction is only used on fitting, so we can just pass
sample_fraction=[1]
on predict. I'll take a look at this.
I do not specify the sample_fraction in the predict call. It looks like predict() inherits the parameter from fit() even though it's not relevant for predictions.
Right, I should have been more specific. The entrypoint into the ranger bindings is the same for fit/predict, and right now we pass the specified sample_fraction
for predicting too: https://github.com/crflynn/skranger/blob/d409bd8fff9260fe5a30f9837965b5fd05418250/skranger/ensemble/ranger_forest_regressor.py#L357
When creating the forest the C++ does a check thats failing: https://github.com/imbs-hl/ranger/blob/e8b05f47892bb4968c4e6057f68b35bcd0b52225/src/Forest.cpp#L256, even on predict when it doesn't have to.
Problem solved with the 0.3.2 release. Confirmed in my application that predict()
ignores the sample_fraction parameter according to the new design. Thanks!
The use of 'sample_fraction' in the fitting of the RangerForestRegressor causes the error
terminate called after throwing an instance of 'std::runtime_error' what(): sample_fraction too small, no observations sampled.
when the trained model is used for a prediction,rfr.predict()
for only one observation.The error does not occur when more than one observation is inputted into
rfr.predict()
. Also, this error does not occur whensample_fraction=[1]
and making a prediction with one observation.