neurodata / SPORF

This is the implementation of Sparse Projection Oblique Randomer Forest
https://neurodata.io/forests/

predict_proba and predict_log_proba return NaNs #316

Closed (dmltv closed this issue 5 years ago)

dmltv commented 5 years ago

The predict function returns the predicted labels, but predict_proba and predict_log_proba do not return the expected probabilities / log-probabilities. These should be very easy to compute, and it would be very convenient to have them exposed in the Python interface.

rerf.predict_proba() returns all zeros,
and rerf.predict_log_proba() returns all NaNs.

I would expect rerf.predict_proba() to return, for each class, the frequency of that class among the training examples in the leaf the input falls into, averaged across all trees in the forest, and predict_log_proba to return the log of that.
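To make the expectation above concrete, here is a minimal sketch in plain Python. It is not the SPORF implementation: the function names and the `leaf_counts_per_tree` input are hypothetical, and it assumes each tree exposes the training-class counts of the leaf a sample lands in.

```python
import math


def forest_predict_proba(leaf_counts_per_tree):
    """Average per-tree leaf class frequencies for a single sample.

    leaf_counts_per_tree is a hypothetical input: entry t holds the
    training-class counts in the leaf that tree t routes the sample to.
    """
    n_trees = len(leaf_counts_per_tree)
    n_classes = len(leaf_counts_per_tree[0])
    proba = [0.0] * n_classes
    for counts in leaf_counts_per_tree:
        total = sum(counts)
        for k, count in enumerate(counts):
            # Class frequency within this tree's leaf.
            proba[k] += count / total
    # Average the frequencies over all trees.
    return [p / n_trees for p in proba]


def forest_predict_log_proba(leaf_counts_per_tree):
    """Element-wise log of the averaged probabilities (-inf where p == 0)."""
    return [
        math.log(p) if p > 0 else float("-inf")
        for p in forest_predict_proba(leaf_counts_per_tree)
    ]


# Three trees, two classes: counts of (class 0, class 1) in each leaf.
leaves = [[3, 1], [2, 2], [4, 0]]
proba = forest_predict_proba(leaves)
log_proba = forest_predict_log_proba(leaves)
print(proba, log_proba)
```

With these made-up counts, the per-tree frequencies of class 0 are 0.75, 0.5 and 1.0, so the averaged probability is 0.75 for class 0 and 0.25 for class 1.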

I downloaded the latest version of the package a few days ago.
Thank you!

falkben commented 5 years ago

Thanks for letting us know.

I'm wondering if there's some namespace confusion going on. I'm assuming your classifier is named rerf. But that's the same name as the module. The convention I've seen is to name your classifier clf. When I call clf.predict_proba I get a numpy array back with the probability of each class.

Can you share a full example of this bug?

Thanks!

dmltv commented 5 years ago

Thank you, it's good to know the functionality is there. I believe the error may be due to an incorrect installation: I had to install a different version of Eigen and make some changes to get it to run. The rerf.predict_proba call in my report was a typo; I used the official iris example with clf.predict_proba. Thanks again!
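For anyone who suspects a similar installation problem, one quick check is whether the predict_proba output looks like probabilities at all. The sketch below is a hypothetical helper, not part of the rerf package; it takes plain nested lists, so a NumPy result would first need converting, for example with `clf.predict_proba(X).tolist()`.

```python
import math


def is_valid_proba(rows, tol=1e-6):
    """Return True if every row is NaN-free, within [0, 1], and sums to 1."""
    for row in rows:
        if any(math.isnan(p) or p < 0.0 or p > 1.0 for p in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
    return True


# Made-up outputs illustrating a healthy result and the two symptoms reported here.
good = [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3]]
all_zero = [[0.0, 0.0, 0.0]]
with_nan = [[float("nan"), 0.5, 0.5]]

print(is_valid_proba(good), is_valid_proba(all_zero), is_valid_proba(with_nan))
```

An all-zero row fails because it does not sum to 1, and any NaN fails immediately, which matches the two symptoms described in this issue.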