chriswbartley / monoensemble

High Performance Monotone Boosting and Random Forest Classification
http://monoensemble.readthedocs.io/en/latest/index.html

Further corrections to work with current sklearn #5

Closed vasselai closed 3 years ago

vasselai commented 3 years ago

Further corrections are necessary to make 'monoensemble' work with current sklearn. The main ones that need your attention are:

(1) The "presort" and "X_idx_sorted" sklearn parameters have been deprecated. See, respectively: https://github.com/scikit-learn/scikit-learn/pull/14907 and https://github.com/scikit-learn/scikit-learn/issues/16818. Since I don't know exactly how you would prefer to handle this in light of the suggestions in the first link, to at least leave 'monoensemble' in a working state the only thing I did was comment out "presort=self.presort" on line 1540 of 'mono_gradient_boosting.py'. A more definitive solution will be necessary, though: right now a FutureWarning is issued on every iteration because of the "X_idx_sorted" deprecation, which, besides being annoying, means the code will break again soon if "X_idx_sorted" is not removed from the code base.
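For reference, on current sklearn the tree construction and fit call simply drop both arguments. This is an illustrative sketch against the public `DecisionTreeRegressor` API, not monoensemble's actual internal code:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = X[:, 0] + 0.1 * rng.rand(100)

# Current sklearn: no `presort` keyword in the constructor, and
# fit() no longer accepts `X_idx_sorted` -- both were removed.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

pred = tree.predict(X)
print(pred.shape)
```

Per the discussion in the first link above, presorting was dropped because the splitter no longer benefits from it, so there is nothing to replace these arguments with; the calls can just be deleted.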

(2) On line 436 of 'mono_forest.py', "_generate_unsampled_indices" throws an error because that function now takes an extra parameter, 'n_samples_bootstrap': https://github.com/scikit-learn/scikit-learn/blob/4b8cd880397f279200b8faf9c75df13801cb45b7/sklearn/ensemble/_forest.py#L123 I obviously also do not know your preference here, but given the implementation in that link, it seems safe to assume that thus far your code was operating with the equivalent of 'n_samples_bootstrap = 1'. So that is what I imposed for now on line 436 of 'mono_forest.py'.
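To illustrate the new signature: sklearn's forests now derive 'n_samples_bootstrap' from 'max_samples' via a private helper and pass it through to "_generate_unsampled_indices". Note these are private sklearn helpers, so this sketch may itself break in future releases; it only shows the current calling convention:

```python
import numpy as np
# Private sklearn internals -- subject to change between releases.
from sklearn.ensemble._forest import (
    _generate_unsampled_indices,
    _get_n_samples_bootstrap,
)

n_samples = 20

# How sklearn's own forests compute the extra argument:
# max_samples=None means "bootstrap samples are the full size n_samples".
n_samples_bootstrap = _get_n_samples_bootstrap(n_samples, max_samples=None)

# The out-of-bag (unsampled) indices for one tree's bootstrap draw.
unsampled = _generate_unsampled_indices(
    random_state=0,
    n_samples=n_samples,
    n_samples_bootstrap=n_samples_bootstrap,
)
print(unsampled)
```

So an alternative to hard-coding a value on line 436 would be to mirror this derivation, keeping monoensemble's behavior aligned with whatever sklearn's forests do.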

chriswbartley commented 3 years ago

Thanks Fabricio - I have just done a push to address all these deprecation issues and warnings: