Hellinger Distance criterion for sklearn Random Forest and Decision Tree classifiers
I'm working on adding this to scikit-learn-contrib/imbalanced-learn (PR #437).
You will need a Cython "header" file (.pxd) from scikit-learn.
If you installed scikit-learn from a source package, you already have it.
If you installed it with pip install scikit-learn, you need to get the file separately from the scikit-learn source.
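To check whether an existing installation already includes Cython header files, a small script along these lines may help. It is only a sketch: the helper names `find_pxd_files` and `installed_package_dir` are hypothetical, and which header the extension actually needs is not something the script can tell you.

```python
import importlib.util
from pathlib import Path


def find_pxd_files(package_dir):
    """Return the sorted, POSIX-style relative paths of all .pxd files below package_dir."""
    root = Path(package_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.pxd"))


def installed_package_dir(name="sklearn"):
    """Return the install directory of a package, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    pkg_dir = installed_package_dir("sklearn")
    if pkg_dir is None:
        print("scikit-learn does not appear to be installed")
    else:
        headers = find_pxd_files(pkg_dir)
        print(f"{len(headers)} .pxd file(s) found under {pkg_dir}")
        for header in headers:
            print("  ", header)
```

If the list is empty, the installation ships no .pxd files and the header has to come from the scikit-learn source, as described above.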
Then build the extension in place:
python setup.py build_ext --inplace
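>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.model_selection import train_test_split
>>> from hellinger_distance_criterion import HellingerDistanceCriterion
>>>
>>> # Illustrative imbalanced binary dataset (an assumption; substitute your own data)
>>> X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
>>>
>>> # Arguments: number of outputs (1) and number of classes per output ([2], i.e. binary)
>>> hdc = HellingerDistanceCriterion(1, np.array([2], dtype='int64'))
>>> clf = RandomForestClassifier(criterion=hdc, max_depth=4, n_estimators=100)
>>> clf.fit(X_train, y_train)
>>> # clf.score returns mean accuracy, not a Hellinger distance
>>> print('accuracy with Hellinger distance criterion: ', clf.score(X_test, y_test))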
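For intuition about what the criterion measures, the Hellinger distance of a single binary split can be sketched in plain Python. This is an illustration only, not the Cython code in this repository. The function name `hellinger_split_value` is hypothetical, and the sketch assumes binary labels coded as 0 and 1. It compares, branch by branch, the fraction of all positives and the fraction of all negatives that land in that branch.

```python
import math


def hellinger_split_value(labels, goes_left):
    """Hellinger distance between the per-class branch distributions of a binary split.

    labels    -- sequence of 0/1 class labels
    goes_left -- sequence of booleans, True if the sample goes to the left branch
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        # With only one class present there is nothing to separate.
        return 0.0

    total = 0.0
    for branch in (True, False):  # left branch, then right branch
        pos_in = sum(1 for y, left in zip(labels, goes_left) if y == 1 and left == branch)
        neg_in = sum(1 for y, left in zip(labels, goes_left) if y == 0 and left == branch)
        # Class-conditional fractions, so the class sizes themselves cancel out.
        total += (math.sqrt(pos_in / n_pos) - math.sqrt(neg_in / n_neg)) ** 2
    return math.sqrt(total)


# A split that isolates the minority class perfectly reaches the maximum, sqrt(2).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
perfect = [True, True] + [False] * 8
print(hellinger_split_value(labels, perfect))
```

Because each class is normalised by its own size, the value does not depend on the class ratio, which is why this measure is commonly suggested for imbalanced data.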