Isn't it misleading to say that BorutaPy is "exactly the same ..." as the original Boruta algorithm implemented in R? BorutaPy uses the estimator's native feature importance scores, which are derived from Gini impurity, whereas the original R version used, and still uses, the "mean decrease in accuracy" (MDA) method to approximate feature importance.
Gini impurity feature importance scores are known to be biased towards high-cardinality features, and the results of the Gini impurity and MDA approaches can differ vastly.
Either this should be stated more clearly in the documentation, or the necessary methods should be added to BorutaPy.
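
To illustrate the cardinality bias, here is a minimal sketch using scikit-learn only. It is not BorutaPy's or R Boruta's code: the synthetic data, the feature names, and the use of `sklearn.inspection.permutation_importance` on a held-out split as a stand-in for MDA are all my assumptions (the R implementation computes its permutation importance differently, so this is only similar in spirit).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
n_samples = 1000

# One informative binary feature; the label is a copy of it with 20% of entries flipped.
signal = rng.randint(0, 2, size=n_samples)
flip = rng.rand(n_samples) < 0.2
y = np.where(flip, 1 - signal, signal)

# Two pure-noise features: one continuous (high cardinality), one binary (low cardinality).
noise_continuous = rng.rand(n_samples)
noise_binary = rng.randint(0, 2, size=n_samples)

X = np.column_stack([signal, noise_continuous, noise_binary])
names = ["signal", "noise_continuous", "noise_binary"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Impurity-based (Gini) importances, computed from the training data.
gini_imp = dict(zip(names, forest.feature_importances_))

# Permutation importances (mean drop in accuracy) on the held-out data.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=10, random_state=0
)
perm_imp = dict(zip(names, result.importances_mean))

for name in names:
    print(f"{name:18s} gini={gini_imp[name]:.3f} permutation={perm_imp[name]:.3f}")
```

In a setup like this, the continuous noise feature typically receives a much larger Gini importance than the binary noise feature, although neither carries any information about the label, while the permutation importance ranks the informative feature clearly above the noise. A selection procedure built on one score can therefore behave quite differently from one built on the other.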