A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
MIT License
Integer encoding for categorical variables in random forests in R #22
This quote stuck out to me:
Did you try integer-encoding the categorical variables? It looks like you did that for Python, so it may be worth trying with R as well.
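To make the suggestion concrete, here is a minimal sketch of what integer encoding means: each category level is mapped to an integer code, so the column stays a single feature instead of expanding into one column per level. The helper name `integer_encode` and the sample values are hypothetical and not taken from the benchmark code. In R, `as.integer(factor(x))` would produce comparable codes.

```python
def integer_encode(values):
    """Map each distinct category to an integer code.

    Levels are sorted so the mapping is deterministic. The resulting
    order is arbitrary with respect to the target, which is the usual
    caveat when feeding such codes to a tree-based model.
    """
    levels = sorted(set(values))
    codes = {level: i for i, level in enumerate(levels)}
    return [codes[v] for v in values], levels


# Hypothetical categorical column (not from the benchmark data)
carriers = ["AA", "UA", "DL", "AA", "DL"]
encoded, levels = integer_encode(carriers)

print(encoded)  # [0, 2, 1, 0, 1]
print(levels)   # ['AA', 'DL', 'UA']
```

The same `levels` list would need to be reused when encoding the test set so that train and test codes agree.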