szilard / benchm-ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
MIT License

Spark Random forest accuracy --spam? #49

Closed am9090 closed 7 years ago

am9090 commented 7 years ago

Hi guys,

I was running a random forest using Spark from R.

Can anyone tell me how to get the accuracy?

I would have computed the usual R-squared, but some rows are dropped when the random forest runs,

so to get R-squared I need the same rows in the original data and in the predicted data.
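
To illustrate the row-mismatch problem described above, here is a minimal sketch in plain Python (not Spark or R). It assumes each row carries a unique id that survives prediction, so the actual and predicted values can be matched on that id before scoring. The function names and the 0.5 threshold are hypothetical, chosen only for illustration.

```python
def align_rows(actual_by_id, predicted_by_id):
    """Keep only the rows that have both an actual and a predicted value."""
    shared_ids = sorted(set(actual_by_id) & set(predicted_by_id))
    actual = [actual_by_id[i] for i in shared_ids]
    predicted = [predicted_by_id[i] for i in shared_ids]
    return actual, predicted


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


def accuracy(actual, predicted, threshold=0.5):
    """Share of rows whose thresholded prediction matches the 0/1 label."""
    hits = sum((p >= threshold) == (a == 1) for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical data: the row with id 3 was dropped during prediction.
actual_by_id = {1: 1, 2: 0, 3: 1, 4: 0}
predicted_by_id = {1: 0.9, 2: 0.2, 4: 0.4}

actual, predicted = align_rows(actual_by_id, predicted_by_id)
print(len(actual), r_squared(actual, predicted), accuracy(actual, predicted))
```

Matching on an id, rather than on row position, keeps the two vectors the same length even when the model silently skips rows (for example, rows with missing values).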