tobigithub closed this issue 7 years ago
mean(phat$Accuracy)
does not give you AUC https://www.kaggle.com/wiki/AreaUnderCurve
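To make the difference concrete, here is a minimal base-R sketch. It is not the code from this thread: `auc_rank` is a hypothetical helper, and the labels and scores are made-up toy values. It computes AUC with the rank-sum (Mann-Whitney) formulation and compares it with plain accuracy at a 0.5 cutoff.

```r
# Hypothetical helper (not from the original code): AUC via the
# rank-sum (Mann-Whitney) formulation, using base R only.
auc_rank <- function(labels, scores) {
  stopifnot(length(labels) == length(scores))
  pos <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # ties receive average ranks by default
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, assumed for illustration only.
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.6, 0.35, 0.8)

auc <- auc_rank(labels, scores)               # 0.75
acc <- mean((scores > 0.5) == (labels == 1))  # 0.5

print(c(AUC = auc, Accuracy = acc))
```

AUC depends only on how the scores rank positives against negatives, while accuracy depends on a chosen threshold, so the two numbers generally differ. With H2O itself, the value can usually be read from a performance object with `h2o.auc(h2o.performance(model, newdata = test))`, where `model` and `test` stand for your own fitted model and hold-out frame.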
Hi, I see, thank you: "Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime." I corrected the code above. Tobias
OK, cool.
Hi, this is the corrected R code for H2O cluster version 3.8.3.3 and R 3.3.1. The old code would not run under these versions. The final AUC with sample_rate = 1.0 for 1 million records is 0.77, which tops the old results.
For 10M records the AUC is 0.7922 on a quad-core CPU @ 4 GHz in 1676.02 seconds (more accurate and also 2x faster than the current report with a 32-thread machine, and using only 2 GB of RAM).
This needs code validation.