h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Seeding runif on identical frames with different chunk distributions produces different results. #14580

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

import h2o

# Load the same file two ways; the two frames hold identical data
# but may be split into chunks differently.
uploaded_frame = h2o.upload_file(h2o.locate("bigdata/laptop/mnist/train.csv.gz"))
r_u = uploaded_frame[0].runif(1234)

imported_frame = h2o.import_frame(h2o.locate("bigdata/laptop/mnist/train.csv.gz"))
r_i = imported_frame[0].runif(1234)

# Same seed, same data, yet the means of the two random columns differ.
print("This demonstrates that seeding runif on identical frames with different chunk distributions provides different results. upload_file: {0}, import_frame: {1}.".format(r_u.mean(), r_i.mean()))
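For readers who want to see how a chunk layout could leak into a seeded random column, here is a minimal toy sketch in plain Python. It does not use H2O and is not H2O's implementation: the functions `chunked_runif` and `row_keyed_runif`, the per-chunk seed derivation (`seed + start`), and the chunk sizes are all hypothetical, chosen only to illustrate the kind of behavior reported above.

```python
import random


def chunked_runif(n_rows, chunk_sizes, seed):
    """Toy model (assumption): one RNG per chunk, seeded from the chunk's start row."""
    assert sum(chunk_sizes) == n_rows
    values = []
    start = 0
    for size in chunk_sizes:
        rng = random.Random(seed + start)  # hypothetical per-chunk seed derivation
        values.extend(rng.random() for _ in range(size))
        start += size
    return values


def row_keyed_runif(n_rows, chunk_sizes, seed):
    """Layout-independent variant: each value depends only on (seed, global row index)."""
    assert sum(chunk_sizes) == n_rows
    values = []
    start = 0
    for size in chunk_sizes:
        for row in range(start, start + size):
            values.append(random.Random(seed * 1_000_003 + row).random())
        start += size
    return values


# Two chunk layouts for the same 12 rows (sizes are illustrative only).
layout_a = [4, 4, 4]
layout_b = [6, 6]

a = chunked_runif(12, layout_a, 1234)
b = chunked_runif(12, layout_b, 1234)
same_chunked = a == b
same_row_keyed = (
    row_keyed_runif(12, layout_a, 1234) == row_keyed_runif(12, layout_b, 1234)
)
print("per-chunk seeding matches across layouts:", same_chunked)
print("row-keyed seeding matches across layouts:", same_row_keyed)
```

In this toy model the two layouts agree only on the rows of the first shared chunk prefix and diverge afterwards, while the row-keyed variant is identical for any layout by construction. Whether H2O's `runif` derives its randomness this way is not confirmed by this issue; the sketch only shows why a per-chunk scheme would produce the symptom described.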

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-1610
Assignee: Spencer Butt
Reporter: Eric Eckstrand
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A