h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Refactor and simplify implementation of Pearson Correlation #10152

Closed. exalate-issue-sync[bot] closed this issue 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Modified the calculation of the Pearson correlation coefficient. Removed CorTaskCompleteObs; CorTask is now the main worker for calculating the correlation. Removed CorTaskCompleteObsMean; CorMean is now used to get the mean across Vecs (columns). Added another runit for h2o.cor(). Results match the base R implementation. The new implementation should be simpler while achieving the same results. The previous implementation had issues when the data spanned multiple chunks; the new implementation no longer has this problem.
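To illustrate why a mean pass followed by a correlation pass behaves the same regardless of how rows are split into chunks, here is a minimal Python sketch. It is not the H2O code: the function names (`chunk_sums`, `chunk_cross_products`, `pearson_over_chunks`) are hypothetical, and the assumption that CorMean and CorTask split the work this way is an inference from the description above, not something the issue states.

```python
import math

def chunk_sums(chunk):
    """Pass 1 (per chunk): partial sums and row count for two columns."""
    xs, ys = chunk
    return sum(xs), sum(ys), len(xs)

def chunk_cross_products(chunk, mean_x, mean_y):
    """Pass 2 (per chunk): centered sums of squares and cross-products."""
    xs, ys = chunk
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def pearson_over_chunks(chunks):
    """Pearson correlation of two columns stored as a list of (xs, ys) chunks.

    Hypothetical sketch: global means are computed first, so every chunk
    centers its values on the same means and the partial results can
    simply be added together.
    """
    # Pass 1: reduce per-chunk sums into global means.
    totals = [chunk_sums(c) for c in chunks]
    n = sum(t[2] for t in totals)
    mean_x = sum(t[0] for t in totals) / n
    mean_y = sum(t[1] for t in totals) / n

    # Pass 2: reduce per-chunk centered products into global totals.
    parts = [chunk_cross_products(c, mean_x, mean_y) for c in chunks]
    sxx = sum(p[0] for p in parts)
    syy = sum(p[1] for p in parts)
    sxy = sum(p[2] for p in parts)

    denom = math.sqrt(sxx * syy)
    # A constant column has zero variance, so the correlation is undefined.
    return sxy / denom if denom > 0 else float("nan")
```

Because each chunk centers on the global means rather than its own, splitting the same rows into one chunk or several should give the same coefficient up to floating-point rounding.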

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-3235
Assignee: Navdeep
Reporter: Navdeep
State: Resolved
Fix Version: 3.10.0.6
Attachments: N/A
Development PRs: N/A