h2oai / h2o-3


Splitting Frame with Time type columns Breaks with 'NewChunk has type Numeric, but the Vec is of type Time' #11971

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

When I try to split a version of the Lending Club dataset that contains time-type columns, using either the nightly build (3.15.0.4107) or H2O 3.14.07, I get the following error:

{code}
Error calling POST /99/Rapids with opts {"ast":"(, (tmp= flow_e5f4643c3e7746ada...

ERROR MESSAGE: DistributedException from localhost/127.0.0.1:54321: 'NewChunk has type Numeric, but the Vec is of type Time'
{code}

The dataset is attached. To reproduce:

{code}
library(h2o)
h2o.init()
file_path <- "your/path/to/lc_fe_target_single_time_cols.csv"
loan_stats <- h2o.importFile(path = file_path)

# split into train and valid frames
split <- h2o.splitFrame(loan_stats, ratios = c(0.7, .15), destination_frames = c("hf_loan_train", "hf_loan_valid", "hf_loan_test"), seed = 1234)
hf_loan_train <- split[[1]]
hf_loan_valid <- split[[2]]
{code}

[^lc_fe_target_single_time_cols_subset.csv]

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5098
Assignee: Michal Kurka
Reporter: Lauren DiPerna
State: Open
Fix Version: N/A
Attachments: Available (Count: 1)
Development PRs: N/A

Attachments From Jira

Attachment Name: lc_fe_target_single_time_cols_subset.csv
Attached By: Lauren DiPerna
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-5098/lc_fe_target_single_time_cols_subset.csv