h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Maximum Recursion Depth, when creating rapids string #6547

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Hi!

This ticket is the follow-up ticket to: https://h2oai.atlassian.net/browse/PUBDEV-8960 (The old bug-ticket can be deleted).

The dataset I am using has the following structure: time, (CH20 MIN) Flow Speed (m/s), ..., (CH24 SAMPLE COUNT) Conductivity, id, classes (36 columns, time series, name = paradox dataset)


This is my code, which causes the error (shortened):

{code:python}

def jifa():
    # x_test is a pandas dataframe
    features = create_features(x_test,
                               column_id=automl_model.params.get('column_id'),
                               column_value=automl_model.params.get('column_value'),
                               column_kind=automl_model.params.get('column_kind'),
                               column_sort=automl_model.params.get('column_sort'),
                               settings=automl_model.feature_settings)

    features = convert_h2oframe_to_numeric(features, features.columns)
    # The next line triggers the error in h2o
    y_pre = automl_model.model.predict(features)['predict'].as_data_frame()

def create_features(data, column_id=None, column_value=None, column_kind=None, settings=None):
    """Load features."""
    # extract_features is a function from tsfresh that extracts important features.
    features = extract_features(data,
                                column_id=column_id,
                                column_value=column_value,
                                column_kind=column_kind,
                                kind_to_fc_parameters=settings,
                                impute_function=impute)
    features = h2o.H2OFrame(features).drop([0], axis=0)
    return features
{code}


The traceback in h2o is the following:

{panel:title=Traceback in h2o}
File "/src/model_selection/model_selection.py", line 79, in compute_metrics
    y_pre = automl_model.model.predict(features)['predict'].as_data_frame() {color:#14892c}<-- Still my code{color}

File "/usr/local/lib/python3.8/site-packages/h2o/model/model_base.py", line 280, in predict
    j = H2OJob(h2o.api("POST /4/Predictions/models/%s/frames/%s" % (self.model_id, test_data.frame_id), data = {'custom_metric_func': custom_metric_func}),

File "/usr/local/lib/python3.8/site-packages/h2o/frame.py", line 415, in frame_id
    return self._frame()._ex._cache._id

File "/usr/local/lib/python3.8/site-packages/h2o/frame.py", line 735, in _frame
    self._ex._eager_frame()

File "/usr/local/lib/python3.8/site-packages/h2o/expr.py", line 90, in _eager_frame
    self._eval_driver('frame')

File "/usr/local/lib/python3.8/site-packages/h2o/expr.py", line 113, in _eval_driver
    exec_str = self._get_ast_str(top)

File "/usr/local/lib/python3.8/site-packages/h2o/expr.py", line 151, in _get_ast_str
    exec_str = "({} {})".format(self._op, " ".join([ExprNode._arg_to_expr(ast) for ast in self._children]))

File "/usr/local/lib/python3.8/site-packages/h2o/expr.py", line 151, in
    exec_str = "({} {})".format(self._op, " ".join([ExprNode._arg_to_expr(ast) for ast in self._children]))

{color:#14892c}Afterwards it jumps back and forth between these methods until it reaches the maximum recursion depth.{color}
{panel}


If the _children of an ExprNode consist only of ExprNodes, and the same is true of their children, execution jumps between those methods until the maximum recursion depth is reached.
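To make the failure mode concrete, here is a minimal sketch that reproduces the same kind of mutual recursion without h2o. `ToyNode`, `arg_to_expr`, `get_ast_str`, `build_chain` and `renders_ok` are hypothetical names invented for this illustration. They only imitate the shape of the two methods shown in the traceback and are not h2o's actual implementation.

```python
import sys


class ToyNode:
    """Toy stand-in for a lazy expression node (NOT h2o's ExprNode)."""

    def __init__(self, op, *children):
        self.op = op
        self.children = children

    @staticmethod
    def arg_to_expr(arg):
        # Nodes are rendered recursively; anything else is treated as a literal.
        if isinstance(arg, ToyNode):
            return arg.get_ast_str()
        return str(arg)

    def get_ast_str(self):
        # Same string pattern as the "({} {})".format(...) line in the traceback.
        return "({} {})".format(
            self.op, " ".join([ToyNode.arg_to_expr(c) for c in self.children])
        )


def build_chain(depth):
    """Wrap a single leaf in `depth` nested unary operations (built iteratively)."""
    node = ToyNode("leaf", 0)
    for _ in range(depth):
        node = ToyNode("op", node)
    return node


def renders_ok(depth):
    """Return True if a chain of this depth can be rendered recursively."""
    try:
        build_chain(depth).get_ast_str()
        return True
    except RecursionError:
        return False


shallow_ok = renders_ok(10)
# Every nesting level costs at least two Python calls, so a chain that is
# deeper than the interpreter's limit cannot be rendered this way.
deep_ok = renders_ok(sys.getrecursionlimit() * 2)
print(shallow_ok, deep_ok)
```

In this toy model the depth of the node chain, not the size of the data, decides whether the string can be built. Whether the same holds for the frame in this report is an assumption that would have to be checked against the real expression tree.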


It is related to this ticket, which tries to replace the recursive build: https://h2oai.atlassian.net/browse/PUBDEV-8252

(This would solve the problem.)
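As an illustration of why a non-recursive build would avoid the error, the sketch below renders a nested expression with an explicit stack instead of Python recursion. It is an assumption about what such a replacement could look like, not the approach taken in PUBDEV-8252. `ToyNode`, `build_chain` and `render_iterative` are hypothetical names and do not exist in h2o.

```python
class ToyNode:
    """Toy stand-in for a lazy expression node (NOT h2o's ExprNode)."""

    def __init__(self, op, *children):
        self.op = op
        self.children = children


# Sentinels that tell the loop to emit punctuation instead of a value.
_SPACE = object()
_CLOSE = object()


def render_iterative(root):
    """Build the nested "(op child ...)" string without recursive calls."""
    pieces = []
    stack = [root]
    while stack:
        item = stack.pop()
        if isinstance(item, ToyNode):
            pieces.append("(" + item.op)
            stack.append(_CLOSE)
            # Push children in reverse so they are popped left to right.
            for child in reversed(item.children):
                stack.append(child)
                stack.append(_SPACE)
        elif item is _SPACE:
            pieces.append(" ")
        elif item is _CLOSE:
            pieces.append(")")
        else:
            pieces.append(str(item))  # plain literal argument
    return "".join(pieces)


def build_chain(depth):
    """Wrap a single leaf in `depth` nested unary operations."""
    node = ToyNode("leaf", 0)
    for _ in range(depth):
        node = ToyNode("op", node)
    return node


small = render_iterative(ToyNode("+", 1, ToyNode("*", 2, 3)))
deep = render_iterative(build_chain(10_000))  # far deeper than the default limit
print(small, len(deep))
```

The explicit stack grows on the heap, so the depth of the expression is bounded by available memory rather than by the interpreter's recursion limit.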

More precisely: when the _children of an ExprNode consist only of ExprNodes, and the same is true of their children, execution jumps between the methods _arg_to_expr() and _get_ast_str().


I cannot upload the dataset to this bug ticket. Please send a mail to jakobkempter@gmail.com and I'll send the dataset to you.

exalate-issue-sync[bot] commented 1 year ago

Jakob Kempter commented: After creating an account, I can upload files. This is the dataset, which is causing the error.

[^data.csv]

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: Here is the description of the original ticket:

When _arg_to_expr(arg) in expr.py is executed and the _children of the ExprNode consist only of ExprNodes, and the same is true of their children, the maximum recursion depth is reached and the program crashes.

Why do I reach this maximum recursion depth? Is it caused by my dataset, my extracted features?

h2o-ops commented 1 year ago

JIRA Issue Details

Jira Issue: PUBDEV-8961
Assignee: Sebastien Poirier
Reporter: N/A
State: Open
Fix Version: N/A
Attachments: Available (Count: 1)
Development PRs: N/A

h2o-ops commented 1 year ago

Attachments From Jira

Attachment Name: data.csv
Attached By: Jakob Kempter
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8961/data.csv