h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Error when trying to retrieve standardized coefficients from a metalearner, AutoML #8559

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

After updating from version 3.26.0.3 to 3.26.0.8 and 3.26.0.9, I can no longer plot the standardized coefficients of my metalearner from AutoML.

This is the error I am getting:

[Image: image-20191120-222911.png]

I also get the same error when I try to print the normalized coefficients:

[Image: image-20191120-222945.png]

Here is the code to reproduce it:

{code:python}
# Import H2O and other libraries that will be used in this tutorial
import h2o
import matplotlib as plt
import pandas as pd
%matplotlib inline
from h2o.automl import H2OAutoML

h2o.init(max_mem_size="4G")

# import dataset
loan_level = h2o.import_file("https://s3.amazonaws.com/data.h2o.ai/H2O-3-Tutorials/loan_level_50k.csv")

# split dataset
train, test = loan_level.split_frame([0.8], seed=42)

# choose x and y variable
y_reg = "ORIGINAL_INTEREST_RATE"
ignore_reg = ["ORIGINAL_INTEREST_RATE", "FIRST_PAYMENT_DATE", "MATURITY_DATE",
              "MORTGAGE_INSURANCE_PERCENTAGE", "PREPAYMENT_PENALTY_MORTGAGE_FLAG",
              "LOAN_SEQUENCE_NUMBER", "PREPAID", "DELINQUENT", "PRODUCT_TYPE"]
x_reg = [i for i in train.names if i not in ignore_reg]

# run AutoML
aml = H2OAutoML(max_models=5, seed=42, project_name='regression',
                stopping_metric="RMSE", sort_metric="RMSE")
%time aml.train(x=x_reg, y=y_reg, training_frame=train)

# Get model ids for all models in the AutoML Leaderboard
model_ids = list(aml.leaderboard['model_id'].as_data_frame().iloc[:,0])

# Get the "All Models" Stacked Ensemble model
se = h2o.get_model([mid for mid in model_ids if "StackedEnsemble_AllModels" in mid][0])

# Get the Stacked Ensemble metalearner model
metalearner = h2o.get_model(se.metalearner()['name'])
%matplotlib inline
metalearner.std_coef_plot()
{code}

I was planning to show this at the PyData conference in 2 weeks, but I will leave it out for now.

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: apparently a consequence of https://0xdata.atlassian.net/browse/PUBDEV-6874

The std_coef_plot method is designed for standardized coefficients, not for regular ones.
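For context, here is a rough sketch of how standardized and regular coefficients relate in a linear model, assuming each numeric predictor was standardized as (x - mean) / sd before fitting. The function name and all example values are hypothetical and are not taken from H2O.

```python
def destandardize(std_coefs, std_intercept, means, sds):
    """Map coefficients fitted on standardized inputs back to the raw scale.

    Assumes z_j = (x_j - mean_j) / sd_j for every predictor j.
    """
    # Each raw-scale slope is the standardized slope divided by the sd.
    raw = {name: b / sds[name] for name, b in std_coefs.items()}
    # The intercept absorbs the shift introduced by centering each predictor.
    intercept = std_intercept - sum(
        b * means[name] / sds[name] for name, b in std_coefs.items()
    )
    return raw, intercept


# Hypothetical values for illustration only.
std_coefs = {"x1": 2.0, "x2": -3.0}
means = {"x1": 10.0, "x2": 4.0}
sds = {"x1": 4.0, "x2": 2.0}
raw, intercept = destandardize(std_coefs, 1.0, means, sds)
```

Standardized coefficients sit on a common scale, so their magnitudes can be compared directly in a plot. The regular ones depend on each predictor's units.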

exalate-issue-sync[bot] commented 1 year ago

Neema Mashayekhi commented: Confirming that error also occurs in R.

{code:r}h2o.coef_norm(metalearner){code}

{code:r}
Warning message in structure(object@model$coefficients_table$standardized_coefficients, :
“Calling 'structure(NULL, *)' is deprecated, as NULL cannot have attributes.
  Consider 'structure(list(), *)' instead.”
Error in attributes(.Data) <- c(attributes(.Data), attrib): 'names' attribute [11] must be the same length as the vector [0]
Traceback:

1. h2o.coef_norm(metalearner)
2. structure(object@model$coefficients_table$standardized_coefficients, names = object@model$coefficients_table$names)
{code}
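
The traceback shows a names vector of length 11 being attached to an empty standardized_coefficients vector. Below is a minimal sketch of a defensive check for that mismatch, written in Python against a plain dict standing in for the coefficients table. The helper name, the dict layout, and the example names are hypothetical and are not part of the H2O API.

```python
def named_std_coefs(coef_table):
    """Pair coefficient names with standardized values, or fail clearly.

    coef_table is a hypothetical dict with a "names" list and an optional
    "standardized_coefficients" list.
    """
    names = coef_table.get("names") or []
    values = coef_table.get("standardized_coefficients") or []
    # A missing or empty column is reported explicitly instead of
    # surfacing later as an obscure attribute-length error.
    if len(values) != len(names):
        raise ValueError(
            f"expected {len(names)} standardized coefficients, got {len(values)}"
        )
    return dict(zip(names, values))


# A table without the standardized column, mirroring the failure above.
broken = {"names": ["Intercept", "model_a", "model_b"]}
try:
    named_std_coefs(broken)
    message = None
except ValueError as err:
    message = str(err)
```

With a check like this, a missing column is reported by name and count instead of as a length mismatch deep inside an attribute assignment.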

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: created https://0xdata.atlassian.net/browse/PUBDEV-7118 as the root cause issue: solving this for GLM will also solve the AutoML metalearner problem.

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7081
Assignee: New H2O Bugs
Reporter: Franklin Alvarenga
State: Open
Fix Version: N/A
Attachments: Available (Count: 4)
Development PRs: N/A

Attachments From Jira

Attachment Name: image-2019-11-20-14-21-59-629.png
Attached By: Franklin Alvarenga
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7081/image-2019-11-20-14-21-59-629.png

Attachment Name: image-2019-11-20-14-23-30-815.png
Attached By: Franklin Alvarenga
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7081/image-2019-11-20-14-23-30-815.png

Attachment Name: image-20191120-222911.png
Attached By: Franklin Alvarenga
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7081/image-20191120-222911.png

Attachment Name: image-20191120-222945.png
Attached By: Franklin Alvarenga
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7081/image-20191120-222945.png