h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Autoencoder with weights column gives an NPE #12098

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

A better error msg would be good.

Screenshot attached. Reproducible on any dataset.

Reproduced here using ecgdata (from small data), with the first column as the weights column and Tanh activation:
{code:java}
buildModel 'deeplearning', {"model_id":"deeplearning-a3003de2-ced2-4861-8e2c-847dfbda8c1b","training_frame":"ecg_discord_test.hex","nfolds":0,"ignored_columns":[],"ignore_const_cols":true,"activation":"Tanh","hidden":[200,200],"epochs":10,"variable_importances":true,"score_each_iteration":false,"weights_column":"C1","max_hit_ratio_k":0,"checkpoint":"","standardize":true,"train_samples_per_iteration":-2,"adaptive_rate":true,"input_dropout_ratio":0,"l1":0,"l2":0,"loss":"Automatic","distribution":"AUTO","quantile_alpha":0.5,"huber_alpha":0.9,"score_interval":5,"score_training_samples":10000,"score_validation_samples":0,"score_duty_cycle":0.1,"stopping_rounds":5,"stopping_metric":"AUTO","stopping_tolerance":0,"max_runtime_secs":0,"autoencoder":true,"categorical_encoding":"AUTO","pretrained_autoencoder":"","overwrite_with_best_model":true,"target_ratio_comm_to_comp":0.05,"seed":-1,"rho":0.99,"epsilon":1e-8,"nesterov_accelerated_gradient":true,"max_w2":3.4028235e+38,"initial_weight_distribution":"UniformAdaptive","classification_stop":0,"regression_stop":0.000001,"score_validation_sampling":"Uniform","diagnostics":true,"fast_mode":true,"force_load_balance":true,"single_node_mode":false,"shuffle_training_data":false,"missing_values_handling":"MeanImputation","quiet_mode":false,"sparse":false,"col_major":false,"average_activation":0,"sparsity_beta":0,"max_categorical_features":2147483647,"reproducible":false,"export_weights_and_biases":false,"mini_batch_size":1,"elastic_averaging":false}
{code}

{code:java}
java.lang.NullPointerException
    at hex.glm.GLMTask$YMUTask.responseSDs(GLMTask.java:201)
    at hex.deeplearning.DeepLearning.makeDataInfo(DeepLearning.java:114)
    at hex.deeplearning.DeepLearningModel.<init>(DeepLearningModel.java:223)
    at hex.deeplearning.DeepLearning$DeepLearningDriver.buildModel(DeepLearning.java:230)
    at hex.deeplearning.DeepLearning$DeepLearningDriver.computeImpl(DeepLearning.java:216)
    at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:206)
    at hex.deeplearning.DeepLearning$DeepLearningDriver.compute2(DeepLearning.java:209)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1263)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
{code}

exalate-issue-sync[bot] commented 1 year ago

Michal Kurka commented: The solution is to disallow the weights column for autoencoder. This should be checked in Java, and the documentation string of weights_column should reflect that.

I will leave it up to the assignee whether the checks are also done in R/Python.
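As a rough illustration of the kind of client-side check mentioned above, here is a minimal Python sketch. It does not use the H2O API: the function name `check_autoencoder_weights` is hypothetical, and the only things taken from this issue are the parameter names `autoencoder` and `weights_column` from the request payload. The actual fix proposed in this thread is a check on the Java side.

```python
from typing import Any, Mapping


def check_autoencoder_weights(params: Mapping[str, Any]) -> None:
    """Reject a parameter set that combines autoencoder mode with a weights column.

    Hypothetical helper for illustration only; not part of H2O.
    """
    is_autoencoder = bool(params.get("autoencoder", False))
    weights_column = params.get("weights_column")
    # None and the empty string both count as "no weights column set".
    if is_autoencoder and weights_column:
        raise ValueError(
            "weights_column is not supported when autoencoder is enabled "
            f"(got weights_column={weights_column!r})"
        )


if __name__ == "__main__":
    # Subset of the parameters from the reproduction in this issue.
    request = {"autoencoder": True, "weights_column": "C1", "activation": "Tanh"}
    try:
        check_autoencoder_weights(request)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

A check like this would fail fast with a readable message instead of surfacing the NullPointerException from the server.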

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5226
Assignee: New H2O Bugs
Reporter: Nidhi Mehta
State: Open
Fix Version: N/A
Attachments: Available (Count: 1)
Development PRs: N/A

Attachments From Jira

Attachment Name: Screen Shot 2018-01-15 at 3.47.50 PM.png
Attached By: Nidhi Mehta
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-5226/Screen Shot 2018-01-15 at 3.47.50 PM.png