automl / Auto-PyTorch

Automatic architecture search and hyperparameter optimization for PyTorch
Apache License 2.0

Imputation issues #30

Closed LMZimmer closed 3 years ago

LMZimmer commented 4 years ago

If only the validation split contains NaNs in certain columns, there might be errors.
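To make the condition concrete, here is a minimal sketch in plain NumPy (not Auto-PyTorch code) that flags columns with NaNs in the validation split but none in the training split. The arrays and the helper name `nan_columns` are made up for illustration.

```python
import numpy as np

# Toy splits (hypothetical data): column 1 has a NaN only in the validation split.
X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
X_valid = np.array([[5.0, np.nan], [7.0, 8.0]])

def nan_columns(X):
    """Return a boolean mask with True for every column containing at least one NaN."""
    return np.isnan(X).any(axis=0)

# Columns that an imputer fitted on the training split alone never saw as incomplete.
only_in_valid = nan_columns(X_valid) & ~nan_columns(X_train)
print(np.flatnonzero(only_in_valid))
```

A check like this, run before fitting, would show whether a dataset falls into the case described here.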

le-dawg commented 4 years ago

Oh, might this be related to my error (works flawlessly on the examples, throws errors on my dataset):

File "/usr/local/lib/python3.6/dist-packages/autoPyTorch-0.0.2-py3.6.egg/autoPyTorch/pipeline/nodes/imputation.py", line 29, in
    dataset_info.categorical_features = [dataset_info.categorical_features[i] for i, is_nan in enumerate(all_nan) if not is_nan]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

My fix seemed to solve this issue but created a new problem: a missing keyword, X_train.

My data is a 7352x128x9 ndarray of purely numerical values (9 time series vectors of length 128 per example).
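A possible explanation, sketched below under assumptions: if the per-feature NaN mask is computed along the sample axis, a 3D input produces a 2D mask, and iterating over it yields whole rows, so `if not is_nan` raises exactly this ValueError. How `all_nan` is actually computed in imputation.py is not shown in the traceback; the `np.all(np.isnan(X), axis=0)` form, the helper `keep_non_nan`, and the smaller sample count are assumptions made for illustration.

```python
import numpy as np

# Stand-in for the 7352x128x9 array above; fewer samples, chosen only for speed.
X = np.random.rand(20, 128, 9)

# Assumed form of the check: which features are NaN for every sample?
all_nan = np.all(np.isnan(X), axis=0)  # shape (128, 9) for 3D input

def keep_non_nan(features, all_nan):
    """Mirror the list comprehension from the traceback."""
    return [features[i] for i, is_nan in enumerate(all_nan) if not is_nan]

categorical_features = [False] * X.shape[1]

try:
    keep_non_nan(categorical_features, all_nan)
    raised = False
except ValueError as err:
    # Each is_nan is a 9-element row here, so `not is_nan` is ambiguous.
    raised = True
    message = str(err)

# With one flat row per sample the mask is 1D and the comprehension runs.
X_flat = X.reshape(X.shape[0], -1)               # shape (20, 1152)
all_nan_flat = np.all(np.isnan(X_flat), axis=0)  # shape (1152,)
kept = keep_non_nan([False] * X_flat.shape[1], all_nan_flat)
```

Flattening discards the time-step structure, so this only illustrates where the error comes from; it is not a recommended fix for sequence data.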

Is setting validation_split to 0 or 1 going to temporarily fix the issue?

weir12 commented 4 years ago

Hi @le-dawg, I have the same problem you described. My data has the shape [batch_sizes,time_steps,input_sizes], which conforms to the RNN input data format. Have you solved the problem yet? Thanks!

franchuterivera commented 3 years ago

With the newly refactored code in the development branch, this is no longer a problem. For this reason, I am closing this old issue.