Hi all,
I'm currently using the CSV data configuration example to do multi-label binary classification (classification_train.py) on my dataset of choice. The dataset has many tasks and many NaNs, but it featurizes properly. However, when the script computes ROC-AUC on the validation set after one epoch, it hits a ValueError while checking the y_true array. I've checked that none of my columns/tasks is entirely NaN, so every task should contribute at least one label to y_true. Even so, y_true has shape torch.Size([0]) for one specific column in my dataframe, and that triggers the ValueError. The stack trace is provided below:
Dataframe task index: 498
Printout of y_true.shape: torch.Size([0])
Traceback (most recent call last):
File "classification_train.py", line 219, in <module>
main(args, exp_config, train_set, val_set, test_set)
File "classification_train.py", line 94, in main
val_score = run_an_eval_epoch(args, model, val_loader)
File "classification_train.py", line 56, in run_an_eval_epoch
return np.mean(eval_meter.compute_metric(args['metric']))
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/dgllife/utils/eval.py", line 342, in compute_metric
return self.roc_auc_score(reduction)
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/dgllife/utils/eval.py", line 277, in roc_auc_score
return self.multilabel_score(score, reduction)
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/dgllife/utils/eval.py", line 183, in multilabel_score
task_score = score_func(task_y_true, task_y_pred)
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/dgllife/utils/eval.py", line 276, in score
return roc_auc_score(y_true.long().numpy(), torch.sigmoid(y_pred).numpy())
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/sklearn/metrics/_ranking.py", line 550, in roc_auc_score
y_true = check_array(y_true, ensure_2d=False, dtype=None)
File "/anaconda/envs/dgllife/lib/python3.8/site-packages/sklearn/utils/validation.py", line 931, in check_array
raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.
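One thing I have not ruled out yet is that task 498 has labels in the full dataframe but none among the rows that ended up in the validation split; masking out the NaNs would then leave an empty y_true for that task. Below is a minimal sketch of that check. The toy labels, the split indices, and the helper name count_labels_per_task are all hypothetical and are not taken from my real data or from dgllife:

```python
import math


def count_labels_per_task(rows, indices):
    """For each task column, count the non-NaN labels among the given rows."""
    n_tasks = len(rows[0])
    counts = [0] * n_tasks
    for i in indices:
        for t, value in enumerate(rows[i]):
            if not math.isnan(value):
                counts[t] += 1
    return counts


# Hypothetical toy labels: 4 molecules x 3 tasks, NaN marks a missing label.
nan = float("nan")
labels = [
    [1.0, nan, 0.0],
    [0.0, nan, 1.0],
    [nan, 1.0, nan],
    [1.0, nan, 0.0],
]

# Hypothetical split: rows 0-2 go to training, row 3 to validation.
val_indices = [3]

full_counts = count_labels_per_task(labels, range(len(labels)))
val_counts = count_labels_per_task(labels, val_indices)

# Tasks that are labeled somewhere in the data but never in the validation rows.
empty_in_val = [t for t, c in enumerate(val_counts) if c == 0]

print("labels per task (all rows):  ", full_counts)
print("labels per task (validation):", val_counts)
print("tasks with no validation labels:", empty_in_val)
```

In the toy data every task has at least one label overall, yet the middle task has none in the validation rows, which would reproduce the empty array. The same count would also be worth running per class, since sklearn's roc_auc_score raises a ValueError as well when y_true contains only one class.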
Thanks for your help, Seyone