Ramprasad-Group / polygnn

polyGNN is a Python library to automate ML model training for polymer informatics.

ValueError: Found input variables with inconsistent numbers of samples: [293, 874]; trying to tune HP on subset of data #7

Closed kietbphan closed 1 year ago

kietbphan commented 1 year ago

Hi, I'm trying to tune my model's hyperparameters (HP) on a subset of my data and then train on the whole dataset. I am running into this 'inconsistent numbers of samples' error. In the code, starting at the '# split train and val data' section, I changed the split so that it splits the subset (master_data) instead of the group data.

kietbphan commented 1 year ago

Fixed by creating a subset DataFrame, group_data1, and using it in the 'find optimal capacity' and 'do hparams opt' sections.

rishigurnani commented 1 year ago

Hi @kietbphan, it looks like your error is not from polygnn but from sklearn. Perhaps it is because master_data and group_train_data.prop are not of equal length.
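The length mismatch described above can be sketched in plain Python. This is only an illustration, not polygnn or sklearn code: the helper names `subset_indices` and `take` are hypothetical, the data is made up, and it assumes that 293 is the subset size and 874 the full dataset size, which the error message suggests but does not confirm. The idea is to choose the subset rows once and use those same rows for both the inputs and the targets.

```python
import random


def subset_indices(n_rows, n_subset, seed=0):
    """Pick n_subset distinct row positions out of n_rows, reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), n_subset))


def take(rows, indices):
    """Return the rows at the given positions, in the given order."""
    return [rows[i] for i in indices]


# Made-up stand-in for the full dataset (assumed 874 rows).
features = [[float(i), 2.0 * i] for i in range(874)]
targets = [float(i) for i in range(874)]

# Choose the subset rows once, then apply them to both containers.
idx = subset_indices(len(features), 293)
sub_features = take(features, idx)
sub_targets = take(targets, idx)

# Pairing the subset with the full-length targets gives the mismatch
# that a length check such as sklearn's would reject.
mismatched = [len(sub_features), len(targets)]

# Pairing the subset with its own targets gives equal lengths.
consistent = [len(sub_features), len(sub_targets)]

print(mismatched, consistent)
```

With a pandas DataFrame, the same effect comes from building one subset frame and reading both the inputs and the property column from it, which matches the group_data1 fix reported above.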