adimajo / glmdisc_python

glmdisc Python package: discretization, factor level grouping, interaction discovery for logistic regression
GNU General Public License v3.0

inconsistent number of samples in fit for training test split #3

Closed Davidoreilly12 closed 4 years ago

Davidoreilly12 commented 5 years ago

ValueError                                Traceback (most recent call last)

in
      1 import numpy as np
----> 2 fit(disc,predictors_cont=X.values,labels=Y.values,predictors_qual=None)

in fit(self, predictors_cont, predictors_qual, labels)
    190                 X=current_encoder_emap.transform(
    191                     emap[train, :].astype(str))),
--> 192                 normalize=False)
    193             set_trace()
    194             if self.validation:

c:\python37\lib\site-packages\sklearn\metrics\classification.py in log_loss(y_true, y_pred, eps, normalize, sample_weight, labels)
   2119     """
   2120     y_pred = check_array(y_pred, ensure_2d=False)
-> 2121     check_consistent_length(y_pred, y_true, sample_weight)
   2122
   2123     lb = LabelBinarizer()

c:\python37\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
    203     if len(uniques) > 1:
    204         raise ValueError("Found input variables with inconsistent numbers of"
--> 205                          " samples: %r" % [int(l) for l in lengths])
    206
    207

ValueError: Found input variables with inconsistent numbers of samples: [45, 75]

adimajo commented 4 years ago

Should be OK given recent improvements and tests, feel free to reopen.
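
For readers who hit the same message: the traceback in this issue ends in scikit-learn's `log_loss`, which raises this `ValueError` whenever the labels and the predicted probabilities have different lengths. The sketch below reproduces that class of error outside of glmdisc. It assumes, without confirmation from the thread, that the 45 corresponds to a training subset and the 75 to the full set of labels; the arrays and the `train` index are made up for illustration and are not taken from the package's code.

```python
import numpy as np
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)

# Hypothetical data: 75 binary labels and 75 predicted probabilities.
labels = rng.integers(0, 2, size=75)
labels[:2] = [0, 1]  # make sure both classes appear in the first rows
proba = rng.uniform(0.05, 0.95, size=75)

# Hypothetical split: the first 45 rows play the role of the training set.
train = np.arange(45)


def mismatch_message():
    """Call log_loss with full labels but subset predictions; return the error text."""
    try:
        log_loss(labels, proba[train], normalize=False)
    except ValueError as exc:
        return str(exc)
    return None


message = mismatch_message()
print(message)

# Indexing both arrays with the same subset keeps the lengths consistent.
consistent_loss = log_loss(labels[train], proba[train], normalize=False)
print(consistent_loss)
```

If the mismatch has the same origin in your own code, subsetting the labels and the predictions with the same index before calling `log_loss` avoids the error.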