lenguyenthedat / kaggle-for-fun

All my submissions for Kaggle contests that I have participated in or plan to participate in.

Issue with the Cabin feature #11

Open btphan95 opened 6 years ago

btphan95 commented 6 years ago

Running LabelEncoder on the Cabin feature raises an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-121-48f3aad5f78e> in <module>()
      4     print(col)
      5     le.fit(list(train[col]) + list(cv[col]))
----> 6     train[col] = le.transform(train[col])
      7     cv[col] = le.transform(cv[col])

/opt/conda/lib/python3.6/site-packages/sklearn/preprocessing/label.py in transform(self, y)
    128         y = column_or_1d(y, warn=True)
--> 130         classes = np.unique(y)
    131         if len(np.intersect1d(classes, self.classes_)) < len(classes):
    132             diff = np.setdiff1d(classes, self.classes_)

/opt/conda/lib/python3.6/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
    208     ar = np.asanyarray(ar)
    209     if axis is None:
--> 210         return _unique1d(ar, return_index, return_inverse, return_counts)
    211     if not (-ar.ndim <= axis < ar.ndim):
    212         raise ValueError('Invalid axis kwarg specified for unique')

/opt/conda/lib/python3.6/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
    275         aux = ar[perm]
    276     else:
--> 277         ar.sort()
    278         aux = ar
    279     flag = np.concatenate(([True], aux[1:] != aux[:-1]))

TypeError: '>' not supported between instances of 'float' and 'str'

The cause appears to be missing values in the Cabin feature: they are floats (NaN), so they cannot be sorted together with the string values. How did you overcome this?

btphan95 commented 6 years ago

The same issue comes up for the Embarked feature.

lenguyenthedat commented 6 years ago

I don't think this code supports Python 3 yet :)
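
For anyone hitting this on Python 3: the traceback ends in a sort over a column that mixes strings with float NaN, and Python 3 refuses to order a float against a str (Python 2 allowed such comparisons). Below is a minimal pure-Python sketch of one possible workaround, not necessarily what the repository author would do: replace missing entries with a string placeholder before encoding. The sample values, the `"Unknown"` placeholder, and the helpers `fill_missing`, `fit_labels`, and `transform` are all hypothetical stand-ins for illustration; they are not part of scikit-learn.

```python
import math

MISSING = "Unknown"  # hypothetical placeholder label, chosen for illustration


def fill_missing(values, placeholder=MISSING):
    """Replace None and float NaN entries with a string placeholder."""
    return [
        placeholder
        if v is None or (isinstance(v, float) and math.isnan(v))
        else v
        for v in values
    ]


def fit_labels(values):
    """Map each distinct value to an integer code, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def transform(values, mapping):
    """Look up the integer code of every value."""
    return [mapping[v] for v in values]


# Hypothetical sample data; NaN marks a missing Cabin entry.
train_cabin = ["C85", float("nan"), "E46"]
cv_cabin = [float("nan"), "C85"]

# Sorting the raw column fails on Python 3, as in the traceback above.
try:
    sorted(train_cabin)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

# After filling, every value is a string, so sorting and encoding work.
train_filled = fill_missing(train_cabin)
cv_filled = fill_missing(cv_cabin)
mapping = fit_labels(train_filled + cv_filled)
train_codes = transform(train_filled, mapping)
cv_codes = transform(cv_filled, mapping)
```

With pandas columns, the equivalent step would presumably be calling `fillna` with a string placeholder on `train[col]` and `cv[col]` before `le.fit`. The same approach should apply to the Embarked feature.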