tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning
Apache License 2.0

DBpedia text classification shows "TypeError: data type not understood" #11

Closed s4sarath closed 8 years ago

s4sarath commented 8 years ago

The shapes of all the passed parameters are correct. I have cross-checked that.

ilblackdragon commented 8 years ago

Can you please show a stack trace? Also, which version of scikit-learn do you have? It seems that older versions had different semantics for array checking.

s4sarath commented 8 years ago

Give me half an hour...

s4sarath commented 8 years ago

TypeError Traceback (most recent call last)

in ()
----> 1 classifier.fit(X_train, y_array)

/Library/Python/2.7/site-packages/skflow/__init__.pyc in fit(self, X, y)
    111         """
    112         # Sets up data feeder.
--> 113         self._setup_data_feeder(X, y)
    114         if not self.continue_training or not self._initialized:
    115             # Sets up model and trainer.

/Library/Python/2.7/site-packages/skflow/__init__.pyc in _setup_data_feeder(self, X, y)
     69         else:
     70             self._data_feeder = data_feeder.DataFeeder(X, y,
---> 71                 self.n_classes, self.batch_size)
     72
     73     def _setup_training(self):

/Library/Python/2.7/site-packages/skflow/data_feeder.pyc in __init__(self, X, y, n_classes, batch_size)
     60     def __init__(self, X, y, n_classes, batch_size):
     61         self.X = check_array(X, ensure_2d=False,
---> 62             allow_nd=True, dtype=[np.float32, np.int64])
     63         self.y = check_array(y, ensure_2d=False, dtype=np.float32)
     64         self.n_classes = n_classes

/Library/Python/2.7/site-packages/sklearn/utils/validation.pyc in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features)
    342         else:
    343             dtype = None
--> 344     array = np.array(array, dtype=dtype, order=order, copy=copy)
    345     # make sure we actually converted to numeric:
    346     if dtype_numeric and array.dtype.kind == "O":

TypeError: data type not understood

sklearn version = 0.16.1

ilblackdragon commented 8 years ago

It seems like the data type of X_train is not understood by check_array. Can you print type(X_train) and X_train.dtype? It is supposed to be a 2-dimensional numpy matrix of dtype float32. Also, are you running the same code as in text_classification.py, or a modified version? If you modified it, can you show what you changed?

s4sarath commented 8 years ago

type(X_train) => numpy.ndarray
X_train.dtype => dtype('int64')

s4sarath commented 8 years ago

@ilblackdragon - I have converted X_train as you mentioned, using X_train.astype('float32'), but I still get the same error. It is a 2D array of shape (560000, 10).

s4sarath commented 8 years ago

@ilblackdragon - I am running the same code as in text_classification.py. As I am new to TensorFlow, I thought of starting from the examples you have provided.

ilblackdragon commented 8 years ago

OK, so I reproduced the issue with sklearn 0.16.1. A quick fix for you is to update your sklearn to 0.17. In the meantime, I'll try to find out what the difference is.
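A note for readers following the traceback: the failing call passes a list, dtype=[np.float32, np.int64], which check_array in 0.16.1 appears to hand straight to np.array. NumPy does not accept a plain list of types as a dtype, which would explain the TypeError regardless of what dtype X_train has. The sketch below is only an illustration of that reading, not the fix that went into skflow; list_dtype_fails and coerce_to_allowed are hypothetical helper names.

```python
import numpy as np


def list_dtype_fails(data, dtypes):
    """Return True if np.array rejects a list of types passed as `dtype`."""
    try:
        np.array(data, dtype=dtypes)
    except TypeError:
        return True
    return False


def coerce_to_allowed(data, allowed=(np.float32, np.int64)):
    """Hypothetical helper: keep the input dtype if it is in `allowed`,
    otherwise cast to the first allowed dtype."""
    array = np.asarray(data)
    if any(array.dtype == np.dtype(t) for t in allowed):
        return array
    return array.astype(allowed[0])


if __name__ == "__main__":
    X = np.arange(6, dtype=np.int64).reshape(2, 3)
    print(list_dtype_fails(X, [np.float32, np.int64]))
    print(coerce_to_allowed(X).dtype)
```

The second helper mimics the "accept any of these dtypes, else convert to the first" behaviour that a list of dtypes is presumably meant to express; whether newer sklearn versions implement it exactly this way is an assumption here.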

s4sarath commented 8 years ago

OK, I will do that now. Thanks.

s4sarath commented 8 years ago

@ilblackdragon - Still no luck. I upgraded scikit-learn to 0.17, but it is still giving the same error.

ylhsieh commented 8 years ago

I have encountered this problem, and upgrading to 0.17 did solve it. So, @s4sarath, could you try removing scikit-learn and re-installing 0.17? Maybe there are multiple installations of scikit-learn on your system and Python is not picking up the latest version.
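One way to check which copy Python is actually picking up is to print the version and file path of the imported module. This is a minimal sketch using only the standard library; describe_module is a hypothetical helper name, and it assumes the package exposes __version__ and __file__, as sklearn and numpy do.

```python
import importlib


def describe_module(name):
    """Return (version, file path) of the module this interpreter imports."""
    module = importlib.import_module(name)
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "unknown")
    return version, path


if __name__ == "__main__":
    for name in ("sklearn", "numpy"):
        try:
            version, path = describe_module(name)
            print(name, version, path)
        except ImportError:
            print(name, "is not importable from this interpreter")
```

If the printed path points somewhere other than the location pip just installed into, an older copy is shadowing the new one.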

s4sarath commented 8 years ago

I uninstalled it with pip and then upgraded with pip. Let me try once more.

On 27 Nov 2015 12:48, "ylhsieh" notifications@github.com wrote:

I have encountered this problem, and upgrading to 0.17 did solve the problem. So, @s4sarath https://github.com/s4sarath could you try to remove scikit-learn and re-install 0.17. Maybe there are multiple installations of scikit in your system and python is not picking up the latest version.

— Reply to this email directly or view it on GitHub https://github.com/google/skflow/issues/11#issuecomment-160057612.

s4sarath commented 8 years ago

Hi guys. It is working after updating sklearn to 0.17. If anyone else is updating, please make sure that you have removed all the previous versions from site-packages.
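To look for leftover copies before or after upgrading, something like the following can list every directory on the import path that contains the package. It is a rough sketch under the assumption that the package is installed as a plain directory named sklearn; find_installations is a hypothetical helper, and it will not detect zipped eggs or other packaging layouts.

```python
import os
import sys


def find_installations(package, search_paths=None):
    """List every directory on the search path that contains `package`."""
    if search_paths is None:
        search_paths = sys.path
    found = []
    for entry in search_paths:
        # An empty sys.path entry means the current working directory.
        candidate = os.path.join(entry or os.getcwd(), package)
        if os.path.isdir(candidate) and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    for path in find_installations("sklearn"):
        print(path)
```

More than one line of output suggests that several copies are installed, and the first one listed is the one Python imports.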

ilblackdragon commented 8 years ago

FYI - I've fixed this issue and also started running tests with sklearn 0.16 (will add it for CI as well). Thanks for raising it!

s4sarath commented 8 years ago

Good thing, man. Thanks.

On 28 Nov 2015 10:10, "Illia Polosukhin" notifications@github.com wrote:

Hey guy, just FYI - I've fixed this issue and also started running tests with sklearn 0.16 (will add it for CI as well). Thanks for raising it!

— Reply to this email directly or view it on GitHub https://github.com/google/skflow/issues/11#issuecomment-160248271.