IndicoDataSolutions / Passage

A little library for text analysis with RNNs.
MIT License

fit_transform returns unorderable types: dict_values() >= int() error #26

Closed naeemulhassan closed 9 years ago

naeemulhassan commented 9 years ago

I just installed Passage from git and tried this small snippet:

    from passage.preprocessing import Tokenizer

    example_text = ['This. is.', 'Example TEXT', 'is text']
    tokenizer = Tokenizer(min_df=1, lowercase=True, character=False)
    tokenized = tokenizer.fit_transform(example_text)

It raises:

    TypeError                                 Traceback (most recent call last)
    in ()
          5 example_text = ['This. is.', 'Example TEXT', 'is text']
          6 tokenizer = Tokenizer(min_df=1, lowercase=True, character=False)
    ----> 7 tokenized = tokenizer.fit_transform(example_text)
          8 tokenized

    /home/naeemul/anaconda3/lib/python3.4/site-packages/passage/preprocessing.py in fit_transform(self, texts)
        128
        129     def fit_transform(self, texts):
    --> 130         self.fit(texts)
        131         tokens = self.transform(texts)
        132         return tokens

    /home/naeemul/anaconda3/lib/python3.4/site-packages/passage/preprocessing.py in fit(self, texts)
        109         else:
        110             tokens = [tokenize(text) for text in texts]
    --> 111         self.encoder = token_encoder(tokens, max_features=self.max_features-3, min_df=self.min_df)
        112         self.encoder['PAD'] = 0
        113         self.encoder['END'] = 1

    /home/naeemul/anaconda3/lib/python3.4/site-packages/passage/preprocessing.py in token_encoder(texts, max_features, min_df)
         54                 df[token] = 1
         55     k, v = df.keys(), np.asarray(df.values())
    ---> 56     valid = v >= min_df
         57     k = lbf(k, valid)
         58     v = v[valid]

    TypeError: unorderable types: dict_values() >= int()

I installed it again through pip and got the same error. Any idea how to solve it? Thanks!

Newmu commented 9 years ago

This might be a Python 3 incompatibility issue. Can you try it with Python 2.7, for instance?

naeemulhassan commented 9 years ago

Thanks! It worked perfectly in Python 2.7.
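
For anyone who has to stay on Python 3, here is a rough sketch of why line 55 of the traceback fails and one possible workaround. In Python 3, `dict.values()` returns a view object rather than a list, so `np.asarray(df.values())` produces a 0-d object array that wraps the whole view, and comparing it with an integer fails. Wrapping the views in `list(...)` first avoids that. The helpers `document_frequencies` and `filter_by_min_df` below are hypothetical names made up for this illustration, not part of Passage, and the token lists are hand-written rather than produced by Passage's tokenizer.

```python
import numpy as np


def document_frequencies(texts):
    """Count, for each token, how many texts contain it (hypothetical helper)."""
    df = {}
    for text in texts:
        for token in set(text):
            df[token] = df.get(token, 0) + 1
    return df


def filter_by_min_df(df, min_df):
    """Return the tokens whose document frequency is at least min_df."""
    # list(...) materializes the Python 3 view objects, so NumPy builds a
    # regular 1-D integer array instead of a 0-d object array.
    keys = list(df.keys())
    counts = np.asarray(list(df.values()))
    valid = counts >= min_df
    return [key for key, keep in zip(keys, valid) if keep]


# Hand-written token lists standing in for tokenized texts.
tokens = [['this', 'is'], ['example', 'text'], ['is', 'text']]
df = document_frequencies(tokens)
frequent = sorted(filter_by_min_df(df, min_df=2))
```

Applying the same `list(...)` wrapping inside `token_encoder` in a local copy of `preprocessing.py` might be enough to get past this particular error, though other Python 3 incompatibilities could remain elsewhere in the library.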