Loaded maxEntTestCorpus
Classify: Bloomberg –He's the man of the Year!
/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py:37: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if t in stopwords:
/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py:275: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if word[-1] == 's':
Traceback (most recent call last):
  File "toolbox/shell-classifier.py", line 34, in <module>
    features = config.classifier_tokenizer.getFeatures(txt)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 144, in getFeatures
    return dict.fromkeys(cls.getClassifierTokens(text), 1)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 131, in getClassifierTokens
    tokes = cls.stemm(tokes)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 152, in stemm
    tokens[i] = stemmer.stem(t)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 633, in stem
    stem = self.stem_word(word.lower(), 0, len(word) - 1)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 591, in stem_word
    word = self._step1ab(word)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 289, in _step1ab
    if word.endswith("ied"):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Feneks-MacBook-Pro:streamcrab fenek$ python toolbox/shell-classifier.py maxEntTestCorpus
exit: ctrl+c
Loaded maxEntTestCorpus
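
The traceback above points at a byte-string token that begins with 0xe2, which is the first byte of the UTF-8 encoding of the en dash in the input. A likely reading is that the tokens reach the stemmer as byte strings and get compared against unicode literals, which forces an implicit ASCII decode. The sketch below reproduces the same error message on that token and shows one possible workaround: decoding the raw input to text before tokenizing. `to_text` is a hypothetical helper, not part of streamcrab or NLTK, and UTF-8 input is an assumption about the terminal.

```python
# -*- coding: utf-8 -*-
# Sketch only: to_text is a hypothetical helper, not streamcrab or NLTK code.
# Assumes the terminal delivers the typed line as UTF-8 bytes.


def to_text(raw, encoding="utf-8"):
    """Return raw as a text (unicode) string, decoding byte strings."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# The line typed at the "Classify:" prompt, as raw UTF-8 bytes.
raw = b"Bloomberg \xe2\x80\x93He's the man of the Year!"

# Second whitespace-separated token: starts with the en dash bytes.
token = raw.split()[1]

# Decoding that token as ASCII fails on its very first byte.
try:
    token.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# Decoding the whole line first yields text tokens that compare cleanly
# against unicode literals such as u"ied" or u"s".
tokens = to_text(raw).split()

print(ascii_error)
print(tokens[1].endswith(u"s"))
```

If this reading is right, applying something like `to_text` to the input before it reaches `getFeatures` would avoid both the UnicodeWarning lines and the final UnicodeDecodeError, though where exactly to decode in streamcrab is not visible from this log.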