cyhex / streamcrab

Real-Time, Twitter sentiment analyzer engine
http://www.streamcrab.com

Reacting to special signs in Tweets #24

Open riccitensor opened 8 years ago

riccitensor commented 8 years ago

Loaded maxEntTestCorpus
Classify: Bloomberg –He's the man of the Year!
/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py:37: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if t in stopwords:
/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py:275: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if word[-1] == 's':
Traceback (most recent call last):
  File "toolbox/shell-classifier.py", line 34, in <module>
    features = config.classifier_tokenizer.getFeatures(txt)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 144, in getFeatures
    return dict.fromkeys(cls.getClassifierTokens(text), 1)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 131, in getClassifierTokens
    tokes = cls.stemm(tokes)
  File "/Users/fenek/Documents/pp/pingpongowl/streamcrab/smm/classifier/textprocessing.py", line 152, in stemm
    tokens[i] = stemmer.stem(t)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 633, in stem
    stem = self.stem_word(word.lower(), 0, len(word) - 1)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 591, in stem_word
    word = self._step1ab(word)
  File "/Users/fenek/Applications/anaconda/anaconda/lib/python2.7/site-packages/nltk/stem/porter.py", line 289, in _step1ab
    if word.endswith("ied"):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Feneks-MacBook-Pro:streamcrab fenek$ python toolbox/shell-classifier.py maxEntTestCorpus
exit: ctrl+c

Loaded maxEntTestCorpus
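
For context on the traceback above: byte 0xe2 is the first byte of the UTF-8 encoding of the en dash (U+2013) in "–He's", so the token reaching the stemmer is probably a raw byte string rather than a unicode string. One possible workaround, sketched below, is to decode every token before it is compared against stopwords or stemmed. The helper name `to_text` and the assumption that the input is UTF-8 are illustrative only and are not taken from the streamcrab code.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string.

    Hypothetical helper, not part of streamcrab. Assumes byte input is
    UTF-8 encoded; undecodable bytes are replaced instead of raising.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


# The tweet from the report, as the raw bytes a terminal might deliver.
raw = u"Bloomberg \u2013He's the man of the Year!".encode("utf-8")

# Decode each token up front so later comparisons are text against text.
tokens = [to_text(t) for t in raw.split()]

# The kind of check that failed inside the stemmer now runs without error.
has_ied_suffix = tokens[1].endswith(u"ied")
```

Decoding once at the input boundary means the stopword lookup and the stemmer would both see text, which should also silence the two UnicodeWarning messages. Whether the non-letter characters should then be stripped before stemming is a separate decision.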