bureaucratic-labs / dostoevsky

Sentiment analysis library for the Russian language
MIT License

StopIteration #77

Open hamelena opened 4 years ago

hamelena commented 4 years ago

Hi! I applied your sentiment model to a DataFrame column. At first everything worked fine, but a few minutes ago I started getting `RuntimeError: generator raised StopIteration`. Do you have any idea why this happens and how to fix it? Thank you in advance.

My input is:

    model = FastTextSocialNetworkModel(tokenizer=tokenizer)
    df_clean['sentiment'] = df_clean['prep_text'].apply(model.predict)
    df_clean

What I get is:

StopIteration                             Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/razdel/segmenters/tokenize.py in segment(self, parts)
    299     def segment(self, parts):
--> 300         buffer = next(parts)
    301         for split in parts:

StopIteration:

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)

in
      8 model = FastTextSocialNetworkModel(tokenizer=tokenizer)
      9
---> 10 df_clean['sentiment'] = df_clean['prep_text'].apply(model.predict)
     11 df_clean

~/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3846             else:
   3847                 values = self.astype(object).values
-> 3848                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3849
   3850             if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in predict(self, sentences, k)
     82         Dict[str, float]
     83     ]:
---> 84         X = self.preprocess_input(sentences)
     85         Y = (
     86             self.model.predict(sentence, k=k) for sentence in X

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in preprocess_input(self, sentences)
     76                 )
     77             )
---> 78             for sentence in sentences
     79         ]
     80

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in (.0)
     76                 )
     77             )
---> 78             for sentence in sentences
     79         ]
     80

~/anaconda3/lib/python3.7/site-packages/dostoevsky/tokenization.py in split(self, text, lemmatize)
     37     ]:
     38         return [
---> 39             (token.text.lower(), None) for token in regex_tokenize(text)
     40         ]
     41

~/anaconda3/lib/python3.7/site-packages/dostoevsky/tokenization.py in (.0)
     37     ]:
     38         return [
---> 39             (token.text.lower(), None) for token in regex_tokenize(text)
     40         ]
     41

~/anaconda3/lib/python3.7/site-packages/razdel/substring.py in find_substrings(chunks, text)
     16 def find_substrings(chunks, text):
     17     offset = 0
---> 18     for chunk in chunks:
     19         start = text.find(chunk, offset)
     20         stop = start + len(chunk)

RuntimeError: generator raised StopIteration

dveselov commented 4 years ago

Hi, can you try updating the razdel package?

$ pip install -U razdel
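
For context, the RuntimeError in the first traceback matches the behaviour introduced by PEP 479: since Python 3.7, a StopIteration that escapes a generator body is converted into RuntimeError. The traceback shows `buffer = next(parts)` being called inside a generator, which fails this way when `parts` yields nothing (presumably for an empty or whitespace-only text, though the thread does not confirm the triggering input). The sketch below reproduces the pattern with hypothetical function names. It is not razdel's actual code, and the guarded variant is only one possible way to avoid the error.

```python
def first_then_rest(parts):
    """Hypothetical generator that calls next() without a default,
    like the `buffer = next(parts)` line in the traceback."""
    buffer = next(parts)  # raises StopIteration when `parts` is empty
    yield buffer
    for part in parts:
        yield part


def guarded_first_then_rest(parts):
    """Same idea, but an empty input simply produces no items."""
    sentinel = object()
    buffer = next(parts, sentinel)  # default value instead of StopIteration
    if buffer is sentinel:
        return
    yield buffer
    for part in parts:
        yield part


# On Python 3.7+ the escaping StopIteration surfaces as RuntimeError.
try:
    list(first_then_rest(iter([])))
    outcome = "no error"
except RuntimeError as error:
    outcome = str(error)

empty_result = list(guarded_first_then_rest(iter([])))
normal_result = list(guarded_first_then_rest(iter(["a", "b"])))
```

Here `outcome` holds the same message as the issue title's error, while the guarded variant returns an empty list for empty input.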
hamelena commented 4 years ago

Hi, many thanks! Now I get the error: 'float' object is not iterable

TypeError Traceback (most recent call last)

in ()
      8 model = FastTextSocialNetworkModel(tokenizer=tokenizer)
      9
---> 10 df['sentiment'] = df['prep_text'].apply(model.predict)
     11 df

/usr/lib/python3/dist-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   2549             else:
   2550                 values = self.asobject
-> 2551                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2552
   2553             if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

/home/***/.local/lib/python3.6/site-packages/dostoevsky/models.py in predict(self, sentences, k)
     82         Dict[str, float]
     83     ]:
---> 84         X = self.preprocess_input(sentences)
     85         Y = (
     86             self.model.predict(sentence, k=k) for sentence in X

/home/***/.local/lib/python3.6/site-packages/dostoevsky/models.py in preprocess_input(self, sentences)
     76                 )
     77             )
---> 78             for sentence in sentences
     79         ]
     80

TypeError: 'float' object is not iterable
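
A possible reading of this second traceback: `predict(self, sentences, k)` iterates over its argument (`for sentence in sentences`), so it appears to expect a collection of texts, while `Series.apply` hands it one cell at a time. A missing value in the column is a float NaN, which would explain `'float' object is not iterable`. The sketch below is one way to work around that, under those assumptions: it keeps only non-empty strings, makes a single batched call, and leaves `None` for rows without text. `predict_column` and `fake_predict` are hypothetical names introduced here; `fake_predict` only stands in for `model.predict` so the example runs without the model.

```python
from typing import Callable, Dict, List, Optional, Sequence


def is_usable_text(value: object) -> bool:
    """True only for non-empty strings; NaN (a float) and None are rejected."""
    return isinstance(value, str) and bool(value.strip())


def predict_column(
    values: Sequence[object],
    predict: Callable[[List[str]], Sequence[Dict[str, float]]],
) -> List[Optional[Dict[str, float]]]:
    """Run `predict` once on the valid texts and realign the results."""
    # Remember which rows actually hold text.
    positions = [i for i, value in enumerate(values) if is_usable_text(value)]
    texts = [values[i] for i in positions]
    # One batched call with a list of strings, not one call per cell.
    predictions = predict(texts) if texts else []
    results: List[Optional[Dict[str, float]]] = [None] * len(values)
    for i, prediction in zip(positions, predictions):
        results[i] = prediction
    return results


def fake_predict(sentences: List[str]) -> List[Dict[str, float]]:
    # Hypothetical stand-in for model.predict; returns a fixed label per text.
    return [{"neutral": 1.0} for _ in sentences]


column = ["привет", float("nan"), "", "спасибо"]
labels = predict_column(column, fake_predict)
```

With a real model this would presumably be called as `df['sentiment'] = predict_column(df['prep_text'].tolist(), model.predict)`, assuming `model.predict` accepts a list of strings and returns one result per string in the same order.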