shirosaidev / stocksight

Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
https://shirosaidev.github.io/stocksight/
Apache License 2.0

sentiment.py fails #8

Closed ictinc closed 5 years ago

ictinc commented 5 years ago

Hi there, I would really like to try out your program. I have everything installed as described, but after running: python sentiment.py -k TSLA,'Elon Musk',Musk,Tesla --debug

I receive the following error:

2019-02-02 02:19:54,186 [WARNING][stocksight] Exception: exception caused by:


Resource punkt not found. Please use the NLTK Downloader to obtain the resource:

import nltk nltk.download('punkt')

Attempted to load tokenizers/punkt/english.pickle

Searched in:

Traceback (most recent call last):
  File "sentiment.py", line 796, in <module>
    stream.filter(track=keywords, languages=['en'])
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 453, in filter
    self._start(is_async)
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 368, in _start
    self._run()
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 300, in _run
    six.reraise(*exc_info)
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 269, in _run
    self._read_loop(resp)
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 331, in _read_loop
    self._data(next_status_obj)
  File "/home/user/.local/lib/python2.7/site-packages/tweepy/streaming.py", line 303, in _data
    if self.listener.on_data(data) is False:
  File "sentiment.py", line 127, in on_data
    tokens = nltk.word_tokenize(text_for_tokens)
  File "/home/user/.local/lib/python2.7/site-packages/nltk/tokenize/__init__.py", line 143, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
  File "/home/user/.local/lib/python2.7/site-packages/nltk/tokenize/__init__.py", line 104, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/home/user/.local/lib/python2.7/site-packages/nltk/data.py", line 868, in load
    opened_resource = _open(resource_url)
  File "/home/user/.local/lib/python2.7/site-packages/nltk/data.py", line 993, in _open
    return find(path_, path + ['']).open()
  File "/home/user/.local/lib/python2.7/site-packages/nltk/data.py", line 699, in find
    raise LookupError(resource_not_found)
LookupError:


Resource punkt not found. Please use the NLTK Downloader to obtain the resource:

import nltk nltk.download('punkt')

Attempted to load tokenizers/punkt/english.pickle

Searched in:

I have run pip install nltk and even apt-get install python-numpy python-nltk.

And tried NLTK manually:

Python 2.7.15rc1 (default, Nov 12 2018, 14:31:15)
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> Import nltk
  File "<stdin>", line 1
    Import nltk
              ^
SyntaxError: invalid syntax

I'm not sure what I'm missing here, and I hope you can help me get everything running as it's supposed to. I'm running Ubuntu 18.04 with the latest updates.

Kind regards,
Ronald.

AshyIsMe commented 5 years ago

It looks like you have a capital I in Import nltk.

Try the following manually in the Python interpreter, being careful to keep the letter case exactly the same:

import nltk
nltk.download('punkt')

If that succeeds then you should be able to run sentiment.py as you were trying previously.

(Note: I'm not the author of this project but was testing it recently and hit the same issue)
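For anyone who would rather not run the download by hand, the same fix can be sketched as a small helper that fetches punkt only when NLTK cannot find it. This is a minimal sketch and not part of stocksight: the name ensure_nltk_resource and its find/download parameters are made up for illustration (the parameters exist only so the logic can be exercised without NLTK installed). By default it falls back to nltk.data.find and nltk.download.

```python
def ensure_nltk_resource(resource_path, package, find=None, download=None):
    """Download an NLTK package only if its resource is missing.

    Hypothetical helper, not part of stocksight. Returns True if a
    download was triggered, False if the resource was already present.
    """
    if find is None or download is None:
        # Import lazily so the helper has no hard dependency when both
        # callables are supplied by the caller.
        import nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)  # raises LookupError when the resource is absent
        return False
    except LookupError:
        download(package)
        return True
```

Calling ensure_nltk_resource('tokenizers/punkt', 'punkt') once before starting sentiment.py should have the same effect as the two lines above.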

ictinc commented 5 years ago

Thanks for your reply, this was indeed what was wrong. I had searched Google for ways of installing NLTK and copied and pasted the command from a website, not noticing the capital letter. :(

Unfortunately I do find myself in a new situation with the following error message:

2019-02-02 04:20:37,925 [INFO][stocksight] Writing twitter user ids to text file ./twitteruserids.txt
Traceback (most recent call last):
  File "sentiment.py", line 766, in <module>
    f = open(twitter_users_file, "wt", encoding='utf-8')
TypeError: 'encoding' is an invalid keyword argument for this function

Unfortunately I'm not familiar enough with Python to solve these kinds of errors. I tried deleting the , encoding='utf-8' part, as I thought it might be obsolete, but that in fact creates new problems with parsing the tweets. :)

AshyIsMe commented 5 years ago

Try with python3:

sudo apt install -y python3-pip

pip3 install -r requirements.txt
python3 sentiment.py ...

ictinc commented 5 years ago

Thank you very much for your response. This seems to have solved everything. :)

shirosaidev commented 5 years ago

Thanks @AshyIsMe. @ictinc, Python 3.x is required: https://github.com/shirosaidev/stocksight#requirements
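For context, the TypeError earlier in this thread comes from Python 2, whose built-in open() does not accept an encoding argument, while Python 3's does. A script can fail early with a clearer message instead of hitting that error partway through. The following is a minimal sketch, not taken from sentiment.py; require_python3 is a hypothetical name used only for illustration.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise a readable error when run under Python 2 (hypothetical helper)."""
    if version_info[0] < 3:
        raise RuntimeError(
            "Python 3.x is required, found %d.%d"
            % (version_info[0], version_info[1])
        )


# Check the running interpreter before doing any real work.
require_python3()
```

Placed near the top of a script, a guard like this turns the confusing 'encoding' error into an immediate message about the interpreter version.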