kaustav202 / RealTime-TwitterDataAnalysis

Collect and process real time twitter data plotting various metrics like volume , proportion, sentiment. Analyze tweet node networks and map them geographically.
29 stars, 15 forks

add tweet count logger to stream.py #14 #23

Closed Coder-Manan closed 2 years ago

Coder-Manan commented 2 years ago

Explain the motivation for making this change. What existing problem does the pull request solve?

Test plan (required): No plan

if self.counter % 1000 == 0:
    print("No of tweets currently in tweets.json = ", self.counter)

Code formatting

Closing issues

Fixes #14

Coder-Manan commented 2 years ago

Currently I am printing the number of tweets after every thousand writes

kaustav202 commented 2 years ago

Also add logging at a smaller count, like 50. It takes a long time to reach 1000 tweets.

Coder-Manan commented 2 years ago

Fixed it
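
For reference, the interval-based logging discussed above could be sketched roughly as follows. This is not the code from the PR: the `TweetCounter` class, the `record_write` method, and the `log_every` parameter are hypothetical names used only for illustration. The `self.counter` attribute and the message text come from the snippet in the PR description.

```python
class TweetCounter:
    """Counts written tweets and reports progress every `log_every` writes.

    Hypothetical helper: stream.py may structure this differently.
    """

    def __init__(self, log_every=50):
        if log_every <= 0:
            raise ValueError("log_every must be positive")
        self.log_every = log_every
        self.counter = 0

    def record_write(self):
        """Call once after each tweet is written.

        Returns the logged message, or None if nothing was logged.
        """
        self.counter += 1
        if self.counter % self.log_every == 0:
            message = f"No of tweets currently in tweets.json = {self.counter}"
            print(message)
            return message
        return None


if __name__ == "__main__":
    counter = TweetCounter(log_every=50)
    for _ in range(120):
        counter.record_write()  # logs at 50 and at 100
```

Making the interval a parameter means a small value such as 50 can be used while testing and a larger one for long-running streams.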

Coder-Manan commented 2 years ago

I am getting the following error when trying to run the app

Traceback (most recent call last):
  File "C:\Python310\lib\site-packages\pandas\core\indexes\base.py", line 3800, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'text'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\xxxx\xxxx\RealTime-TwitterDataAnalysis\app\main.py", line 29, in <module>
    python = check_word_in_tweet('#google', ds_tweets)
  File "D:\xxxx\xxxx\RealTime-TwitterDataAnalysis\app\main.py", line 16, in check_word_in_tweet
    contains_column = data['text'].str.contains(word, case = False)
  File "C:\Python310\lib\site-packages\pandas\core\frame.py", line 3805, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Python310\lib\site-packages\pandas\core\indexes\base.py", line 3802, in get_loc
    raise KeyError(key) from err
KeyError: 'text'

kaustav202 commented 2 years ago

Check your tweets.json file; most likely it has no data. You need to run stream.py first, then run main.py after a gap so that some tweets have been written.
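
A quick way to confirm this before running main.py might look like the sketch below. It assumes tweets.json holds one JSON object per line, which is an assumption about the file layout and not something confirmed in this thread. The `count_tweets_with_text` function is a hypothetical helper, not part of the repository.

```python
import json
import os


def count_tweets_with_text(path):
    """Return how many records in `path` carry a 'text' field.

    Assumes one JSON object per line (an assumption about tweets.json).
    Returns 0 if the file is missing or empty.
    """
    if not os.path.exists(path):
        return 0
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not complete JSON yet
            if isinstance(record, dict) and "text" in record:
                count += 1
    return count


if __name__ == "__main__":
    found = count_tweets_with_text("tweets.json")
    if found == 0:
        print("tweets.json has no usable tweets yet; run stream.py and wait.")
    else:
        print(f"{found} tweets with a 'text' field are available.")
```

If the count is zero, the `KeyError: 'text'` in the traceback is the expected result, since the loaded data would then have no `text` column to index.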

kaustav202 commented 2 years ago

Also fetch the latest changes from upstream before proceeding.

Coder-Manan commented 2 years ago

Ok. I will look into it today

Coder-Manan commented 2 years ago

Please check whether logging is happening now. I have changed the location where I print the counter.

Coder-Manan commented 2 years ago

@kaustav202