Hold-Krykke / PythonExam

4th Semester Python Exam Project

4th Semester Python Exam

Made by:

Purpose of the program

We would like to delve deeper into text analysis and web scraping.

We scrape data from Twitter, based on hashtag searches, and use different techniques to clean, analyze and present the data.

Example tweets to perform sentiment analysis on could be:

Technologies

Things that we didn't implement but would have liked to:

Using the program

  1. Clone the repo and follow the instructions in setup.ipynb

Note: Not all plots work with all data. A few cases might result in bad output.

Using the program with Flask

Starting the server

Using the endpoint

The server exposes a single endpoint, /api/sentiment, which handles all requests. The server has not been deployed, so test it locally at http://localhost:5000/api/sentiment. There is no UI for the server, so every request has to be made with Postman or a similar tool. (The examples shown are from Postman.)

Explanation of search options

Data gathering

Data filtering

Overall Recommendation

JSON:

{
  "hashtags": ["trump", "biden"],
  "start_date": "2020-5-12",
  "remove_sentiment": "Uncertain",
  "end_date": "2020-5-22",
  "plot_type": "line",
  "tweet_amount": 300
}
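For readers who prefer a script over Postman, the request above could be sent along these lines. This is a minimal sketch using only the Python standard library. It assumes the endpoint accepts a POST with a JSON body, which the text above does not state, and the helper names `build_request` and `send` are hypothetical. The response format is not documented here, so the sketch returns the raw bytes without interpreting them.

```python
import json
import urllib.request

# URL taken from the section above; the server must be running locally.
API_URL = "http://localhost:5000/api/sentiment"


def build_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON request for the sentiment endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated in the README
    )


def send(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


# Same payload as the JSON example above.
payload = {
    "hashtags": ["trump", "biden"],
    "start_date": "2020-5-12",
    "remove_sentiment": "Uncertain",
    "end_date": "2020-5-22",
    "plot_type": "line",
    "tweet_amount": 300,
}

request = build_request(payload)
# With the server running, uncomment the next line to perform the call:
# raw_response = send(request)
```

Building the request separately from sending it makes it possible to inspect the payload and headers without a running server.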


Using the program with CLI

  1. In the root folder, run python app.py -h to print the help output:

Default values

All optional arguments have default values.
To run the program with every default, pass only the hashtags you want to gather data for.

Examples

Utilizing default values to search for the hashtags #trump and #biden:
python app.py trump biden
This would run the program using the following values:

{'certainty_high': 0.75,
 'certainty_low': 0.25,
 'date': [datetime.date(2020, 5, 22),
          datetime.date(2020, 5, 27)],
 'fresh_search': False,
 'hashtags': ['trump', 'biden'],
 'plot_type': 'pie',
 'remove_sentiment': None,
 'save_plot': False,
 'search_hashtags': None,
 'search_mentions': None,
 'search_urls': None,
 'tweet_count': 300}

By default, the date range is set to the current day + 5 days, as in the date values shown above.

Changing the plot type and filtering on dates (hashtags omitted for brevity):
python app.py -p bar -d 2020-06-01 2020-06-02
or
python app.py --plot bar --date 2020-06-01 2020-06-02

Searching for a specific number of tweets (1000) and saving the generated plots locally (hashtags omitted for brevity):
python app.py -s -c 1000
or
python app.py --save --count 1000
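To illustrate how the flags in these examples could map onto the values printed under "Default values", here is a minimal argparse sketch. It is not the project's actual parser: only the flags shown above (-p/--plot, -d/--date, -s/--save, -c/--count) and the positional hashtags are covered, the remaining options are left out, and the `parse_date` and `build_parser` names are hypothetical. The date default is left as None because the exact default range is not reproduced here.

```python
import argparse
import datetime


def parse_date(value: str) -> datetime.date:
    """Convert a YYYY-MM-DD string into a date object."""
    return datetime.datetime.strptime(value, "%Y-%m-%d").date()


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in the examples."""
    parser = argparse.ArgumentParser(prog="app.py")
    # One or more hashtags, passed without the leading '#'.
    parser.add_argument("hashtags", nargs="+")
    parser.add_argument("-p", "--plot", dest="plot_type", default="pie")
    # Two dates: start and end of the range.
    parser.add_argument(
        "-d", "--date", nargs=2, type=parse_date,
        metavar=("START", "END"), default=None,
    )
    parser.add_argument("-s", "--save", dest="save_plot", action="store_true")
    parser.add_argument("-c", "--count", dest="tweet_count", type=int, default=300)
    return parser


# Equivalent to: python app.py trump biden -p bar -d 2020-06-01 2020-06-02 -s -c 1000
args = build_parser().parse_args(
    ["trump", "biden", "-p", "bar", "-d", "2020-06-01", "2020-06-02", "-s", "-c", "1000"]
)

# Equivalent to: python app.py trump biden
defaults = build_parser().parse_args(["trump", "biden"])
```

Using `dest` keeps the short command-line flags while exposing the longer attribute names (plot_type, save_plot, tweet_count) that appear in the printed defaults.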