t-hoarder_kit

Set of basic tools to extract data from the Twitter API and visualize graphs

Installation

Cloning with git is recommended, so that the kit can easily be kept up to date:

git clone https://github.com/congosto/t-hoarder_kit

Dependencies: tweepy

Requires Python 2.7.12 or a later 2.7.x release. Python 3.x is not supported.
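
The version requirement above can be checked before running any script. The following is a minimal sketch, not part of the kit: the function name `is_supported` and the constants are hypothetical, and the only facts taken from this README are the lower bound (2.7.12) and the exclusion of Python 3.x.

```python
import sys

# Bounds taken from this README; the names are hypothetical.
MIN_VERSION = (2, 7, 12)   # oldest release the README names
SUPPORTED_MAJOR = 2        # Python 3.x is stated as unsupported


def is_supported(version_info):
    """Return True if a (major, minor, micro) tuple falls in the supported range."""
    version = tuple(version_info[:3])
    return version[0] == SUPPORTED_MAJOR and version >= MIN_VERSION


# Example checks against fixed version tuples.
print(is_supported((2, 7, 12)))  # True
print(is_supported((3, 8, 0)))   # False
```

To check the running interpreter, `is_supported(sys.version_info)` can be called at the top of a script.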


As an alternative, this Dropbox Virtual Machine with all software installed can be used. (The VM is almost 4 GB in size, so downloading it over a high-speed connection is recommended.)

If the VM is used, reading this use case is recommended: How to use t-hoarder_kit with the Virtual Machine. It includes how to analyze network relationships with Gephi.


Data environment

T-hoarder kit works on the following directory structure:

 T-hoarder_kit --+-- keys (app keys and user access tokens for OAuth authentication with the API)
                 |
                 +-- scripts (Python scripts)
                 |
                 +-- resources (information needed to process the data)
                 |
                 +-- store --+-- experiment-1 (one directory per experiment, so that data is not mixed)
                             |
                             +-- experiment-2
                             ....
                             |
                             +-- experiment-n

The kit assumes that the access keys are in the keys directory and that the results are written to the corresponding store/experiment directory.
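
As an illustration of this layout, the sketch below resolves the keys directory and the results directory for one experiment and creates the latter if it is missing. It is not part of the kit: the function name `prepare_experiment` and its parameters are hypothetical, and only the `keys` and `store/<experiment>` directory names come from the structure described above.

```python
import os


def prepare_experiment(kit_root, experiment):
    """Return the keys and results directories for one experiment.

    kit_root is the T-hoarder_kit directory; experiment is a directory
    name under store/. The results directory is created if it is missing.
    """
    keys_dir = os.path.join(kit_root, "keys")
    experiment_dir = os.path.join(kit_root, "store", experiment)
    if not os.path.isdir(experiment_dir):
        os.makedirs(experiment_dir)
    return {"keys": keys_dir, "experiment": experiment_dir}
```

Calling `prepare_experiment("t-hoarder_kit", "experiment-1")` from the directory that contains the clone would return both paths and make sure `t-hoarder_kit/store/experiment-1` exists.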

Execution Environments

Windows:

Linux

t_hoarder_kit.bat (Windows) and t_hoarder_kit.sh (Linux) provide the following menu for accessing the Python scripts:

  1. Get a user token access
  2. Get users information (profile | followers | following | relations | tweets | h_index)
  3. Make a query on Twitter
  4. Get tweets in real time
  5. Generate the declared relations graph (followers or following or both)
  6. Generate the dynamic relations graph (RT | reply | mentions)
  7. Processing tweets (sort | entities | classify | users | spread)
  8. Exit

For more information, visit the wiki

When the Python scripts are run directly from the command line, the keys and the results can be placed wherever you want:

  tweet_auth.py [-h] keys_app user

  tweet_rest.py [-h] [--id_user] [--fast]
                 (--profile | --followers | --following | --relations | --connections | --tweets | --h_index)
                 keys_app keys_user file_users

  tweet_search.py [-h] [--query QUERY] [--file_out FILE_OUT]
                   [--format FORMAT]
                   keys_app keys_user

  tweet_streaming.py [-h] [--users USERS] [--words WORDS]
                      [--locations LOCATIONS]
                      app_keys user_keys dir_out file_dest

  tweets_grafo.py [-h] [--top_size TOP_SIZE] (--RT | --mention | --reply)
                   file_in

  tweets_entity.py [-h] [--top_size TOP_SIZE] [--TZ TZ]
                    file_in path_experiment path_resources

  tweets_classify.py [-h] file_in file_topics path_experiment

  users_types.py [-h] file_in path_experiment

  tweets_spread.py [-h] [--top_size TOP_SIZE] [--TZ TZ]
                    file_in path_experiment

  user_klout.py [-h] file_users APIkey
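
The synopses above follow the notation that Python's argparse module prints: square brackets mark optional arguments, an option without a value name is an on/off flag, and a parenthesized group separated by `|` means exactly one of the options must be given. The sketch below rebuilds a parser that matches the tweet_rest.py synopsis to show how that notation reads. It is a reconstruction from the usage line only, not the author's code, and the file names in the example call are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser matching the tweet_rest.py synopsis shown above."""
    parser = argparse.ArgumentParser(prog="tweet_rest.py")
    # Optional on/off flags: [--id_user] [--fast]
    parser.add_argument("--id_user", action="store_true")
    parser.add_argument("--fast", action="store_true")
    # Exactly one action is required: (--profile | --followers | ...)
    action = parser.add_mutually_exclusive_group(required=True)
    for name in ("profile", "followers", "following", "relations",
                 "connections", "tweets", "h_index"):
        action.add_argument("--" + name, action="store_true")
    # Positional arguments, in the order given by the synopsis
    parser.add_argument("keys_app")
    parser.add_argument("keys_user")
    parser.add_argument("file_users")
    return parser


# Hypothetical file names, for illustration only.
args = build_parser().parse_args(
    ["--followers", "app.key", "user.key", "users.txt"])
print("followers=%s file_users=%s" % (args.followers, args.file_users))
```

With a parser of this shape, passing two actions at once (for example `--followers --profile`) or none at all is rejected with a usage error.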