abjer / sds2019

Social Data Science 2019 - a summer school course
https://abjer.github.io/sds2019

Snorre's "Log" implementation API + Analysis of datalog #39

Open Myskovgaard opened 5 years ago

Myskovgaard commented 5 years ago

Hi Everybody,

We have two questions in regards to Snorre's datalog file:

  1. Logging file with API: We have tried to implement the 'logging code' from lecture 8 in our function that extracts tweets, but it does not save anything to the file. Our function goes through Twitter's API to extract the data rather than requesting a URL directly. We do not use the requests module; we use tweepy:
    # authenticate with the Twitter API through tweepy
    import tweepy
    auth = tweepy.OAuthHandler(consumerKey, consumerSecret)
    auth.set_access_token(accessToken, accessTokenSecret)
    api = tweepy.API(auth)
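For illustration, the extraction itself then happens through tweepy calls roughly like the one below (the query and count are placeholders, not our actual search):

    # placeholder example of the kind of call our function makes through tweepy
    tweets = api.search(q='example query', count=100)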

Maybe you can give us an idea of how to implement it correctly?

  2. Describing the data (maybe this one is more directed at Snorre): We understand that the log file records errors, timestamps, etc. How do you want us to analyze this data?

Kindest regards, Group 10

snorreralund commented 5 years ago
  1. If you are using tweepy, you have to create your own log (a minimal sketch of one way to do this follows after the Connector example below). If you want to use the Connector, you need to write your own queries, e.g.:

    
    # import authorization package
    from requests_oauthlib import OAuth1

    # load credentials
    import pickle
    consumer_key, consumer_secret, oauth_token, oauth_token_secret = pickle.load(open('/path/to/twitter_credentials.pkl', 'rb'))

    # initialize authorization mechanism
    auth = OAuth1(consumer_key, consumer_secret, oauth_token, oauth_token_secret)

    # search tweets example query
    url = 'https://api.twitter.com/1.1/search/tweets.json?q=foucault&geocode=55.676098,12.568337,5km'

    # user timeline tweets example query
    name = 'realdonaldtrump'
    q = 'https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=realdonaldtrump&count=200&tweet_mode=extended'

    # using the Connector with authorization
    call = {'url': q, 'auth': auth}  # define arguments to the requests.get method
    response, call_id = connector.get({'url': q, 'auth': auth}, 'twittertest_auth')
    data = response.json()
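For the tweepy route mentioned above, here is a minimal sketch of what "creating your own log" could look like. It is only illustrative: the log-file name, the logged fields, and the `log_call` helper are just examples, not part of the course Connector.

    # minimal hand-rolled log for tweepy calls (illustrative sketch)
    import time

    log_file = 'tweepy_log.csv'  # example file name

    def log_call(query, n_results, error=''):
        # append one row per API call: timestamp, query, number of tweets, error message
        with open(log_file, 'a') as f:
            f.write(f'{time.time()};{query};{n_results};{error}\n')

    # example usage around a tweepy search call (assumes an authenticated `api` object)
    query = 'foucault'
    try:
        tweets = api.search(q=query, count=100)
        log_call(query, len(tweets))
    except Exception as e:
        log_call(query, 0, error=str(e))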


2. I will create a little post about how you should analyze the Log later today.

snorreralund commented 5 years ago

The pip package is not updated yet, so until then you should use the following Connector: https://github.com/snorreralund/scraping_seminar/blob/master/logging_requests.py
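One way to get that file into the folder next to your notebook and use it is sketched below. Note that I am assuming here that the Connector class is initialized with the name of a log file, as in the lecture 8 code; the file name `twitter_log.csv` is just an example.

    # download the module next to your notebook and import the Connector class
    import requests
    raw_url = 'https://raw.githubusercontent.com/snorreralund/scraping_seminar/master/logging_requests.py'
    with open('logging_requests.py', 'w') as f:
        f.write(requests.get(raw_url).text)

    from logging_requests import Connector
    connector = Connector('twitter_log.csv')  # assumed: constructor takes the log-file name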

snorreralund commented 5 years ago

See issue #41.