KaiDMML / FakeNewsNet

This is a dataset for fake news detection research

Downloads the news.json, but not the other entities such as tweets, retweets etc. #16

Open anubhakabra opened 5 years ago

anubhakabra commented 5 years ago

The code downloads news.json correctly. However, it is unable to retrieve tweets, retweets, etc. according to the hierarchy shown. The Twitter keys were generated and used as instructed.

sivacharanreddy commented 5 years ago

I'm facing the same issue

anubhakabra commented 5 years ago

If you find any solution, do let me know.

Regards

sivacharanreddy commented 5 years ago

In my case, the tweets are not being retrieved because of this exception: twython.exceptions.TwythonAuthError: Twitter API returned a 400 (Bad Request), Bad Authentication data. I am trying to fix it. Please check your logs to figure out the issue on your side.

mdepak commented 5 years ago

@sivacharanreddy This may be caused by a mistake in how the Twitter keys are provided as input. Please provide the Twitter keys in tweet_keys_file.txt in the following format:

app_key,app_secret,oauth_token,oauth_token_secret
xxx1,xxx2,xxx3,xx4

This is a comma-separated file in which the first line is the header and the remaining lines contain the keys. If you have multiple keys, please provide them on separate lines.

sivacharanreddy commented 5 years ago

@mdepak Many thanks for the response. I have provided them in the mentioned order and also checked those keys by using tweepy to invoke the Twitter search API. Please confirm if this is correct: I am assuming access_token == oauth_token and access_token_secret == oauth_token_secret.

mdepak commented 5 years ago

@sivacharanreddy The code uses the Twython library (https://twython.readthedocs.io/en/latest/api.html) to collect tweets. Currently, all 4 parameters are passed to the Twython API to authenticate. Please use all the keys from the pic to crawl, and I will modify the code later to take OAuth1 or OAuth2 as a configuration option.
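
A minimal sketch of how the four values in one key-file row could be handed to Twython under OAuth1 user authentication. The helper names `twython_kwargs` and `check_keys` are hypothetical and are not part of FakeNewsNet. In Twython, `oauth_token` and `oauth_token_secret` correspond to what the Twitter developer page labels Access Token and Access Token Secret, while Twython's separate `access_token` argument is reserved for the OAuth2 bearer token.

```python
def twython_kwargs(row):
    """Map one key-file row (a dict) to Twython's OAuth1 keyword arguments."""
    fields = ("app_key", "app_secret", "oauth_token", "oauth_token_secret")
    return {name: row[name] for name in fields}


def check_keys(row):
    """Return True if Twitter accepts the credentials in `row` (needs network)."""
    # Imported lazily so twython_kwargs works without Twython installed.
    from twython import Twython
    from twython.exceptions import TwythonAuthError

    try:
        # verify_credentials() calls account/verify_credentials.
        Twython(**twython_kwargs(row)).verify_credentials()
        return True
    except TwythonAuthError:
        return False
```

Running `check_keys` on each row before starting a crawl would surface a bad key set immediately rather than partway through tweet collection.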

[Image: twitter_keys]

sivacharanreddy commented 5 years ago

@mdepak I have read the code and understood the overall workflow. Thanks for the well-commented code. I did use the keys as shown in the screenshot. I see that the code sets connection_mode=1 by default, which does OAuth1 (user auth) authentication. In spite of doing everything as specified, I see this exception.

2019-06-08 03:08:48,902 150317 tweet_collection ERROR exception in collecting tweet objects
Traceback (most recent call last):
  File "/home/xxxx/FakeNewsNet-master/code/tweet_collection.py", line 25, in dump_tweet_information
    tweet_object = twython_connector.get_twython_connection(Constants.GET_TWEET).show_status(id=tweet.tweet_id)
  File "/home/xxxx/.local/lib/python3.6/site-packages/twython/endpoints.py", line 94, in show_status
    return self.get('statuses/show/%s' % params.get('id'), params=params)
  File "/home/xxxx/.local/lib/python3.6/site-packages/twython/api.py", line 270, in get
    return self.request(endpoint, params=params, version=version)
  File "/home/xxxx/.local/lib/python3.6/site-packages/twython/api.py", line 264, in request
    api_call=url)
  File "/home/xxxx/.local/lib/python3.6/site-packages/twython/api.py", line 199, in _request
    retry_after=response.headers.get('X-Rate-Limit-Reset'))
twython.exceptions.TwythonAuthError: Twitter API returned a 400 (Bad Request), Bad Authentication data.

sivacharanreddy commented 5 years ago

@mdepak I had a space after each comma when providing the keys in 'tweet_keys_file.txt', and that was the reason behind the exception. It's now working. Thanks!
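
The whitespace problem described here can be avoided by trimming each field when the key file is read. The following is a minimal sketch, assuming the header and column layout given earlier in this thread. `load_twitter_keys` is a hypothetical helper, not FakeNewsNet's own loader.

```python
import csv


def load_twitter_keys(path):
    """Read the key file, trimming whitespace around headers and values."""
    keys = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            # Without trimming, "xxx1, xxx2" yields the value " xxx2".
            cleaned = {
                name.strip(): value.strip()
                for name, value in row.items()
                if name is not None and value is not None
            }
            if any(cleaned.values()):
                keys.append(cleaned)
    return keys
```

Each returned dict holds one key set, so a file with several key lines yields several entries.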

mdepak commented 5 years ago

@anubhakabra Can you provide more details like logs to further investigate this issue?

anubhakabra commented 5 years ago

Twitter API returned a 401 (Unauthorized), Could not authenticate you.

  File "/home/anubha/anaconda3/envs/venv/lib/python3.6/site-packages/twython/api.py", line 199, in _request
    retry_after=response.headers.get('X-Rate-Limit-Reset'))
twython.exceptions.TwythonAuthError: Twitter API returned a 401 (Unauthorized), Could not authenticate you.
2019-09-10 20:32:42,525 3817 tweet_collection ERROR exception in collecting tweet objects
Traceback (most recent call last):
  File "/media/anubha/Elements/FakeNewsNet-master/code/tweet_collection.py", line 35, in dump_tweet_information
    map=True)['id']
  File "/home/anubha/anaconda3/envs/venv/lib/python3.6/site-packages/twython/endpoints.py", line 105, in lookup_status
    return self.post('statuses/lookup', params=params)
  File "/home/anubha/anaconda3/envs/venv/lib/python3.6/site-packages/twython/api.py", line 274, in post
    return self.request(endpoint, 'POST', params=params, version=version)
  File "/home/anubha/anaconda3/envs/venv/lib/python3.6/site-packages/twython/api.py", line 264, in request
    api_call=url)
  File "/home/anubha/anaconda3/envs/venv/lib/python3.6/site-packages/twython/api.py", line 199, in _request
    retry_after=response.headers.get('X-Rate-Limit-Reset'))
twython.exceptions.TwythonAuthError: Twitter API returned a 401 (Unauthorized), Could not authenticate you.

This is the error. I have used Tweepy and Twython to generate the OAuth keys (https://github.com/ryanmcgrath/twython).

anubhakabra commented 5 years ago

The data collection .out file shows:

Resource id : 0

The keys are authorized and are separated by commas on the second line of the file. The keys are not enclosed in quotes (' '). Should I be putting them within quotes?