Open ChaitanyaBaweja opened 4 years ago
Hi,
That conversion was done for privacy by the original publishers of the dataset, so we can't recover the real handle from it.
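If you only need identifiers of the same shape for your own data, one option is a salted, truncated hash. This is a minimal sketch and an assumption on my part, not the original publishers' method (which I don't know): the function name `pseudonymize`, the salt, and the choice of SHA-256 are all illustrative.

```python
import hashlib


def pseudonymize(user_id, salt="change-me"):
    """Map a Twitter user id to an opaque label such as USER_1a2b3c4d.

    Assumption: this is NOT how the original dataset was anonymized;
    it only produces identifiers with the same visual format.
    """
    # Salt the id so the mapping cannot be rebuilt without the secret salt.
    raw = (salt + str(user_id)).encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()
    # Keep 8 hex characters to match the USER_xxxxxxxx shape.
    return "USER_" + digest[:8]
```

Eight hex characters give only 32 bits, so collisions become likely once you have tens of thousands of users; keep more of the digest if you need the labels to be unique.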
The datasets are quite old, so I'm not sure how much you'll get from them. You can collect fresh data for the country you are interested in using this repo: https://github.com/afshinrahimi/twitter-fetcher. Set the search criteria in tweepy to the bounding box of that country so that you download many geolocated tweets. Then, for each user in the downloaded tweets, download their timeline and use the location as the label. Finally, once you have enough users with locations, build your dataset.
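The labelling step could look roughly like the sketch below. It does not call tweepy; it assumes you have already downloaded the tweets and flattened them into dicts with the hypothetical keys `user_id`, `lon`, and `lat`. Using the median coordinate as the user's label is also my assumption here, one simple choice among several.

```python
from collections import defaultdict
from statistics import median


def in_bbox(lon, lat, bbox):
    """Return True if the point lies inside bbox = (west, south, east, north)."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def label_users(tweets, bbox, min_tweets=1):
    """Assign each user one (lon, lat) label from their geolocated tweets.

    tweets: iterable of dicts with hypothetical keys user_id, lon, lat.
    Only tweets inside bbox count; users with fewer than min_tweets
    such tweets are dropped.
    """
    points = defaultdict(list)
    for tweet in tweets:
        if in_bbox(tweet["lon"], tweet["lat"], bbox):
            points[tweet["user_id"]].append((tweet["lon"], tweet["lat"]))

    labels = {}
    for user, pts in points.items():
        if len(pts) >= min_tweets:
            # Median per axis is robust to a few tweets sent while travelling.
            labels[user] = (median(p[0] for p in pts), median(p[1] for p in pts))
    return labels
```

For example, with a rough bounding box for Australia such as `(112.0, -44.0, 154.0, -10.0)`, users who only tweeted from outside that box are left out of the result.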
Your data uses these identifiers for users: USER_ee551c6c.
These don't correspond to the type of IDs you get from Twitter. How do you convert Twitter IDs into this format? I am asking because I need to add my own data to your dataset and would like to use the same conversion for it.