Closed jwijffels closed 6 years ago
I could provide you with a vector of status_id's that you could use as a data set. Twitter's TOS says not to share the data beyond the ID. But with the IDs, users could look up the data via rtweet::lookup_tweets(). Would tweets from verified vs non-verified users work? Or do you need more response options or a more continuous outcome (like fav/retweet counts)?
Thanks for the feedback. I wasn't aware of the Twitter TOS. That will be a blocking factor. I was basically looking for a ready-made dataset containing tweets so that I don't need to let the R package depend on another package. I'll look for another example then. Thank you for your input either way!
Perhaps anonymize the data? Or maybe tweets from public figures, e.g., predicting whether a Trump tweet was sent from iPhone, Android, and/or other; predicting the partisanship of the account tweeter; etc.?
Yes, that's exactly what is possible with that package, but I would need a CC0 dataset in .RData format so that I can easily include it in the R package.
Hi @mkearney, my apologies if this is not the right place to ask this question. I'm developing an R wrapper around Starspace here: https://github.com/bnosac/ruimtehol. To build an example of a classification model, I was thinking of using tweets and categorising the hashtag of each tweet. Do you happen to have a .RData file containing tweets that could be incorporated into that package, or do you by any chance know of a place where I can find such data?