digitalmethodsinitiative / dmi-tcat

Digital Methods Initiative - Twitter Capture and Analysis Toolset
Apache License 2.0

GNIP script #36

Closed joelgombin closed 10 years ago

joelgombin commented 10 years ago

Hello,

I'd very much like to try your GNIP import script. Thanks!

ErikBorra commented 10 years ago

Hi Joel,

I verified and committed the GNIP import script in commit https://github.com/digitalmethodsinitiative/dmi-tcat/commit/b31d2606cf1ad3b07ae2967b822b8472d5188b5b

To use the script, edit dmi-tcat/import/import-gnip.php and specify a $bin_name and the $dir in which your GNIP JSON files reside. Save the changes and run php import-gnip.php from the command line. Keep in mind that GNIP does not seem to provide the in_reply_to_status_id field and does not always provide a retweet_id field. Some DMI-TCAT analyses/exports might thus not yield results.
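Before importing, it may help to check how often the fields mentioned above actually occur in your files. The following is a minimal Python sketch, not part of DMI-TCAT. It assumes one JSON object per line, which may not match your GNIP export. The function name count_present_fields and the sample records are made up for illustration, and the field names are taken from this thread, so they may not be the literal keys in your data.

```python
import json


def count_present_fields(lines, fields):
    """Count how many JSON records carry each of the given top-level fields.

    Assumption: each non-empty line holds one JSON object. Blank lines and
    lines that do not parse as a JSON object are skipped.
    """
    counts = {field: 0 for field in fields}
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        total += 1
        for field in fields:
            # A field counts as present only if it exists and is not null.
            if record.get(field) is not None:
                counts[field] += 1
    return total, counts


# Made-up sample records for illustration only, not real GNIP output.
sample = [
    '{"id": "a1", "body": "hello"}',
    '{"id": "a2", "body": "@x hi", "inReplyTo": {"link": "placeholder"}}',
    '',
    'not json',
]

total, counts = count_present_fields(
    sample, ["in_reply_to_status_id", "inReplyTo", "retweet_id"]
)
print(total, counts)
```

To run it on real data, pass an open file handle for each file in your $dir instead of the sample list. A field with a count of zero suggests that analyses depending on it will come back empty.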

Best,

Erik

joelgombin commented 10 years ago

Thanks a lot Erik! Indeed, there is no in_reply_to_status_id field, but it could actually be inferred from the inReplyTo field by extracting the user ID. Maybe overkill though?

ErikBorra commented 10 years ago

Hi Joel,

I chose not to do so as the in_reply_to_status_id field is supposed to be a tweet id, and not a user id. Mentions are already stored in the _mentions table in DMI-TCAT.

Best,

Erik