dogsheep / twitter-to-sqlite

Save data from Twitter to a SQLite database
Apache License 2.0

Command for running a search and saving tweets for that search #3

Closed simonw closed 5 years ago

simonw commented 5 years ago

$ twitter-to-sqlite search dogsheep

simonw commented 5 years ago

https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

simonw commented 5 years ago

Just importing tweets here isn't enough - how are we supposed to know which tweets were imported by which search?

So I think the right thing to do here is to also create a search_runs table, which records each individual run of this tool (with a timestamp and the search terms used). Then have a search_runs_tweets many-to-many (m2m) table that records which tweets were found by each run.
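
A rough sketch of what those two tables could look like, using Python's built-in sqlite3 module. Only the table names search_runs and search_runs_tweets come from the comment above; the column names (args, started_at, search_run, tweet) and the record_search_run helper are assumptions made up for illustration, not the schema the tool actually ships.

```python
import datetime
import json
import sqlite3


def record_search_run(conn, search_args, tweet_ids):
    """Record one run of a search and link it to the tweets it found.

    Hypothetical helper: the column names here are illustrative assumptions.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS search_runs (
            id INTEGER PRIMARY KEY,
            args TEXT,        -- JSON of the search arguments used
            started_at TEXT   -- ISO 8601 timestamp of the run
        );
        CREATE TABLE IF NOT EXISTS search_runs_tweets (
            search_run INTEGER REFERENCES search_runs(id),
            tweet INTEGER,    -- ID of a tweet found by that run
            PRIMARY KEY (search_run, tweet)
        );
        """
    )
    started_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO search_runs (args, started_at) VALUES (?, ?)",
        (json.dumps(search_args, sort_keys=True), started_at),
    )
    run_id = cursor.lastrowid
    # One m2m row per (run, tweet) pair; a tweet may belong to many runs.
    conn.executemany(
        "INSERT OR IGNORE INTO search_runs_tweets (search_run, tweet) VALUES (?, ?)",
        [(run_id, tweet_id) for tweet_id in tweet_ids],
    )
    conn.commit()
    return run_id
```

With this shape, the same tweet can show up under several runs, and a join from search_runs through search_runs_tweets answers "which search imported this tweet?".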

simonw commented 5 years ago

I have a working command now. I'm going to ship it early because it could do with some other people trying it out.

simonw commented 5 years ago

It would be neat if this could support --since, with that argument automatically finding the maximum tweet ID from a previous search that used exactly the same arguments (using the search_runs table).

simonw commented 5 years ago

I'm going to add a hash column to search_runs to support that. It's going to be the SHA-1 hash of the key-ordered JSON of the search arguments used by that run. Then --since can look for an earlier run with an identical hash and use the highest tweet ID fetched by that run as since_id.
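
A minimal sketch of that idea, assuming the search arguments are held in a plain dict. The helper names search_args_hash and find_since_id are hypothetical, as are the search_run and tweet columns on search_runs_tweets; only the hash column on search_runs and the "SHA-1 of key-ordered JSON" approach come from the comment above.

```python
import hashlib
import json
import sqlite3


def search_args_hash(search_args):
    """SHA-1 hex digest of the key-ordered JSON of the search arguments."""
    # sort_keys=True makes the JSON independent of dict insertion order,
    # so identical arguments always produce an identical hash.
    key_ordered = json.dumps(search_args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(key_ordered.encode("utf8")).hexdigest()


def find_since_id(conn, search_args):
    """Return the highest tweet ID found by earlier runs with the same hash.

    Returns None if no previous run used these exact arguments.
    Assumes search_runs(id, hash) and search_runs_tweets(search_run, tweet).
    """
    row = conn.execute(
        """
        SELECT max(search_runs_tweets.tweet)
        FROM search_runs
        JOIN search_runs_tweets
          ON search_runs_tweets.search_run = search_runs.id
        WHERE search_runs.hash = ?
        """,
        (search_args_hash(search_args),),
    ).fetchone()
    return row[0]
```

Because the keys are sorted before hashing, {"q": "dogsheep", "lang": "en"} and {"lang": "en", "q": "dogsheep"} hash to the same value, while any change to an argument gives a different hash and therefore no since_id.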

simonw commented 5 years ago

Documented here: https://github.com/dogsheep/twitter-to-sqlite/blob/801c0c2daf17d8abce9dcb5d8d610410e7e25dbe/README.md#running-searches