TeMU-BSC / iberifier


Data collection per day #55

Open Oliph opened 1 year ago

Oliph commented 1 year ago

To download data from Twitter and MyNews, the pipeline works as follows:

  1. Check whether the keywords collection contains a record without the key search_twitter_key (or search_mynews_key)
  2. If there are results, check the date key and whether date < today - days_after. Since the data collection needs to cover x days before the fact-check date and x days after it, the pipeline has to wait long enough for the x days after the fact-check to have elapsed.
  3. If true, run one query per day, from x days before the claim to x days after the claim
  4. Record the data
  5. Set the search_twitter_key in the keywords collection to the date of data collection
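The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline code: the record layout (a dict with `date` and `keywords`), the constants `DAYS_BEFORE`/`DAYS_AFTER` (the x above), and the `search_fn` callback standing in for the Twitter/MyNews query are all assumptions.

```python
from datetime import date, timedelta

DAYS_BEFORE = 7  # hypothetical x: days searched before the fact-check date
DAYS_AFTER = 7   # hypothetical x: days searched after the fact-check date

def is_ready(record: dict, today: date) -> bool:
    """Steps 1-2: a record qualifies if it has no search_twitter_key yet
    and enough days have passed to cover the days_after window."""
    if "search_twitter_key" in record:
        return False
    return record["date"] < today - timedelta(days=DAYS_AFTER)

def query_dates(claim_date: date) -> list[date]:
    """Step 3: one query per day, from x days before to x days after the claim."""
    start = claim_date - timedelta(days=DAYS_BEFORE)
    end = claim_date + timedelta(days=DAYS_AFTER)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]

def process(record: dict, today: date, search_fn) -> dict:
    """Steps 3-5: run the daily queries, store results, mark the record done."""
    if not is_ready(record, today):
        return record
    for day in query_dates(record["date"]):
        search_fn(record["keywords"], day)   # step 4: record the data
    record["search_twitter_key"] = today     # step 5: mark as collected
    return record
```

Because `process` only marks the record at the very end, a crash mid-run leaves `search_twitter_key` unset and the whole record is retried on the next run, which is exactly the rerun-safety property described below.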

The advantage is that the pipeline can rerun after a crash without missing any days, since the two conditions (no search_key present, and date + days_after < today) can be rechecked at any time. The drawback of this approach is that all data collection happens in the past. While that is not a problem for Twitter, it may cause issues with MyNews (see #47). Ideally, while retaining the advantage of the current method, the pipeline should be able to start collecting as soon as a claim is recorded in the keywords collection (or perhaps a day later) and continue collecting until the days_after date is reached.

clairefurtick commented 1 year ago

One thing we could do is add a field called day_count that records how many days after the claim we have already searched. Alternatively, we could repurpose search_twitter_key to serve that role and change the initial check in step 1 above to check whether the value of search_twitter_key <= x days. Then we have a while loop: while day_count <= x && date + day_count <= today, perform a search on date + day_count and increment day_count by 1.
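A rough sketch of that idea, under the same assumptions as before (dict records, a hypothetical `DAYS_AFTER` constant for x, and a `search_fn` callback standing in for the actual query):

```python
from datetime import date, timedelta

DAYS_AFTER = 7  # hypothetical x: days to keep searching after the claim

def incremental_search(record: dict, today: date, search_fn) -> dict:
    """Search day by day, resuming from day_count, so collection can start
    as soon as the claim is recorded and continue until days_after is reached."""
    claim_date = record["date"]
    day_count = record.get("day_count", 0)
    while (day_count <= DAYS_AFTER
           and claim_date + timedelta(days=day_count) <= today):
        search_fn(record["keywords"], claim_date + timedelta(days=day_count))
        day_count += 1
    record["day_count"] = day_count  # persist so a rerun picks up from here
    return record
```

This keeps the rerun-safety property: if the process crashes, day_count reflects the last completed day, so the next run resumes without re-searching or skipping days, while still allowing collection to begin the day the claim appears.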