Open jzohrab opened 4 years ago
When this change is made, we'll need to force a full scrape of all of the changed scrapers, because the priority is recorded in the underlying DynamoDB record. NOTE: https://github.com/covidatlas/li/issues/196 must be implemented first, as a prerequisite for updating the DynamoDB records.
While solving issue #313 ("tested starting on 2020-07-09"), I saw that many states are almost exclusively using `us-covidtracking` as their data source. I pulled down the file https://liproduction-reportsbucket-bhk8fnhv1s76.s3-us-west-1.amazonaws.com/beta/latest/timeseries-byLocation.json, and got the results below using this script:
Many other states are similar.
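A per-location tally of this kind could be produced along the following lines. This is only a sketch, not the script referred to above: it assumes each location record carries a `locationID` and a hypothetical `sourcesByDate` mapping from date to source name, which may not match the real layout of timeseries-byLocation.json. The sample records are made up for illustration.

```python
from collections import Counter


def count_source_days(locations):
    """Return {locationID: Counter({source name: number of days})}."""
    tallies = {}
    for loc in locations:
        # 'sourcesByDate' is a hypothetical field name (date -> source name).
        by_date = loc.get("sourcesByDate", {})
        tallies[loc["locationID"]] = Counter(by_date.values())
    return tallies


# Made-up sample data, NOT taken from the real report file.
sample = [
    {
        "locationID": "us-wy",
        "sourcesByDate": {
            "2020-07-01": "us-covidtracking",
            "2020-07-02": "us-covidtracking",
            "2020-07-03": "us/wy",
        },
    },
]

for location_id, counts in count_source_days(sample).items():
    for source, days in counts.most_common():
        print(f"{location_id}: {source} used on {days} day(s)")
```

To run it against the real file, the records would first need to be loaded with `json.load` and the field names adjusted to whatever the report actually uses.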
This means that covidtracking is currently overriding our state-specific scrapers. For example, for Wyoming (us-wy), the us/wy source is used as the source for only a single day!
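The override can be pictured with a minimal sketch like the one below. It assumes that, when several sources report the same location on the same day, the one with the highest numeric priority wins, and that a source without an explicit priority defaults to 0. Both are assumptions for illustration, and `pick_source` is a hypothetical helper, not a function from the Li codebase.

```python
def pick_source(candidates, priorities, default_priority=0):
    """Return the candidate source name with the highest priority.

    Sources missing from `priorities` get `default_priority` (assumed to be 0).
    """
    return max(candidates, key=lambda name: priorities.get(name, default_priority))


# 0.5 is the covidtracking priority quoted in this issue;
# the state source is assumed to have no explicit priority.
priorities = {"us-covidtracking": 0.5}

print(pick_source(["us/wy", "us-covidtracking"], priorities))  # us-covidtracking
```

Under these assumptions, a state scraper only wins on days when covidtracking has no data, or once its priority is raised above 0.5.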
covidtracking has priority 0.5, but many state scrapers have priorities (after updating master, run `npm run list-sources` to get the following):