covidatlas / li

Next-generation serverless crawler for COVID-19 data
Apache License 2.0

The Florida data is drifting significantly from the State dashboard. #469

Closed jzohrab closed 3 years ago

jzohrab commented 3 years ago

Original issue https://github.com/covidatlas/coronadatascraper/issues/1057, transferred here on Wednesday Jun 24, 2020 at 00:48 GMT


If there is anything I can do to help, I could do an hour or so a day.

jzohrab commented 3 years ago

(Transferred comment)

Hi @jsomer, thanks very much! Which data are you looking at, and what is the URL for the state dashboard? The source code for the scraper is in src/shared/scrapers/us/fl, or something like that.

Currently, I'm working on migrating away from this project to Li (see https://github.com/covidatlas/li/issues/235 for more details), so data issues are falling by the wayside until that is implemented.

The source and scraper in Li are in https://github.com/covidatlas/li/blob/master/src/shared/sources/us/fl/index.js. If you can suggest changes to the scraper here or in Li, that would be super.

Thanks for reporting the issue. Cheers! jz

jzohrab commented 3 years ago

(Transferred comment)

Thank you so much for this information.


jzohrab commented 3 years ago

Hello @jsomer, following up on the original issue https://github.com/covidatlas/coronadatascraper/issues/1057. I don't believe this is still an issue in the new reports -- https://covidatlas.com/data

I just ran a local crawl/scrape and got the following total count for Florida:

┌─────────┬─────────────────────────────────┬──────────────┬────────┬─────────┬────────┐
│ (index) │           locationID            │     date     │ cases  │ tested  │ deaths │
├─────────┼─────────────────────────────────┼──────────────┼────────┼─────────┼────────┤
...
│   67    │      'iso1:us#iso2:us-fl'       │ '2020-08-09' │ 526577 │ 3952028 │  8109  │
└─────────┴─────────────────────────────────┴──────────────┴────────┴─────────┴────────┘

This matches well, though not exactly, with the numbers at the dashboard: https://experience.arcgis.com/experience/96dd742462124fa0b38ddedb9b25e429 . I'm not sure why the numbers don't match exactly. If you have time, perhaps you could look into the underlying data (see the source for the URL) ... but that's different from this original issue.
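
For anyone who wants to quantify how far a crawl is from the dashboard, a small comparison like the one below may help. It is only a sketch and not part of the Li codebase: `compareTotals` is a hypothetical helper, the `scraped` totals are copied from the crawl output above, and the `dashboard` figures are placeholders that must be replaced with values read off the state dashboard.

```javascript
// Scraped Florida totals for 2020-08-09, copied from the crawl output above.
const scraped = { cases: 526577, tested: 3952028, deaths: 8109 }

/**
 * Hypothetical helper: for each metric present in both objects, report the
 * absolute difference (scraped - reference) and the difference as a
 * percentage of the reference value (null if the reference is 0).
 */
function compareTotals (scraped, reference) {
  const report = {}
  for (const metric of Object.keys(reference)) {
    if (!(metric in scraped)) continue
    const diff = scraped[metric] - reference[metric]
    const pct = reference[metric] === 0 ? null : (diff / reference[metric]) * 100
    report[metric] = { diff, pct }
  }
  return report
}

// PLACEHOLDER values for illustration only; these are NOT real dashboard numbers.
const dashboard = { cases: 526000, tested: 3950000, deaths: 8100 }

console.table(compareTotals(scraped, dashboard))
```

A consistently small percentage across all three metrics would suggest a timing difference between the crawl and the dashboard refresh, while a large gap in a single metric would point at the scraper's handling of that field.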

I'll close this issue as it seems like we're on track. Cheers! jz