covidatlas / li

Next-generation serverless crawler for COVID-19 data
Apache License 2.0

0 new cases for Kentucky/Indiana from timeseries.csv on 08/23 #599

Open patricksheehan opened 4 years ago

patricksheehan commented 4 years ago

Hi there. I appreciate the detailed issue requirements, but I don't really have extra time to debug this. We noticed that all Kentucky/Indiana counties are reporting 0 new cases from the "timeseries.csv" data set. NYT reports cases as do other county-level sources.

New cases were calculated via differencing, so the case totals have just been the same for these counties. Any idea what's going on?
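A minimal sketch of the differencing described above, for anyone trying to reproduce the symptom. The field names `name`, `date`, and `cases` are assumptions about the CSV layout, not taken from the actual timeseries.csv schema, and `stale_locations` is a hypothetical helper for spotting counties whose totals have stopped moving.

```python
from collections import defaultdict


def new_cases_by_location(rows):
    """Difference cumulative case totals into per-day new cases.

    `rows` is an iterable of dicts with the assumed keys 'name',
    'date' (ISO string, so it sorts chronologically) and 'cases'
    (cumulative total).
    """
    by_loc = defaultdict(list)
    for row in rows:
        by_loc[row["name"]].append((row["date"], int(row["cases"])))

    result = {}
    for name, series in by_loc.items():
        series.sort()  # chronological order via ISO date strings
        result[name] = [
            (date, total - prev_total)
            for (_, prev_total), (date, total) in zip(series, series[1:])
        ]
    return result


def stale_locations(rows, window=3):
    """Return locations whose last `window` daily differences are all zero."""
    diffs = new_cases_by_location(rows)
    return sorted(
        name
        for name, deltas in diffs.items()
        if len(deltas) >= window
        and all(delta == 0 for _, delta in deltas[-window:])
    )
```

A flat cumulative total gives a difference of zero on every day, which is exactly what shows up for these counties. A source that has stopped updating looks identical to a county with genuinely no new cases.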

jzohrab commented 4 years ago

Hi @patricksheehan - thanks for the heads up.

Per https://api.covidatlas.com/status?format=html, the Kentucky (us-ky) and Indiana (us-in) sources are working OK, and both have crawled and scraped recently. That only tells us our side is running, though; it doesn't mean the site owners are actually keeping the underlying data up to date.

For Kentucky, we reference https://datawrapper.dwcdn.net/BbowM/283/. I'm not sure when or how that was chosen as a good source. https://govstatus.egov.com/kycovid19, or one of its subpages, may be better if we can figure out how to parse it. Any suggestions?

For Indiana, we're using https://opendata.arcgis.com/datasets/d14de7e28b0448ab82eb36d6f25b1ea1_0.csv, which is "ISDH_COVID-19_County_Data.csv". It looks like that's from https://www.coronavirus.in.gov/.

I wonder if any of this helps ... let me know. Cheers, jz

patricksheehan commented 4 years ago

@jzohrab this does help, thank you! Do you have the ability to use different sources for different variables? For instance, for cases data, NYT seems to be consistently stable and correct.

We (covidexitstrategy.org) have also considered using covidcountydata.org, have you talked with those folks? Perhaps you can integrate. I like their API because I can do some filtering, etc. which keeps payloads down.

jzohrab commented 4 years ago

Hi Patrick, I’ve been considering creating a report that simply dumps all data we collect for all locations. Currently we combine data by priority, but I think that isn’t good for clients that are doing their own calculations. That should be rather simple to make. I’ll start that soon and will ping you with an ETA.
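To make the difference between the two report styles concrete, here is a rough sketch. It is not Li's actual combining code: the record shape (`source`, `priority`, `data`), the rule that a higher number wins, and both function names are assumptions made purely for illustration.

```python
def combine_by_priority(records):
    """Merge per-source records for one location and date.

    For each field, keep the value from the highest-priority source
    that reports it (assumed rule: larger `priority` wins).
    """
    combined = {}
    for rec in sorted(records, key=lambda r: r["priority"], reverse=True):
        for field, value in rec["data"].items():
            # setdefault keeps the first value seen, i.e. the one
            # from the highest-priority source.
            combined.setdefault(field, value)
    return combined


def dump_all(records):
    """Flatten every source's values without choosing between them."""
    return [
        {"source": rec["source"], "field": field, "value": value}
        for rec in records
        for field, value in rec["data"].items()
    ]
```

With the combined form, a client can't tell which source a given number came from. The flat dump keeps every source's value side by side, so a client can choose per variable, for example one source for cases and another for deaths.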

I hadn’t seen covidcountydata.org, thanks for the link. I’ll check them out. Cheers! Z
