CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/
29.12k stars, 18.39k forks

Data consistency #1577

Open jonisapp opened 4 years ago

jonisapp commented 4 years ago

Please don't change the data structure every day; it is a pain for a lot of programmers, and a large majority of the issues are related to it. You are responsible for important publicly available data on which a lot of academic and research projects now depend, so please take that into account.

ghost commented 4 years ago

I agree, and I'm willing to assist (27+ years of data modeling).

jonisapp commented 4 years ago

Are these kinds of edits really useful? [image] There are also a lot of duplicates now. Here are some I have found, in the hope that it helps: [image]

ghost commented 4 years ago

Thank you. See #1280. Ideally the format, such as it is, will at least stop changing (I'm going off the daily reports for granularity and format). Then the historic data in earlier formats, which is vitally important, can be transformed to match. That will let us focus on quality issues, such as the obvious duplicates.

Also see #1250
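
A rough sketch of what transforming older-format rows and then dropping obvious duplicates could look like. The header names in `LEGACY_TO_CURRENT` and `TARGET_COLUMNS`, and the function names themselves, are assumptions for illustration and are not taken from this repository's documentation; the real daily reports may use different or additional columns.

```python
import csv
import io

# Hypothetical mapping from older header names to the current ones.
# These names are assumptions; check them against the actual files.
LEGACY_TO_CURRENT = {
    "Province/State": "Province_State",
    "Country/Region": "Country_Region",
    "Last Update": "Last_Update",
}

# Hypothetical fixed output schema that every report is normalized to.
TARGET_COLUMNS = [
    "Province_State",
    "Country_Region",
    "Last_Update",
    "Confirmed",
    "Deaths",
    "Recovered",
]


def normalize_rows(csv_text):
    """Parse CSV text and return dicts keyed by TARGET_COLUMNS.

    Legacy header names are renamed, columns outside the target
    schema are dropped, and missing columns become empty strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for raw in reader:
        renamed = {
            LEGACY_TO_CURRENT.get(key.strip(), key.strip()): (value or "").strip()
            for key, value in raw.items()
            if key is not None  # skip overflow fields from malformed rows
        }
        rows.append({col: renamed.get(col, "") for col in TARGET_COLUMNS})
    return rows


def drop_duplicates(rows):
    """Keep the first row seen for each (Province_State, Country_Region)."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["Province_State"], row["Country_Region"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Keeping the first row per location is only one possible policy; whether duplicates should instead be summed or resolved by the latest `Last_Update` is a data-quality decision this sketch does not settle.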

jonisapp commented 4 years ago

Thank you, this sounds reassuring!

ghost commented 4 years ago

I'm particularly encouraged by https://coronadatascraper.com/#timeseries.csv, which I found on one of the covid-atlas.slack.com channels. The data appears to be in a normalized, static format, drawn from a variety of sources (including this repository), and cross-checked.