troymartinhughes closed this issue 3 years ago
Thank you for (1) providing this information and (2) validating that I am not crazy or missing a previous data format change notice.
If columns are going to be added, my thoughts are that (a) the dictionary should be updated, (b) people should be notified, ideally with time to make code changes, and (c) former files should be updated with "not yet implemented" type values, or the old format should be maintained alongside the new format going forward. That said, I know things are moving fast, and I do appreciate that JHU is providing this information.
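For anyone who wants to apply option (c) locally in the meantime, here is a minimal sketch using only the Python standard library. The function name `pad_rows` and the `"NA"` placeholder are my own choices for illustration, not anything JHU publishes. It rewrites an older CSV so that it carries a newer column list, filling columns the old file never had with the placeholder.

```python
import csv
import io


def pad_rows(text, target_columns, placeholder="NA"):
    """Rewrite CSV text so that it has exactly target_columns.

    Columns missing from the input are filled with `placeholder`;
    input columns not listed in target_columns are dropped.
    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=target_columns,
        restval=placeholder,      # value used for columns absent from a row
        extrasaction="ignore",    # silently drop columns not in target_columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(reader)
    return out.getvalue()
```

Note that this only pads missing columns. It does not rename columns whose headers changed between formats, so it is most useful once the headers have already been brought to a common naming.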
You're not crazy; I trust that JHU is doing the best they can to keep the data updated, but some of the documentation is lacking or has been outpaced by changes in the data.
I run daily quality control reports (100% automated), which both identify and track many of these issues; today's are attached here. JHU_COVID-19_Daily_Reports_US_Data_Quality_Report_20200602.pdf JHU_COVID-19_Daily_Reports_US_Longitudinal_Data_Quality_Report_20200602.pdf
Hi folks. You're not crazy. These have been issues for a long time (JHU are amazing and probably too busy to address them all). Based on your comments, you may be interested in a cleaned repo of the JHU data, which I maintain HERE
Hello @troymartinhughes! Thanks very much for pointing out those errors. We have updated the description. Please let us know if anything else needs to be adjusted. Sorry for the late response.
I’ve identified four separate issues here with this page (https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data):
Thank you to whoever is able to make these changes, and especially to JHU for their continued support!
For reference, the following four distinct column configurations and naming conventions occur in the CSV files in this folder (https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports):

Columns included in CSV files dated 01-22-2020 to 03-01-2020:
Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered

Columns included in CSV files dated 03-01-2020 to 03-22-2020:
Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered,Latitude,Longitude

Columns included in CSV files dated 03-22-2020 to 05-29-2020:
FIPS,Admin2,Province_State,Country_Region,LastUpdate,Lat,Long,Confirmed,Deaths,Recovered,Active,Combined_Key

Columns included in CSV files dated 05-29-2020 to 05-30-2020:
FIPS,Admin2,Province_State,Country_Region,LastUpdate,Lat,Long,Confirmed,Deaths,Recovered,Active,Combined_Key,Incidence_Rate,Case-Fatality_Ratio
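
In case it helps anyone reading these files programmatically, here is a rough sketch of one way to cope with the four header variants: reduce each header to lowercase letters and digits, then look it up in an alias table. The canonical names on the right-hand side of `ALIASES` and the function names are assumptions made for this example, not official JHU naming, and the table only covers the headers listed above.

```python
import csv
import io
import re

# Maps a stripped-down header (lowercase, letters and digits only) to a
# canonical name. The canonical names are this sketch's own choice.
ALIASES = {
    "provincestate": "Province_State",
    "countryregion": "Country_Region",
    "lastupdate": "Last_Update",
    "lat": "Lat",
    "latitude": "Lat",
    "long": "Long",
    "longitude": "Long",
    "incidencerate": "Incidence_Rate",
    "casefatalityratio": "Case_Fatality_Ratio",
}


def canonical_name(header):
    """Return the canonical name for a header, or the header itself if unknown."""
    key = re.sub(r"[^a-z0-9]", "", header.lower())
    return ALIASES.get(key, header.strip())


def read_daily_report(text):
    """Parse CSV text into a list of dicts keyed by canonical column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {canonical_name(name): value for name, value in row.items()}
        for row in reader
    ]
```

Because the lookup ignores spaces, slashes, underscores, and hyphens, `Province/State` and `Province_State` both end up under the same key, as do `Last Update` and `LastUpdate`. Headers with no alias (for example `FIPS` or `Confirmed`) pass through unchanged.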
Column Mapping
Finally, assuming these changes are made at some point, I've memorialized the current "Field description" section below:
For reference, on 05-31-2020, the posted metadata state the following: