CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

FIPS 4 digits on some states 5 on most #2239

Open jscudder opened 4 years ago

jscudder commented 4 years ago

In csse_covid_19_daily_reports, FIPS is now a 4-digit code for these states: Alabama, Alaska, Arizona, Arkansas, Connecticut, Colorado. This was not the case in older feeds. Please resolve.

Fred-Macdo commented 4 years ago

Seconding this. I was making county-based maps and wondering why states have been missing in recent days. I was matching on the FIPS string being 5 characters long with leading zeros. The first states alphabetically have been missing because they now have a 4-digit code; FIPS should be treated as a string field, not a numeric one.
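A minimal sketch of the string-versus-number point, assuming pandas and a made-up two-row sample in the layout of a daily report (the `sample_csv` contents are illustrative, not real data from the repo). Reading FIPS with `dtype=str` keeps the codes as text, and a length check surfaces the rows that would fail a 5-character join:

```python
import io
import pandas as pd

# Hypothetical two-row sample in the layout of a daily report
sample_csv = io.StringIO(
    "FIPS,Admin2,Province_State\n"
    "1001,Autauga,Alabama\n"
    "36061,New York,New York\n"
)

# dtype=str keeps the codes as text instead of parsing them as numbers
df = pd.read_csv(sample_csv, dtype={"FIPS": str})

# Rows whose FIPS is not exactly 5 characters would fail a 5-character join
bad_length = df[df["FIPS"].str.len() != 5]
print(bad_length)
```

Reading as text does not restore a zero that is already missing from the file; it only prevents further loss and makes the short codes easy to spot.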

robjellis commented 4 years ago

+1. Noticed that this first occurred on April 13 (https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_daily_reports/04-13-2020.csv), but the error is not present on April 12 (https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_daily_reports/04-12-2020.csv)

yywong01 commented 4 years ago


Exactly. I was using the data on 4/13, and it suddenly disappeared, replaced by state-level data only. My project is stuck and I don't know what to do now. If any of you have the FIPS-level data for a March day or an earlier April day, would you be so kind as to share it? Please!

Fred-Macdo commented 4 years ago

The first two characters of the FIPS code indicate the state, if that helps.
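As a rough illustration of that prefix rule, here is a small helper (the name `state_fips` is hypothetical). It assumes the code is already a full 5-character string, since slicing a truncated 4-digit code would return the wrong prefix:

```python
def state_fips(county_fips: str) -> str:
    """Return the 2-digit state prefix of a 5-character county FIPS code."""
    if len(county_fips) != 5 or not county_fips.isdigit():
        raise ValueError(f"expected a 5-digit FIPS string, got {county_fips!r}")
    return county_fips[:2]

print(state_fips("01001"))  # Autauga County, Alabama -> "01"
print(state_fips("36061"))  # New York County, New York -> "36"
```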


robjellis commented 4 years ago

Good point @Fred-Macdo ; I was originally worried that there might be trailing zeros that were also dropped, but it looks like only a small handful of FIPS codes (at least for US entities) are shorter than 4 digits. Thus a simple rule could serve as a hot fix until this is formally corrected; e.g., "if the length of the observed FIPS is 4 and the state is in {'Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut'}, prepend '0' to the observed FIPS."

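In Python 3, for example, given a dataframe jhu loaded from one of the daily files:

# Assumes FIPS was read as text, e.g. pd.read_csv(path, dtype={'FIPS': str});
# if pandas parsed it as a float, str(fips) looks like '1001.0' and the length test fails.
short_fips_states = {'Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut'}
fips5 = []
for fips, state in zip(jhu['FIPS'], jhu['Province_State']):
    # Restore the leading zero dropped from 4-digit codes in the affected states
    if len(str(fips)) == 4 and state in short_fips_states:
        fips5.append('0' + str(fips))
    else:
        fips5.append(str(fips))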
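An alternative to the loop above, sketched under the assumption that pandas parsed FIPS as a float column (which happens when some rows have no FIPS). The sample frame and the helper name `pad_fips` are hypothetical. It pads every present code to 5 characters rather than checking the state name:

```python
import pandas as pd

# Hypothetical sample; FIPS is float64 because of the missing value
jhu = pd.DataFrame({
    "FIPS": [1001.0, 36061.0, None],
    "Province_State": ["Alabama", "New York", "Hubei"],
})

def pad_fips(value) -> str:
    """Format a numeric FIPS as a 5-character string; empty string if missing."""
    if pd.isna(value):
        return ""
    return str(int(float(value))).zfill(5)

jhu["FIPS5"] = jhu["FIPS"].map(pad_fips)
print(jhu["FIPS5"].tolist())
```

Because this pads everything to 5 characters, the handful of codes shorter than 4 digits mentioned above would also be padded and may need separate handling.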
yywong01 commented 4 years ago

When I looked at the 4/13 file, it contained hospitalization and testing rates, as described. But they are not there now, nor can I find this information in prior files. Has anyone else noticed that? Why is the data being changed like this? What needs to get fixed?

Did anyone save a file that has the testing and hospitalization rates?
