cmu-delphi / covidcast-indicators

Back end for producing indicators and loading them into the COVIDcast API.
https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html
MIT License

Megacounty-like FIPS in JHU data not documented #236

capnrefsmmat closed 3 years ago

capnrefsmmat commented 4 years ago
> library(covidcast)
> deaths = covidcast_signal("jhu-csse", "deaths_7dav_incidence_num", "2020-06-15", "2020-07-01")
> deaths[49745,]
A `covidcast_signal` data frame with 1 rows and 10 columns.

data_source : jhu-csse
signal      : deaths_7dav_incidence_num
geo_type    : county

      geo_value time_value direction      issue lag     value stderr
49745     34000 2020-06-30        NA 2020-08-14  45 -266.8571     NA
      sample_size data_source                    signal
49745          NA    jhu-csse deaths_7dav_incidence_num

geo_value 34000 would mean FIPS code 34000, which is not a valid county FIPS code. Other sources use codes of this form for megacounties, but there's no such thing in the cases/deaths data.

I assume this is used for deaths not assigned to a specific county within New Jersey (FIPS 34), but this isn't documented in the JHU signal documentation. It should be added there.
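For anyone who wants to spot these rows in their own analysis, here is a rough sketch of the check I have in mind. The helper names `split_fips` and `is_megacounty_like` are made up for illustration and are not part of any covidcast client; the only assumption is that a five-digit code splits into a two-digit state prefix and a three-digit county suffix, and that a suffix of 000 marks a megacounty-like code.

```python
from typing import Tuple


def split_fips(geo_value) -> Tuple[str, str]:
    """Split a five-digit county-level code into (state, county) parts.

    Hypothetical helper: pads with leading zeros so that e.g. 1001 -> "01001".
    """
    code = str(geo_value).zfill(5)
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a five-digit code: {geo_value!r}")
    return code[:2], code[2:]


def is_megacounty_like(geo_value) -> bool:
    """Return True when the county suffix is 000, as in 34000."""
    _, county = split_fips(geo_value)
    return county == "000"


print(split_fips("34000"))          # state part "34" is New Jersey
print(is_megacounty_like("34000"))
print(is_megacounty_like("34001"))
```

Filtering on a check like this would let a user decide explicitly whether to keep or drop the unassigned rows, rather than discovering them by accident.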

capnrefsmmat commented 4 years ago

I can confirm this geo_value holds the deaths reported as "Unassigned" by JHU; the corresponding row in their CSV is:

84090034,US,USA,840,90034.0,Unassigned,New Jersey,US,0.0,0.0,"Unassigned, New Jersey, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,8,11,6,13,22,42,63,91,115,138,114,161,346,121,148,175,227,281,348,406,474,545,599,673,748,822,875,934,995,1044,1101,1143,1182,1454,1257,1287,1320,1360,1392,1429,1462,1483,1509,1534,1560,1585,1609,1619,1638,1653,1675,1689,1706,1713,1725,1734,1743,1747,1752,1755,1761,1762,1771,1777,1784,1787,1789,1794,1800,1804,1809,1814,1816,1820,1824,1828,1830,1835,1838,1839,1840,1841,1845,1847,1848,1852,1855,1856,1857,1859,1859,1860,1860,1861,1863,1866,1868,1868,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
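The row above carries FIPS 90034.0 while the API reports geo_value 34000. A minimal sketch of the relabeling that would connect the two is below. This is inferred from this single row only: the rule "900XX becomes XX000" and the function name `jhu_unassigned_to_geo_value` are my assumptions, not something confirmed from the indicator code.

```python
def jhu_unassigned_to_geo_value(jhu_fips) -> str:
    """Map a JHU 'Unassigned' FIPS such as 90034.0 to a code such as 34000.

    Assumption (not confirmed by the pipeline source): JHU encodes
    unassigned rows as 900XX, where XX is the two-digit state FIPS,
    and the API relabels them as XX000.
    """
    code = str(int(float(jhu_fips)))  # "90034.0" -> "90034"
    if len(code) != 5 or not code.startswith("900"):
        raise ValueError(f"not a 900XX unassigned code: {jhu_fips!r}")
    state = code[3:]  # last two digits are the state FIPS
    return state + "000"


print(jhu_unassigned_to_geo_value("90034.0"))
```

If the pipeline uses a different rule, only the body of the function would need to change; the point is that the mapping should be written down in the JHU signal documentation.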

The sudden drop to 0 (on June 25th) suggests JHU changed how it reports unassigned cases in New Jersey on that day, because other counties in New Jersey show a spike at the same time.
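This drop also accounts for the odd negative value in the API output. A quick arithmetic check, assuming the 7-day average is a trailing mean of daily differences of the cumulative counts, and assuming the dates line up so that the last 1868 is June 24 and the first 0 is June 25:

```python
# Cumulative unassigned deaths for June 23 through June 30, taken from the
# tail of the CSV row (date alignment is an assumption based on the drop
# happening on June 25).
cumulative = [1868, 1868, 0, 0, 0, 0, 0, 0]

# Daily incidence for June 24 through June 30: difference between
# consecutive cumulative counts.
incidence = [today - yesterday
             for yesterday, today in zip(cumulative, cumulative[1:])]

# Trailing 7-day average ending June 30.
seven_day_avg = sum(incidence) / len(incidence)

print(incidence)
print(round(seven_day_avg, 4))  # -266.8571
```

The single day with incidence -1868 gives -1868 / 7, which matches the value of -266.8571 reported for 2020-06-30.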

See also https://github.com/CSSEGISandData/COVID-19/issues/2763