episphere / mortalitytracker

tracking causes of death from CDC data APIs

States with drop in mortality skewing excess totals still #13

Open djhopkins2 opened 3 years ago

djhopkins2 commented 3 years ago

The real issue in #11 was never addressed. A new issue has cropped up with West Virginia's data, and it seems to be skewing the US excess death totals. The suggested fix from #11 would resolve that and also correct for the other states with drops in mortality. Also, states where tabulation is still in progress but whose counts have already passed the average could still be added to the total.

djhopkins2 commented 3 years ago

I happened upon a CDC dashboard that does these same calculations and the technical notes provide a better explanation of the issue. CDC-Excess Deaths Associated with COVID-19

"Estimates of excess deaths for the US overall were computed as a sum of jurisdiction-specific numbers of excess deaths (with negative values set to zero), and not directly estimated using the Farrington surveillance algorithms. Summation (rather than estimation) was chosen to account for the possibility that some jurisdictions may have substantially incomplete data while other jurisdictions report may more deaths than expected, these negative and positive values will cancel each other out when estimating excess deaths for the US directly using the Farrington surveillance algorithms. Until data are finalized (typically 12 months after the close of the data year), it is not possible to determine whether observed decreases in mortality using provisional data are due to true declines or to incomplete reporting. Thus, when computing excess deaths directly for the US, negative values due to incomplete reporting in some jurisdictions will offset excess deaths observed in other jurisdictions. For example, the total number of excess deaths in the US computed directly for the US using the Farrington algorithms was approximately 25% lower than the number calculated by summing across the jurisdictions with excess deaths. This difference is likely due to several jurisdictions reporting lower than expected numbers of deaths – which could be a function of underreporting, true declines in mortality in certain areas, or a combination of these factors. In addition, potential discrepancies between the number of excess deaths in the US when estimated directly compared with the sum of jurisdiction-specific estimates could be related to different estimated thresholds for the expected number of deaths in the US and across the jurisdictions."

Since my JavaScript is a bit rusty, I decided to forgo fixing the issue in the mortality tracker code. However, I did reimplement the dashboard as an auto-updating Google Sheet that pulls data from the same CDC source and does the fixed calculation alongside the calculation from this dashboard to highlight the discrepancy. The start week, jurisdiction, and range of years for the historical average can be changed. Google Sheets Excess Mortality Calculation
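For anyone picking this up in the tracker code, here is a rough sketch of the two calculations the CDC note contrasts: summing raw excess values (so negative jurisdictions offset positive ones) versus summing with negative values set to zero. The record shape (`name`, `observed`, `expected`), the function names, and the numbers are all made up for illustration and are not taken from the mortalitytracker source or the CDC API.

```javascript
// Hypothetical input: one record per jurisdiction, with observed deaths and
// the expected count (e.g. a historical average) over the same span of weeks.
function excessDeaths(record) {
  return record.observed - record.expected;
}

// Naive total: negative jurisdictions cancel out positive ones.
function directExcess(jurisdictions) {
  return jurisdictions.reduce((sum, j) => sum + excessDeaths(j), 0);
}

// Clamped total: negative values are set to zero before summing,
// as described in the CDC technical note quoted above.
function clampedExcess(jurisdictions) {
  return jurisdictions.reduce(
    (sum, j) => sum + Math.max(0, excessDeaths(j)),
    0
  );
}

// Made-up numbers, for illustration only.
const sample = [
  { name: "A", observed: 1200, expected: 1000 }, // +200
  { name: "B", observed: 800, expected: 950 },   // -150 (possibly incomplete reporting)
  { name: "C", observed: 500, expected: 500 },   // 0
];

console.log(directExcess(sample));  // 50
console.log(clampedExcess(sample)); // 200
```

With these sample values the direct total is 50 while the clamped total is 200, which is the kind of gap a state with an apparent drop in mortality introduces into the US figure.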

lopezdp commented 3 years ago

@djhopkins2 I'd love to take a stab at this issue. Reach out any time; I'll come back and ask some questions after doing some research on issue #11.

Link in Bio! 😂