covidcaremap / covid19-healthsystemcapacity

Open geospatial work to support health systems' capacity (providers, supplies, ventilators, beds, meds) to effectively care for rapidly growing COVID19 patient needs
https://www.covidcaremap.org
MIT License
97 stars 38 forks

Validate hospital bed stats against other datasets #12

Open lossyrob opened 4 years ago

lossyrob commented 4 years ago

The goal of this issue is to cross-validate the statistics in the dataset generated by this project for the number of hospital beds per county against external data sources. Any mismatches should be documented and accounted for. The numbers won't match exactly, but confirming that external sources are at least close to the numbers we're generating will provide a great test that we're generating valid data.

https://www.kff.org/other/state-indicator/beds-by-ownership/?currentTimeframe=0&sortModel=%7B%22colId%22:%22Location%22,%22sort%22:%22asc%22%7D

https://www.sccm.org/Communications/Critical-Care-Statistics

AHA data: According to the AHA 2015 annual survey, the United States had

HCRIS data:

ICU days: HCRIS analysis showed that there were 150.9 million hospital days, including 25 million ICU days in 2010 (16.5% ICU days/total days). Medicare accounted for 7.9 million ICU days (31.4%) and Medicaid 4.3 million ICU days (17.2%).

Occupancy: Occupancy rates were calculated from HCRIS (days/possible days) data. In 2010, hospital and ICU occupancy rates were 64.6% and 68%, respectively. Occupancy rates vary by hospital size, with higher occupancy rates associated with larger hospitals.
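As a quick sanity check on the figures quoted above, here is a minimal sketch of the two calculations: the ICU share of hospital days, and occupancy as days over possible days. It assumes "possible days" means beds multiplied by days in the reporting period, which the quoted text does not spell out. The function names and the 100-bed hospital are hypothetical, for illustration only.

```python
# Figures quoted above from the SCCM summary of 2010 HCRIS data.
TOTAL_HOSPITAL_DAYS = 150.9e6
ICU_DAYS = 25e6


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def occupancy_rate(patient_days, beds, days_in_period=365):
    """Occupancy = patient days / possible days.

    Assumption: possible days = beds * days in the reporting period.
    """
    possible_days = beds * days_in_period
    if possible_days <= 0:
        raise ValueError("beds and days_in_period must be positive")
    return patient_days / possible_days


icu_share = share(ICU_DAYS, TOTAL_HOSPITAL_DAYS)
print(f"ICU share of hospital days: {icu_share:.1f}%")

# Hypothetical 100-bed hospital with 23,579 patient days in one year.
example_occupancy = occupancy_rate(patient_days=23_579, beds=100)
print(f"Example occupancy: {example_occupancy:.1%}")
```

Because the quoted inputs are rounded, the recomputed percentages only approximately match the published ones, so small gaps of a tenth of a point or so are expected.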

lossyrob commented 4 years ago

https://github.com/daveluo/covid19-healthsystemcapacity/issues/7#issuecomment-599308581

daveluo commented 4 years ago

For another source of beds data to cross-check and sanity-check our methodology, here is the American Hospital Directory (AHD) data, which gives the total staffed beds count by state and nationally using the same HCRIS data source as the approach in #21.

https://www.ahd.com/definitions/statistics.html:

Staffed Beds: Numbers of staffed beds are taken from a hospital's most recent Medicare cost report (W/S S-3, Part I, col.1). Cost report instructions define staffed beds as, "the number of beds available for use by patients at the end of the cost reporting period. A bed means an adult bed, pediatric bed, birthing room, or newborn bed maintained in a patient care area for lodging patients in acute, long term, or domiciliary areas of the hospital. Beds in labor room, birthing room, postanesthesia, postoperative recovery rooms, outpatient areas, emergency rooms, ancillary departments, nurses' and other staff residences, and other such areas which are regularly maintained and utilized for only a portion of the stay of patients (primarily for special procedures or not for inpatient lodging) are not termed a bed for these purposes." The total number of general med/surg beds plus special care beds are reported:

  • General Medical/Surgical Beds are the beds used for routine care.
  • Special Care Beds include Intensive Care Units, Coronary Care Units, etc.

| Source | Total Staffed Beds count (all states & territories) |
| --- | --- |
| AHD.com | 745,957 |
| Our notebook | 742,562 |

Still a slight discrepancy to figure out (maybe AHD.com uses a different FY of HCRIS reports vs us using FY2018) but more in the right neighborhood than the ~200K difference in beds that AHA's numbers show per https://github.com/daveluo/covid19-healthsystemcapacity/issues/7#issuecomment-599308581
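To put a number on "slight", here is a small sketch that computes the absolute and relative gap between the two totals reported above. The helper name `compare_totals` is hypothetical; only the two bed counts come from this thread.

```python
# Totals reported above (all states & territories).
AHD_STAFFED_BEDS = 745_957
NOTEBOOK_STAFFED_BEDS = 742_562


def compare_totals(ours, reference):
    """Return (absolute difference, percent difference relative to reference)."""
    diff = ours - reference
    pct = 100.0 * diff / reference
    return diff, pct


diff, pct = compare_totals(NOTEBOOK_STAFFED_BEDS, AHD_STAFFED_BEDS)
print(f"difference: {diff:+,} beds ({pct:+.2f}% relative to AHD.com)")
```

That works out to 3,395 fewer beds in our notebook, or roughly 0.46% of the AHD.com total.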

lossyrob commented 4 years ago

Difference between KFF hospital bed counts and the v2 HCRIS data (based on US Census 2019 estimates), ordered by difference.
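For anyone who wants to reproduce this kind of comparison, a rough sketch follows. It assumes the KFF figures are expressed as beds per 1,000 population, so the HCRIS bed counts need to be normalized by a population estimate first. The function names, state codes, and all values are placeholders, not real data and not the project's actual notebook code.

```python
def beds_per_1000(beds, population):
    """Convert a raw bed count into beds per 1,000 residents."""
    return 1000.0 * beds / population


def rank_by_difference(kff_rates, hcris_beds, populations):
    """Return (state, kff_rate, our_rate, difference) rows sorted by difference.

    Only states present in all three inputs are compared.
    """
    common = kff_rates.keys() & hcris_beds.keys() & populations.keys()
    rows = []
    for state in sorted(common):
        ours = beds_per_1000(hcris_beds[state], populations[state])
        rows.append((state, kff_rates[state], ours, ours - kff_rates[state]))
    rows.sort(key=lambda row: row[3])
    return rows


# Placeholder inputs (NOT real data): made-up states "AA", "BB", "CC".
kff_rates = {"AA": 2.0, "BB": 3.0, "CC": 2.5}
hcris_beds = {"AA": 2_500, "BB": 5_000, "CC": 10_000}
populations = {"AA": 1_000_000, "BB": 2_000_000, "CC": 4_000_000}

rows = rank_by_difference(kff_rates, hcris_beds, populations)
for state, kff, ours, delta in rows:
    print(f"{state}: KFF={kff:.2f} ours={ours:.2f} diff={delta:+.2f}")
```

Sorting by the signed difference puts the states where our numbers fall furthest below KFF at the top and those furthest above at the bottom, which makes the outliers easy to spot.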

[Image: Screen Shot 2020-03-17 at 10 36 39 PM]

lasingallday commented 3 years ago

I've started looking into comparisons of these datasets, and I agree that SCCM data will vary from that of KFF and AHD. From what I've read on the links provided, SCCM comes from a combination of 2015 AHA Annual Survey data and 2010 HCRIS data. So SCCM results will differ from KFF results (which use 2018 AHA Annual Survey data) and AHD results (which use a combination of the following: the most recently reported HCRIS dataset (assuming 2018), MedPAR, OPPS, and proprietary sources). Additionally, I feel like AHD should be the most trusted, as it is merging multiple datasets.

I also agree that there is something to look into regarding the differences in total staffed beds between KFF and v2 HCRIS, particularly for the states . @lossyrob Is there a particular notebook to use for looking into these state-level total staffed bed differences? Or, alternatively, are these datasets already loaded so that I can see the raw data in Python? Lastly, is it fine for me to roll up the raw data to state level, if that needs to be done?
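If a state-level roll-up does turn out to be needed, it could look something like the sketch below. This assumes the raw data is one record per facility with a state code and a staffed bed count; the field names `state` and `staffed_beds` and the sample records are hypothetical and not taken from the project's actual schema.

```python
from collections import defaultdict


def roll_up_to_state(facilities, state_key="state", beds_key="staffed_beds"):
    """Sum facility-level bed counts into one total per state.

    Facilities with a missing (None) bed count are skipped rather than
    treated as zero, so gaps in the raw data stay visible downstream.
    """
    totals = defaultdict(int)
    for row in facilities:
        beds = row.get(beds_key)
        if beds is None:
            continue
        totals[row[state_key]] += beds
    return dict(totals)


# Placeholder records (NOT real data).
facilities = [
    {"name": "Hospital A", "state": "AA", "staffed_beds": 120},
    {"name": "Hospital B", "state": "AA", "staffed_beds": 80},
    {"name": "Hospital C", "state": "BB", "staffed_beds": None},
    {"name": "Hospital D", "state": "BB", "staffed_beds": 45},
]

state_totals = roll_up_to_state(facilities)
print(state_totals)
```

The resulting per-state totals could then be compared directly against the state-level figures from KFF or AHD.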

Yhinkar commented 3 years ago

I found this issue on ovio.org and would love to contribute!