owid / covid-19-data

Data on COVID-19 (coronavirus) cases, deaths, hospitalizations, tests • All countries • Updated daily by Our World in Data
https://ourworldindata.org/coronavirus

Hospitalization data from South Korea #2225

Closed WWolf closed 2 years ago

WWolf commented 2 years ago

This is related to the issue: https://github.com/owid/covid-19-data/issues/2219#issue-1093744144

(a) The official English page contains the new hospital admissions stats (for past 7 days) under "Current status of new hospitalizations": http://ncov.mohw.go.kr/en/bdBoardList.do?brdId=16&brdGubun=161&dataGubun=&ncvContSeq=&contSeq=&board_id=

(b) It also contains "Current status of hospitalizations with moderate to severe symptoms" which is effectively the number of patients under ventilation+ (this is not daily admissions but the current number of patients): http://ncov.mohw.go.kr/en/bdBoardList.do?brdId=16&brdGubun=161&dataGubun=&ncvContSeq=&contSeq=&board_id=

I think (a) could be integrated into the automatic crawling system, and (b) might be relevant to the ICU patient stats.

Thank you!

edomt commented 2 years ago

Hi @WWolf

Are full time series made available anywhere? It seems like these pages only include a snapshot of the latest figures, rather than a full historical time series.

WWolf commented 2 years ago

Hi @edomt

The seriously ill and hospitalized counts (which I think correspond to ICU patient stats, not admissions) are maintained daily here (from 2020-03-28 onward): https://docs.google.com/spreadsheets/d/10c9jNi8VnV0YYCfV_7AZrzBY5l18dOFHEJMIJsP4THI/edit#gid=334130338 (This is maintained by the same Seoul National University group that parses the official government records.)

As for the daily hospital admissions, the page was only updated to include them after November, so there are no stats before that. If that is OK, I could backfill the stats from the individual government sources (they usually publish much more detailed reports in Korean, see here for example: http://ncov.mohw.go.kr/tcmBoardView.do?brdId=3&brdGubun=31&dataGubun=&ncvContSeq=6243&board_id=312&contSeq=6243 , though unfortunately nothing close to an API).

Thank you!

edomt commented 2 years ago

Thanks! Do you know people from this group at Seoul National University? In order to use the data, we'd need to be able to access it as a CSV file.

(They can make the spreadsheet available by clicking on File > Share > Publish to the web. Then select "Link", "Cases in Korea_Original", "Comma-separated values", click "Publish", and copy the link)

WWolf commented 2 years ago

Hi @edomt,

I think one could just do a single backfill, and then crawl all the relevant stats as in https://github.com/owid/covid-19-data/pull/2217, since the page has all the incremental values available. Would that work? Otherwise, I can contact them about publishing the spreadsheet.

WWolf commented 2 years ago

Like the pull request related to https://github.com/owid/covid-19-data/issues/2219#issue-1093744144,

I have scraped the "new admission" tallies and the tallies of patients with moderate to severe symptoms (related to ICU patients?) from 2021-11-01 to the present from official KDCA (Korean) documents.

P.S. Exact definition of Hospitalizations with moderate to severe symptoms: patients receiving isolated treatment through high flow therapy, respirator, ECMO (extracorporeal membrane oxygenation), and CRRT (continuous renal replacement therapy)

I also checked that the ICU patient stat goes back further, to 2020-03-28, and that the SNU ARIC dataset faithfully represents the official stat. So I am wondering whether additional scraping code like https://github.com/owid/covid-19-data/pull/2217 could be added together with the backfills. If you could point me to the file, I will make a pull request!

edomt commented 2 years ago

Thanks! The main issue for now is that we don't have the structure in place to accept incremental data collection for hospital & ICU data. The scripts are located in scripts/src/cowidev/hosp/sources but they are all batch scripts, i.e. they download or scrape an entire time series and return it to generate the dataset.

Potentially, if you could host the whole time series as a CSV somewhere (e.g. your own GitHub repo) and update it somewhat regularly, we could pull it directly from there.
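
For illustration, a batch-style source of this kind could look roughly like the sketch below. This is not the actual code in scripts/src/cowidev/hosp/sources: the function names and the column names (`date`, `new_admissions`) are hypothetical, and the sample rows are made-up numbers. The point is only that each run downloads and parses the entire series rather than appending one new value.

```python
import csv
import io
import urllib.request

# Hypothetical column names; the real CSV may use different headers.
DATE_COL = "date"
VALUE_COL = "new_admissions"


def parse_series(csv_text):
    """Parse a whole time series from CSV text into sorted (date, value) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        value = row[VALUE_COL].strip()
        if not value:
            continue  # skip dates with no reported value
        rows.append((row[DATE_COL], int(value)))
    return sorted(rows)  # ISO dates sort chronologically as strings


def fetch_series(url):
    """Download the full CSV in one request (batch style) and parse it."""
    with urllib.request.urlopen(url) as response:
        return parse_series(response.read().decode("utf-8"))


# Made-up sample rows, used here instead of a network call.
sample = "date,new_admissions\n2021-11-02,410\n2021-11-01,385\n2021-11-03,\n"
series = parse_series(sample)
print(series)
```

Because the whole file is re-read on every run, retrospective corrections in the hosted CSV are picked up automatically.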

WWolf commented 2 years ago

Hi @edomt,

I have generated a simple repository: https://github.com/WWolf/korea-covid19-hosp-data

Hope this is OK. I will try to update it regularly (ideally daily, but at least weekly).

edomt commented 2 years ago

Thank you very much! The data is now live:

WWolf commented 2 years ago

Hi, this is a slight update:

I found that KDCA also publishes weekly reports every Monday, covering weekly hospital admissions and new ICU admissions (see screenshot and link ).

image

The weekly hospital admissions can be inferred from the daily admissions that I am tracking on GitHub, but new ICU admissions might be something that could be added as well. So I have added a CSV table to the GitHub repository and have backfilled it to last October.

https://github.com/WWolf/korea-covid19-hosp-data/blob/main/weekly_icu.csv

I hope this could potentially be used for ICU admissions in OWID.
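
As a rough sketch of how the weekly hospital admissions could be inferred from the daily series: the code below sums daily counts into Monday-to-Sunday weeks. The week boundaries are an assumption (the KDCA reports may define weeks differently), the function name is hypothetical, and the numbers are made up.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily):
    """Sum daily counts into weeks keyed by the Monday that starts each week."""
    totals = defaultdict(int)
    for day_str, count in daily.items():
        day = date.fromisoformat(day_str)
        # weekday() is 0 for Monday, so this steps back to the week's Monday.
        week_start = day - timedelta(days=day.weekday())
        totals[week_start.isoformat()] += count
    return dict(sorted(totals.items()))


# Made-up daily admissions for illustration only.
daily = {
    "2021-11-01": 10,  # Monday
    "2021-11-03": 20,  # Wednesday, same week
    "2021-11-07": 5,   # Sunday, same week
    "2021-11-08": 7,   # Monday of the following week
}
weekly = weekly_totals(daily)
print(weekly)
```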

edomt commented 2 years ago

Thank you @WWolf !

WWolf commented 2 years ago

Hi @edomt, if this is of any use, I have added a daily tracker for the total beds in use in South Korea (the columns indicate the severity). In essence, the sum of all columns for a given date should be comparable to the Hospital patients metric in OWID. If you intend to incorporate this, I can also scrape older data from before 2022-07-25.

https://github.com/WWolf/korea-covid19-hosp-data/blob/main/beds.csv
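
A minimal sketch of that per-date sum, assuming a `date` column followed by one numeric column per severity level. The column names and values below are hypothetical and are not taken from beds.csv.

```python
import csv
import io


def total_patients_by_date(csv_text, date_col="date"):
    """Sum every non-date column per row to get total occupied beds per date."""
    reader = csv.DictReader(io.StringIO(csv_text))
    totals = {}
    for row in reader:
        day = row.pop(date_col)
        # Blank cells are treated as missing and left out of the sum.
        totals[day] = sum(int(v) for v in row.values() if v and v.strip())
    return totals


# Made-up sample with hypothetical severity columns.
sample = (
    "date,severe,moderate,mild\n"
    "2022-07-25,100,200,300\n"
    "2022-07-26,110,,310\n"
)
totals = total_patients_by_date(sample)
print(totals)
```

Treating blank cells as missing rather than zero is a design choice; a stricter loader could instead skip any date with an incomplete row.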

edomt commented 2 years ago

Thank you @WWolf! We've just updated our script to include this data; it should go live within the next 24 hours.