globaldothealth / list

Repository for Global.health: a data science initiative to enable rapid sharing of trusted and open public health data to advance the response to infectious diseases.
MIT License

Ingestion for Kenya data set #2177

Open iamleeg opened 3 years ago

iamleeg commented 3 years ago

Data source: https://github.com/SamuelBrand1/kenya-covid-three-waves/tree/main/data
Publication: https://www.science.org/doi/10.1126/science.abk0414?utm_campaign=SciMag&utm_source=Social&utm_medium=Twitter#con1

The data is in HDF5 format (a packed binary format usually used in high-performance computing), so we need support for it in the parser lib. Also, the data dump on GitHub is four months old, so this may be a one-off ingestion rather than a scheduled activity.

Mougk commented 3 years ago

For now it will be a one-time ingestion, so could we transform the file into CSV locally and then do a manual upload? I'll get in touch with the authors to ask whether they plan to post these data more regularly going forward.
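A minimal sketch of that local HDF5-to-CSV step, assuming the third-party `h5py` package is installed. The internal layout of the Kenya files is not described in this thread, so the sketch does not assume any group or dataset names: it walks the whole file and writes every 1-D or 2-D numeric dataset to its own CSV. The function names `csv_name`, `write_rows`, and `hdf5_to_csv` are hypothetical and not part of the parser lib.

```python
import csv
from pathlib import Path


def csv_name(dataset_path):
    """Turn an HDF5 dataset path such as '/a/b' into a flat file name 'a_b.csv'."""
    return dataset_path.strip("/").replace("/", "_") + ".csv"


def write_rows(out_path, rows):
    """Write an iterable of rows (each an iterable of values) as CSV."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def hdf5_to_csv(h5_path, out_dir):
    """Dump every 1-D or 2-D numeric dataset in an HDF5 file to its own CSV.

    Returns the list of CSV paths written. Datasets with other shapes or
    non-numeric dtypes are skipped, since their mapping to rows is unclear.
    """
    import h5py  # third-party; imported lazily so the helpers above work without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    def visit(name, obj):
        # Groups and unsupported datasets are ignored; returning None
        # tells h5py to keep walking the file.
        if not isinstance(obj, h5py.Dataset):
            return
        if obj.ndim not in (1, 2) or obj.dtype.kind not in "biuf":
            return
        rows = obj[()].tolist()  # read the whole dataset into memory
        if obj.ndim == 1:
            rows = [[value] for value in rows]  # one value per CSV row
        target = out_dir / csv_name(name)
        write_rows(target, rows)
        written.append(target)

    with h5py.File(h5_path, "r") as f:
        f.visititems(visit)
    return written
```

Calling `hdf5_to_csv("input.h5", "csv_out")` (both paths are placeholders) would produce one headerless CSV per dataset. Column names and the mapping onto the ingestion schema would still need to be added by hand once the file layout is known.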