thegraphnetwork / epigraphhub_py

Epigraphhub Python package
GNU General Public License v3.0

World bank data collection #149

Closed fccoelho closed 2 years ago

fccoelho commented 2 years ago

🚀 Feature Request

Develop a module for collecting public datasets from the World Bank. Here is one example: https://databank.worldbank.org/data/download/Gender_Stats_csv.zip Metadata is available through the API endpoint: https://api.worldbank.org/v2/sources/14/indicators
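As a rough starting point, such a module could look like the sketch below. It uses only the standard library. The function names are hypothetical, and two things are assumptions rather than facts taken from this issue: that the metadata endpoint accepts `format=json` with `page`/`per_page` parameters, and that it answers with a two-element list of `[paging_info, records]` whose records carry `id` and `name` keys. The layout of the CSV files inside the archive is not assumed; the helper only reads each header row.

```python
import csv
import io
import json
import zipfile
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.worldbank.org/v2"


def indicators_url(source_id: int, page: int = 1, per_page: int = 100) -> str:
    """Build the metadata URL for one page of a source's indicators."""
    # Assumption: the endpoint honours format/page/per_page query parameters.
    query = urlencode({"format": "json", "page": page, "per_page": per_page})
    return f"{API_ROOT}/sources/{source_id}/indicators?{query}"


def parse_indicator_page(payload):
    """Split a decoded JSON response into (page count, {id: name}).

    Assumes the [paging_info, records] layout described above.
    """
    paging, records = payload
    names = {rec["id"]: rec["name"] for rec in (records or [])}
    return int(paging["pages"]), names


def fetch_indicators(source_id: int) -> dict:
    """Walk every page of the metadata endpoint (needs network access)."""
    indicators, page, pages = {}, 1, 1
    while page <= pages:
        with urlopen(indicators_url(source_id, page), timeout=60) as resp:
            pages, names = parse_indicator_page(json.load(resp))
        indicators.update(names)
        page += 1
    return indicators


def csv_headers_from_zip(zip_bytes: bytes) -> dict:
    """Map each CSV member of a zip archive to its header row."""
    headers = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip non-CSV members of the archive
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                headers[name] = next(csv.reader(text), [])
    return headers
```

Calling `fetch_indicators(14)` would query the endpoint linked above, and `csv_headers_from_zip(urlopen(zip_url).read())` would list the columns of each CSV in a downloaded archive. Keeping the parsing separate from the network calls makes it possible to test the module without hitting the World Bank servers.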

🔈 Motivation

Several of these datasets are very relevant to public health analyses.

fccoelho commented 2 years ago

Here is a good filtered list of interesting datasets: https://datacatalog.worldbank.org/search?fq=(Resources%2Fany(res:res%2Fformat%20eq%20%27CSV%27)%20or%20Resources%2Fany(res:res%2Fformat%20eq%20%27EXCEL%27)%20or%20Resources%2Fany(res:res%2Fformat%20eq%20%27STATA%27)%20or%20Resources%2Fany(res:res%2Fformat%20eq%20%27VECTOR%20API%27)%20or%20Resources%2Fany(res:res%2Fformat%20eq%20%27ZIP%27))&q=gender
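To make explicit what that link filters on, here is a small sketch that decodes its query string with the standard library. The helper name `describe_catalog_search` is hypothetical, and the regular expression only assumes the `format eq '...'` pattern visible in the URL itself; nothing here calls the catalog.

```python
import re
from urllib.parse import parse_qs, urlsplit

# The search URL from the comment above, split across lines for readability.
CATALOG_URL = (
    "https://datacatalog.worldbank.org/search?fq=("
    "Resources%2Fany(res:res%2Fformat%20eq%20%27CSV%27)%20or%20"
    "Resources%2Fany(res:res%2Fformat%20eq%20%27EXCEL%27)%20or%20"
    "Resources%2Fany(res:res%2Fformat%20eq%20%27STATA%27)%20or%20"
    "Resources%2Fany(res:res%2Fformat%20eq%20%27VECTOR%20API%27)%20or%20"
    "Resources%2Fany(res:res%2Fformat%20eq%20%27ZIP%27))&q=gender"
)


def describe_catalog_search(url: str):
    """Return (keyword, resource formats) encoded in a catalog search URL."""
    params = parse_qs(urlsplit(url).query)  # also percent-decodes the values
    keyword = params["q"][0]
    formats = re.findall(r"format eq '([^']+)'", params["fq"][0])
    return keyword, formats


keyword, formats = describe_catalog_search(CATALOG_URL)
print(keyword, formats)
```

In other words, the list is a keyword search for "gender" restricted to datasets that offer at least one resource in a downloadable format.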

dcpcamara commented 2 years ago

@fccoelho there is an R package that handles downloading and processing World Bank data. It is called WDI and is hosted at https://cran.r-project.org/web/packages/WDI/index.html. Maybe we could use this package to automatically pull the datasets into our database? Or would you like us to develop our own functions for this?

fccoelho commented 2 years ago

@dcpcamara that's good, but first we must find out which datasets would be interesting to us and whether this R library can retrieve them. The filtered list I linked above is a good starting point.