thinkingmachines / geowrangler

🌏 A python package for wrangling geospatial datasets
https://geowrangler.thinkingmachin.es/
MIT License
47 stars 14 forks

Move dhs.py from geowrangler dir to geowrangler/datasets dir #244

Closed joshuacortez closed 1 month ago

joshuacortez commented 1 month ago

The dhs.py script seems out of place since it's not in the datasets subdirectory along with geofabrik.py, nightlights.py, and ookla.py. It might be good to move it there.

butchtm commented 1 month ago

Hi @joshuacortez -- our original rationale was that, unlike the other dataset-related modules, this module is not actually about downloading the DHS data but about manipulating it. Unlike the other datasets, DHS data is not publicly available, so we cannot provide utils for downloading it; instead, each user must request it from the source.

The utilities provided are for massaging the provided data in such a way as to make it easier to manage for downstream activities such as building training/validation data for ML models.

Despite this, @tm-danna-ang @joshuacortez @tm-jc-nacpil feel free to chime in if you feel strongly that it would be better to leave it standalone or to move it to the datasets module.

joshuacortez commented 1 month ago

Thanks for explaining the rationale, @butchtm!

I don't feel super strongly about it, but I just thought that dataset-specific scripts (regardless of whether they're just for download or also for manipulation) should live together in the same directory.

Curious to hear others' thoughts

tm-jc-nacpil commented 1 month ago

I also don't have strong feelings about it, but I think it's better to keep it as is, just in case it's being used by UNICEF/EAPRO -- changing the location will affect the import step and might break existing notebooks.
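To make the import concern concrete, here is a minimal sketch of how a module can move while the old import path keeps working through a thin shim that re-exports from the new location and emits a `DeprecationWarning`. This is for illustration only and is not something geowrangler does. The package name `demo_pkg` and the function `clean_records` are hypothetical stand-ins, built in a temporary directory so the example runs without geowrangler installed.

```python
import importlib
import sys
import tempfile
import textwrap
import warnings
from pathlib import Path

# Build a throwaway package `demo_pkg` (hypothetical stand-in for the real package).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
(pkg / "datasets").mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "datasets" / "__init__.py").write_text("")

# New home of the module after the (hypothetical) move.
(pkg / "datasets" / "dhs.py").write_text(textwrap.dedent('''
    def clean_records(rows):  # hypothetical helper, not a real API
        return [r for r in rows if r is not None]
'''))

# Old location kept as a thin shim so existing imports keep working.
(pkg / "dhs.py").write_text(textwrap.dedent('''
    import warnings

    warnings.warn(
        "demo_pkg.dhs has moved to demo_pkg.datasets.dhs",
        DeprecationWarning,
        stacklevel=2,
    )
    from demo_pkg.datasets.dhs import *  # re-export public names
'''))

sys.path.insert(0, str(root))

# Importing via the old path still works, but records a deprecation warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = importlib.import_module("demo_pkg.dhs")

new = importlib.import_module("demo_pkg.datasets.dhs")

# Both paths resolve to the same function object.
same_function = old.clean_records is new.clean_records
warned = any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Without a shim like this, any notebook that imports from the old path would raise `ModuleNotFoundError` after the move, which is the breakage described above.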

joshuacortez commented 1 month ago

ok let's keep it as is then!