breedfides / airflow-etl


Implement geocoding DAG #17

Open gannebamm opened 4 months ago

gannebamm commented 4 months ago

As discussed in the BreedFides working group, some datasets may not come with latitude and longitude coordinates but only with a geographic name, like 'Braunschweig' or 'Gatersleben'. Geocoding is the process of determining the latitude and longitude coordinates of a given geographic name.

see more info here: https://nominatim.org/

Since our development budget is limited, we cannot guarantee that we will provide such a service. Nonetheless, I invite @kuwet2k to try implementing a small example DAG within the current codebase. To test the functionality, the free Nominatim API shall be used: https://nominatim.org/release-docs/develop/api/Overview/

This is just a POC and not a service we will provide. Nonetheless, it can demonstrate how to extend the current BreedFides-ETL infrastructure for new needs. @arendd @feserm
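
To make the idea concrete, here is a minimal sketch of the lookup such a DAG task might perform. It is not taken from the BreedFides codebase: the function names and the `USER_AGENT` value are hypothetical, and it assumes the public Nominatim instance at `nominatim.openstreetmap.org`, whose usage policy asks for an identifying User-Agent and at most one request per second.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Nominatim instance is used for the POC.
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"
# Hypothetical identifier; the usage policy asks clients to identify themselves.
USER_AGENT = "breedfides-geocoding-poc/0.1"


def build_search_url(place_name):
    """Build a Nominatim /search URL that asks for the single best match."""
    params = {"q": place_name, "format": "jsonv2", "limit": 1}
    return NOMINATIM_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_first_hit(payload):
    """Return (lat, lon) as floats from a JSON response, or None if no match."""
    results = json.loads(payload)
    if not results:
        return None
    first = results[0]
    # Nominatim returns the coordinates as strings.
    return float(first["lat"]), float(first["lon"])


def geocode(place_name, timeout=10):
    """Look up a place name; performs one HTTP request against Nominatim."""
    request = urllib.request.Request(
        build_search_url(place_name),
        headers={"User-Agent": USER_AGENT},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_first_hit(response.read().decode("utf-8"))
```

Calling `geocode("Braunschweig")` would return a coordinate pair, or `None` when Nominatim finds no match. Inside Airflow, a function like this could be wrapped in a `PythonOperator` or an `@task`-decorated function, with a pause between calls to respect the rate limit.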

arendd commented 4 months ago

Thank you for your effort, @gannebamm. After the exchange with ProCorn, it seems that they unfortunately cannot guarantee geolocations for all fields, because they do not create the data themselves but extract it from diverse sources. Some sources can provide this information, some cannot. We should keep this in mind.