carpentries / amy

A web-based workshop administration application built using Django.
https://amy.carpentries.org
MIT License

Validate location country & lat/lon for events #2555

Open maneesha opened 12 months ago

maneesha commented 12 months ago

We have a LOT of workshops where there is a mismatch between the country and lat/lon in the location -- i.e., the lat/lon coordinates are not in the same country. I have a Python script I wrote to identify these mismatches and will share it here shortly. The Workshop Admin Team has a plan to review and correct this data.

Raising this issue here to recommend that we introduce validation: whenever location data (specifically country & lat/lon) is created or edited, AMY should check that the two match and, if they do not, ask the user to correct the data before saving the event.
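To make the proposal concrete, here is a minimal sketch of such a check, not AMY's actual code. The names `location_errors`, `CountryLookup` and `fake_lookup` are hypothetical; the lookup is passed in as a plain function so that any geocoder could be plugged in later.

```python
from typing import Callable, List, Optional

# Hypothetical type: any function mapping (lat, lon) to an ISO 3166-1 alpha-2
# code, or None when the point is not inside a country.
CountryLookup = Callable[[float, float], Optional[str]]


def location_errors(country: str, lat: float, lon: float,
                    lookup: CountryLookup) -> List[str]:
    """Return messages to show the user; an empty list means the data can be saved."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return [f"Coordinates ({lat}, {lon}) are out of range."]
    found = lookup(lat, lon)
    if found is None:
        return [f"Coordinates ({lat}, {lon}) are not inside any country."]
    if found.upper() != country.upper():
        return [f"Coordinates are in {found.upper()}, "
                f"but the event country is {country.upper()}."]
    return []


# Stand-in lookup for demonstration only; a real one would call a geocoder.
def fake_lookup(lat: float, lon: float) -> Optional[str]:
    return "BR" if (lat, lon) == (-11.927, -54.2562) else None
```

A form or model validator could call `location_errors` and refuse to save while the returned list is non-empty.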

maneesha commented 12 months ago

Marking this as medium priority because our high-priority list is getting big, but I would like to see this addressed soon because it impacts the data that we present.

maneesha commented 12 months ago

I have been using the geopy library to get the country from lat/lon. From there, we can compare the returned country with the country stored in AMY for that event, and raise an error if they do not match or if the lat/lon does not resolve to a country.

We may need some kind of override, for example when we run a workshop at McMurdo in Antarctica, which does not have a country designation, or when territories of countries are classified differently (like whether Puerto Rico is part of the United States).

>>> from geopy.geocoders import Nominatim
>>> def get_country_from_latlon(lat, lon):
...     # Reverse-geocode the point; Nominatim needs a user_agent string.
...     geolocator = Nominatim(user_agent="foo_bar")
...     location = geolocator.reverse((lat, lon))
...     try:
...         # Two-letter country code, upper-cased (e.g. 'BR')
...         return location.raw['address']['country_code'].upper()
...     except (AttributeError, KeyError):
...         # location is None (e.g. open ocean) or the address has no country_code
...         return ('Not a country', location)
...

>>> get_country_from_latlon(-11.927, -54.2562) # Random spot in Brazil
'BR'
>>> get_country_from_latlon(-77.842212688, 166.64834) # McMurdo base in Antarctica
('Not a country', Location(Thomas Williams, Hut Point Ridge Loop, Arrival Heights Lab, McMurdo Station, (-77.8431267, 166.6471052, 0.0)))
>>> get_country_from_latlon(-14.378860, -22.16097) # Random point in Atlantic Ocean 
('Not a country', None)
>>> get_country_from_latlon(18.38836, -66.54567) # Puerto Rico
'US'
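
The override idea could be sketched roughly as below. This is only an illustration: `NO_COUNTRY_OK`, `EQUIVALENT` and `country_matches` are hypothetical names, and the entries in the two tables are examples drawn from the cases above, not an agreed policy.

```python
from typing import Optional

# Hypothetical override tables; entries are illustrative only.
# Stored codes accepted even when the geocoder finds no country at all.
NO_COUNTRY_OK = {"AQ"}
# (stored code, geocoder code) pairs treated as equivalent.
EQUIVALENT = {("PR", "US")}


def country_matches(stored: str, geocoded: Optional[str]) -> bool:
    """Compare the stored country code with the geocoder's, honoring overrides."""
    stored = stored.upper()
    if geocoded is None:
        # No country came back; only acceptable for explicitly listed codes.
        return stored in NO_COUNTRY_OK
    geocoded = geocoded.upper()
    return stored == geocoded or (stored, geocoded) in EQUIVALENT
```

Here `geocoded` would be the string returned by `get_country_from_latlon`, or `None` whenever that function returns its `('Not a country', ...)` tuple.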

(Edit to add geopy link)

elichad commented 12 months ago

Thanks for identifying this problem @maneesha!

It looks like the country designation also depends on which geocoder you use from that package - Google Maps seems to recognise both Antarctica and Puerto Rico in the examples you shared. So we should be careful to align our choice of geocoder with our existing list of countries - we use django-countries, which in turn uses the ISO 3166-1 list.
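
One way to keep the two aligned, whichever geocoder is chosen, would be to normalise its output and accept only codes that appear in our own list. A minimal sketch follows; `KNOWN_CODES` and `normalize_country_code` are hypothetical names, and the set is a tiny illustrative subset standing in for the full list that would come from django-countries.

```python
from typing import Optional

# Illustrative subset only. Assumption: in AMY the full set of ISO 3166-1
# alpha-2 codes would be taken from django-countries instead.
KNOWN_CODES = {"AQ", "BR", "PR", "US"}


def normalize_country_code(raw: Optional[str]) -> Optional[str]:
    """Return an upper-case alpha-2 code if it is in KNOWN_CODES, else None."""
    if not raw:
        return None
    code = raw.strip().upper()
    return code if code in KNOWN_CODES else None
```

Anything that normalises to `None` would then be treated as "no recognised country" and could go through the same review or override path as the McMurdo case.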