miguelwon closed 4 years ago
If you have a bunch of extracted place names and you know which country they're in, you can use the Mordecai code that interfaces with Geonames to look them up directly. You should be able to run it against your custom Geonames index instead of the pre-built one, as long as everything is in the same format. Here's some code I've used when I wanted the coordinates of place names that I knew were cities in Syria:
from mordecai import Geoparser

# Assumes a running Elasticsearch Geonames index that Mordecai can reach.
geo = Geoparser()


def lookup_city(city, iso3c="SYR"):
    """
    Return the "best" Geonames entry for a city name.

    Queries the ES-Geonames gazetteer for the given city and country
    (Syria by default) and uses a set of rules to determine the best
    result to return. More accurate/precise feature codes are preferred.

    This code was taken from Halterman's (2019) Syria casualties working paper and
    designed for geolocating Shuhada casualty data.

    Parameters
    ----------
    city: str
        The name of the city to look up
    iso3c: str
        The three character country code

    Returns
    -------
    match: dict or None
        The single entry from Geonames that best matches the query, or None if no match at all.
    """
    res = geo.query_geonames_country(city, iso3c)
    res = res['hits']['hits']
    # first preference: populated places (cities, towns, administrative seats)
    match = [i for i in res if i['feature_code'] in ['PPL', 'PPLA', 'PPLC', 'PPLA2', 'PPLA3']]
    if match:
        if len(match) == 1:
            return match[0]
        else:
            # check_exact is a helper that is not part of this snippet;
            # it picks one entry when several candidates remain
            m = check_exact(city, match)
            return m
    else:
        # fall back to less precise features: sections of populated places,
        # localities, populated localities, and areas
        match = [i for i in res if i['feature_code'] in ['PPLX', 'LCTY', 'PPLL', 'AREA']]
        if match:
            m = check_exact(city, match)
            return m
        else:
            return None
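The snippet calls `check_exact` without defining it. Below is a minimal sketch of what such a helper could look like. It is a guess, not the original code. It assumes each Geonames entry is a dict with `name`, `asciiname` and `alternativenames` keys, and that falling back to the first candidate is acceptable when nothing matches exactly.

```python
def check_exact(placename, match):
    """
    Hypothetical helper: pick the candidate whose name matches exactly.

    Compares `placename` (case-insensitively) against each entry's name,
    ASCII name, and alternative names. If no entry matches exactly, falls
    back to the first candidate (an assumption, not the original behavior).
    Returns None when `match` is empty.
    """
    target = placename.strip().lower()
    for entry in match:
        # the alternative names may be stored as a list or as a
        # comma-separated string, so handle both
        alt = entry.get("alternativenames") or []
        if isinstance(alt, str):
            alt = alt.split(",")
        names = [entry.get("name", ""), entry.get("asciiname", "")] + list(alt)
        if target in (n.strip().lower() for n in names):
            return entry
    # no exact match: fall back to the first candidate, if any
    return match[0] if match else None
```

If you would rather get no result than a loose match, change the last line to return `None`.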
How can I use it for a very specific problem? I have my own spaCy model, I don't need the country model because I know all mentions are from one country, and I also have my own Geonames index.
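A rough sketch of how that setup could be wired together: run your own spaCy pipeline, keep the place-like entities, and send each one to a single-country lookup. The function name `geolocate_text`, the `lookup` argument and the default label set are made up for illustration. Only `doc.ents`, `ent.text`, `ent.label_` and `ent.start_char` are spaCy attributes.

```python
def geolocate_text(text, nlp, lookup, labels=("GPE", "LOC", "FAC")):
    """
    Hypothetical glue code: run an NER pipeline and look up each place.

    nlp:    a callable returning a spaCy-style Doc (e.g. your own model)
    lookup: a function mapping a place name to a gazetteer entry or None,
            e.g. a single-country Geonames lookup
    labels: entity labels treated as place names (adjust to your model)
    """
    doc = nlp(text)
    results = []
    for ent in doc.ents:
        # skip entities that are not place-like (people, organizations, ...)
        if ent.label_ not in labels:
            continue
        results.append({
            "text": ent.text,
            "start": ent.start_char,
            "geo": lookup(ent.text),  # may be None when nothing is found
        })
    return results
```

With a real pipeline you would pass your loaded spaCy model as `nlp` and a function like `lookup_city` as `lookup`. Because the country is fixed inside the lookup, the country model is never called.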