MPEDS / mpeds

Machine-learning Protest Event Data System
http://mpeds.github.io
MIT License

LocationCoder: Getting name of state/country when not a 'focus' #10

Open erleholgersen opened 7 years ago

erleholgersen commented 7 years ago

We currently get the name of the country that a city or state is in by searching the countries entry of the CLIFF focus results for a match on the field countryGeoNameId. In some cases, a city might be a focus of an article even though its country is not, in which case this procedure fails to return a name.

Better approaches could be to use the GeoNames API (http://www.geonames.org/) or to search the mentions rather than the CLIFF focus results. Note that the latter approach will require a way to determine whether a CLIFF entry is a country, as all cities/states within that country will have the same countryGeoNameId tag.

(We also have this problem when finding the name corresponding to a stateGeoNameId.)
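
A minimal sketch of the mentions-based approach, for discussion only. It assumes each CLIFF mention is a dict with an `id` and a `name` field alongside `countryGeoNameId`, and that a country's (or state's) own `id` is the value other entries carry in `countryGeoNameId` (or `stateGeoNameId`). Matching on the entry's own `id` avoids having to classify entries as countries, since cities and states share the tag but not the id. The function name `name_from_mentions` and the sample data are hypothetical, not part of the existing LocationCoder.

```python
def name_from_mentions(mentions, geoname_id):
    """Return the name of the mention whose own id equals geoname_id.

    Assumption: `mentions` is a list of dicts with "id" and "name" keys.
    Ids are compared as strings because the id and the *GeoNameId tags
    may not share a type (int vs. str).
    Returns None when no mention has that id.
    """
    target = str(geoname_id)
    for place in mentions:
        if str(place.get("id")) == target:
            return place.get("name")
    return None


# Illustrative data only: a city mention and its country mention.
sample_mentions = [
    {"id": 5128581, "name": "New York City",
     "countryGeoNameId": "6252001", "stateGeoNameId": "5128638"},
    {"id": 6252001, "name": "United States",
     "countryGeoNameId": "6252001"},
]

city = sample_mentions[0]
country_name = name_from_mentions(sample_mentions, city["countryGeoNameId"])
state_name = name_from_mentions(sample_mentions, city["stateGeoNameId"])
```

The same lookup covers the stateGeoNameId case. It only helps when the country or state is itself mentioned in the article; here the state is not among the mentions, so `state_name` is `None`, and a GeoNames lookup would still be needed as a fallback.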