somnathrakshit / geograpy3

Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.
https://geograpy3.readthedocs.io
Apache License 2.0
122 stars, 12 forks

United Kingdom not recognised as a country. #49

Closed: berengamble closed this issue 3 years ago

berengamble commented 3 years ago

United Kingdom isn't matched as a country.

>>> import geograpy
>>> geograpy.get_geoPlace_context(text='United Kingdom').countries
[]
>>> geograpy.get_geoPlace_context(text='UK').countries
[]
>>> geograpy.get_geoPlace_context(text='Great Britain').countries
[]
>>> geograpy.get_geoPlace_context(text='GB').countries
[]
>>> geograpy.get_geoPlace_context(text='United States').countries
['United States']

Expected behavior: Should match some or all variations of United Kingdom.

Version: geograpy3 0.1.27

WolfgangFahl commented 3 years ago

The NLP part is failing. As a workaround, you might want to use the PlaceContext directly or the new locator interface. Please stay tuned for the upcoming release. The test below compares both lookups:

    def testIssue49(self):
        '''
        country recognition
        '''
        texts=['United Kingdom','UK','Great Britain','GB','United States']
        # lookup via the NLP-based entry point (the failing path)
        print("lookup with geograpy.get_geoPlace_context")
        for text in texts:
            countries=geograpy.get_geoPlace_context(text=text).countries
            print(f"{text}:{countries}")
        # lookup via PlaceContext directly, bypassing the NLP step
        print("lookup with PlaceContext")
        for text in texts:
            pc=PlaceContext([text])
            pc.set_countries()
            print(f"{text}:{pc.countries}")

Starting test testIssue49, debug=False ...
lookup with geograpy.get_geoPlace_context
United Kingdom:['United States of America']
UK:[]
Great Britain:['United Kingdom', 'United States of America']
GB:[]
United States:['United States of America']
lookup with PlaceContext
United Kingdom:['United Kingdom']
UK:['United Kingdom']
Great Britain:['United Kingdom']
GB:['United Kingdom']
United States:['United States of America']
----------------------------------------------------------------------
Ran 1 test in 2.740s
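
The workaround suggested above (calling PlaceContext directly rather than going through the NLP step) can be wrapped in a small helper. This is only a sketch: the helper name `countries_via_place_context` and its `place_context_cls` parameter are hypothetical, and the import path `geograpy.places` is an assumption about the package layout that may differ between geograpy3 releases. The calls on PlaceContext itself (`PlaceContext([text])`, `set_countries()`, `.countries`) are the ones used in the test above.

```python
def countries_via_place_context(text, place_context_cls=None):
    """Return the countries matched for a single place name.

    Hypothetical helper: it passes the name straight to PlaceContext
    and skips the NLP extraction step that this issue reports as failing.
    """
    if place_context_cls is None:
        # Assumption: PlaceContext is importable from geograpy.places.
        # The import is deferred so the helper can also be exercised
        # with a stand-in class when geograpy3 is not installed.
        from geograpy.places import PlaceContext
        place_context_cls = PlaceContext

    # Same three steps as in testIssue49 above.
    pc = place_context_cls([text])
    pc.set_countries()
    return pc.countries
```

Called as `countries_via_place_context('UK')`, this follows the same path as the "lookup with PlaceContext" loop in the test output above. The optional second argument exists only so the helper can be tried with a stub class.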