NRGI / resource-projects-etl

ETL processes for rp.org
GNU General Public License v2.0

pycountrycode not handling 'Namibia' correctly #55

timgdavies closed 8 years ago

timgdavies commented 8 years ago

For some reason, the version of pycountrycode we are using does not seem to give an ISO code for Namibia, and instead only returns the country name.

Tested at console with:

from countrycode.countrycode import countrycode
print(countrycode(codes='Namibia', origin='country_name', target='iso2c'))

I've not yet checked if we should upgrade versions - but the data here which drives pycountrycode appears to be at least a year old, and includes Namibia.

timgdavies commented 8 years ago

@bjwebb Any chance you can have a dig into this if time allows? It's not vital (as I don't think we're dealing with much Namibia data right now) but it would be useful to see if there is a quick fix here.

Bjwebb commented 8 years ago

Looking at pip, the most recent released version 0.2 is from 2013.

Looks like Namibia is in there in 0.2, but they use NA for n/a, which is also Namibia's ISO 3166-1 alpha-2 code, so the lookup breaks!!!
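The collision between NA as a missing-value marker and NA as Namibia's code can be illustrated with a minimal sketch. This is not pycountrycode's actual loader: the sample table, the `load_naive` and `lookup` helpers, and the made-up "Atlantis" row are all hypothetical, and the fall-back-to-name behaviour is only assumed in order to mirror the symptom reported above.

```python
import csv
import io

# Hypothetical miniature country table. "NA" marks a missing value here,
# but it is also Namibia's real ISO 3166-1 alpha-2 code.
SAMPLE = """country_name,iso2c
Namibia,NA
Atlantis,NA
Norway,NO
"""

# Same table, but with an empty field as the missing-value marker.
SAMPLE_FIXED = """country_name,iso2c
Namibia,NA
Atlantis,
Norway,NO
"""


def load_naive(text, missing="NA"):
    """Build a name -> code mapping, treating every `missing` token as 'no value'."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        code = row["iso2c"]
        table[row["country_name"]] = None if code == missing else code
    return table


def lookup(table, name):
    """Return the code, or fall back to the input name when no code is known."""
    return table.get(name) or name


print(lookup(load_naive(SAMPLE), "Namibia"))                    # Namibia
print(lookup(load_naive(SAMPLE_FIXED, missing=""), "Namibia"))  # NA
```

With the ambiguous marker, Namibia's genuine code is discarded along with the truly missing one; with a marker that cannot be a valid code, both cases are handled correctly.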

Most obvious solution is to install the latest version from git, e.g.

pip install --upgrade git+https://github.com/vincentarelbundock/pycountrycode.git@b608cd959b6285b7218d07ec574a4959104efa7f#egg=countrycode

or to replace the current countrycode line in requirements_taglifter.txt with

-e git+https://github.com/vincentarelbundock/pycountrycode.git@b608cd959b6285b7218d07ec574a4959104efa7f#egg=countrycode

There's a possible problem of an unreleased version being less stable, but if we're pinning to a specific commit, I think it should be stable enough for our needs.
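One way to keep a pin like this honest is to check that the requirements line really names a full commit hash rather than a branch or tag. The sketch below is only an illustration: `pinned_commit` is a hypothetical helper, not part of pip or of this repository, and it assumes the `git+URL@REF#egg=NAME` shape used in this thread.

```python
import re

# Matches "git+<url>@<40 hex chars>" followed by "#..." or end of line.
_PIN = re.compile(r"git\+\S+?@([0-9a-f]{40})(?:#|$)")


def pinned_commit(requirement_line):
    """Return the full commit hash a VCS requirement is pinned to, or None."""
    match = _PIN.search(requirement_line.strip())
    return match.group(1) if match else None


line = (
    "-e git+https://github.com/vincentarelbundock/pycountrycode.git"
    "@b608cd959b6285b7218d07ec574a4959104efa7f#egg=countrycode"
)
print(pinned_commit(line))
# A branch name is not a stable pin, so it is reported as None.
print(pinned_commit("-e git+https://example.invalid/repo.git@master#egg=pkg"))
```

A check like this could run over requirements_taglifter.txt in CI so that a moving reference does not slip in unnoticed.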

timgdavies commented 8 years ago

Thanks for exploring this and finding the problem.

Pinning to that specific commit sounds good.

Bjwebb commented 8 years ago

This failed the Travis tests, and I haven't had a chance to look at it since.

@timgdavies I saw you were switching from country names to country codes somewhere. Is this still useful?

timgdavies commented 8 years ago

Unfortunately this will still be an issue I think - so if we can get it working that would be very helpful.

idlemoor commented 8 years ago

The commit to fix the Namibia problem is https://github.com/vincentarelbundock/pycountrycode/commit/9819645b76abfdb670e0ecf757feaaa800ee2bbf -- as a quick and dirty fix do you want to try 9819645b76abfdb670e0ecf757feaaa800ee2bbf instead of b608cd9 to get it past Travis?

Having said that, someone opened an issue (which is still open) for this exact bug a couple of months after the commit that supposedly fixed it; make of that what you will. (In general it looks like pycountrycode is ripe for forking by a passing glory seeker)

timgdavies commented 8 years ago

I'm still seeing issues here. @Bjwebb can we get the fix in?

Bjwebb commented 8 years ago

@timgdavies Yep, hopefully.

Bjwebb commented 8 years ago

This should now be fixed (I've grabbed the commit @idlemoor suggests).

I've reloaded http://etl.nrgi-dev2.default.opendataservices.uk0.bigv.io/media/1838ed03-7c26-496e-90f3-b5a7034c5cc2/output.ttl and data shows at http://lodspeakr-live.nrgi-dev2.default.opendataservices.uk0.bigv.io/country/NA

timgdavies commented 8 years ago

Great. I've reloaded the country data as well, which means that page now displays the correct map location.

Thanks Ben.