Abigail / geography--countries

2-letter, 3-letter, and numerical codes for countries

Problem with utf8::all #1

Open gerhardj opened 10 years ago

gerhardj commented 10 years ago
perl -Mutf8::all -MGeography::Countries -e ''

outputs

utf8 "\xF4" does not map to Unicode at /usr/local/share/perl/5.14.2/Geography/Countries.pm line 83, <DATA> line 45.
utf8 "\xE9" does not map to Unicode at /usr/local/share/perl/5.14.2/Geography/Countries.pm line 83, <DATA> line 180.

whereas this is fine

perl -MGeography::Countries -Mutf8::all -e ''

Abigail commented 10 years ago

On Wed, May 28, 2014 at 03:24:30AM -0700, gerhardj wrote:

> perl -Mutf8::all -MGeography::Countries -e ''
>
> outputs
>
> utf8 "\xF4" does not map to Unicode at /usr/local/share/perl/5.14.2/Geography/Countries.pm line 83, <DATA> line 45.
> utf8 "\xE9" does not map to Unicode at /usr/local/share/perl/5.14.2/Geography/Countries.pm line 83, <DATA> line 180.
>
> whereas this is fine
>
> perl -MGeography::Countries -Mutf8::all -e ''

I don't really consider this to be a problem. Or at least, it's not something I should fix in Geography::Countries.

If one decides to turn on utf8 "everywhere", then dealing with the consequences is the responsibility of the person doing so. The module reads data from its DATA handle, and expects that data to be read in the encoding it is stored in. If one tells perl the data is UTF-8 when it isn't, one should not be surprised that perl complains.

As to why the order matters, I don't know -- perhaps the authors of utf8::all can help you there. The documentation of utf8::all says it works lexically, so it shouldn't influence the internals of Geography::Countries.

Having said that, the current development version of Geography::Countries no longer reads data from DATA, and should not suffer from this issue.

Regards,

Abigail

gerhardj commented 10 years ago

OK, thank you, the new version sounds good. Will there be a CPAN release of it?