Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.99k stars 557 forks source link

locale changes in 5.19.1 break LC_NUMERIC handling #13089

Closed p5pRT closed 10 years ago

p5pRT commented 11 years ago

Migrated from rt.perl.org#118795 (status was 'resolved')

Searchable as RT118795$

p5pRT commented 11 years ago

From @JohnPeacock

Created by jpeacock@cpan.org

A recent CPAN smoke report​:

http​://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74-6f4c697f9654

pointed to a regression in LC_NUMERIC that affects the version.pm tests.

I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).

John

Perl Info ``` Flags: category=core severity=medium Site configuration information for perl 5.14.2: Configured by Debian Project at Mon Mar 18 19:04:34 UTC 2013. Summary of my perl5 (revision 5 version 14 subversion 2) configuration: Platform: osname=linux, osvers=2.6.42-37-generic, archname=i686-linux-gnu-thread-multi-64int uname='linux aatxe 2.6.42-37-generic #58-ubuntu smp thu jan 24 15:28:10 utc 2013 i686 i686 i386 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i686-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.14 -Darchlib=/usr/lib/perl/5.14 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.14.2 -Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.14.2 -des' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -g', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.6.3', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2 gnulibc_version='2.15' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector' Locally applied patches: @INC for perl 5.14.2: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 /usr/local/lib/site_perl . Environment for perl 5.14.2: HOME=/home/jpeacock LANG=en_US.UTF-8 LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/jpeacock/perl5/perlbrew/bin:/home/jpeacock/bin:/home/jpeacock/working/android-sdk-linux/platform-tools/:/usr/local/bin:/home/jpeacock/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games PERLBREW_BASHRC_VERSION=0.64 PERLBREW_HOME=/home/jpeacock/.perlbrew PERLBREW_PATH=/home/jpeacock/perl5/perlbrew/bin PERLBREW_ROOT=/home/jpeacock/perl5/perlbrew PERLBREW_VERSION=0.27 PERL_BADLANG (unset) SHELL=/bin/bash ```
p5pRT commented 11 years ago

From @JohnPeacock

test.pl

p5pRT commented 11 years ago

From @jkeenan

On Sun Jul 07 17​:17​:00 2013\, john.peacock@​havurah-software.org wrote​:

This is a bug report for perl from jpeacock@​cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.

----------------------------------------------------------------- [Please describe your issue here]

A recent CPAN smoke report​:

http​://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654

pointed to a regression in LC_NUMERIC that affects the version.pm tests.

I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).

John

Is it possible for you to test this against blead (or git bisect)?

I would suggest at this commit​: 3a910aa05dd92ca95ba9356954fc38a67acba3f0

And then again at this commit​: 9a2ee86b20e06a3f64ee243168245198de7e07e4

In between\, there were several commits where 'locale' appeared in the commit message.

Thank you very much. Jim Keenan

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @jkeenan

On Sun Jul 07 17​:58​:53 2013\, jkeenan wrote​:

On Sun Jul 07 17​:17​:00 2013\, john.peacock@​havurah-software.org wrote​:

This is a bug report for perl from jpeacock@​cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.

----------------------------------------------------------------- [Please describe your issue here]

A recent CPAN smoke report​:

http​://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654

pointed to a regression in LC_NUMERIC that affects the version.pm tests.

I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).

John

Is it possible for you to test this against blead (or git bisect)?

I would suggest at this commit​: 3a910aa05dd92ca95ba9356954fc38a67acba3f0

And then again at this commit​: 9a2ee86b20e06a3f64ee243168245198de7e07e4

In between\, there were several commits where 'locale' appeared in the commit message.

Better still\, re-test at the break between PASS and FAIL below​:

e92587081904f9cc586bf4b6898fec2187a5a95a PASS b127e37e51c21b0a36755dcd19811be931a03d83 PASS 68e8f474bc686a86c064b695b9c7400313d7af65 FAIL c1b30ad9d14ad1617d1aab860bdd5cf4ac66833a FAIL 3a910aa05dd92ca95ba9356954fc38a67acba3f0 FAIL 9a2ee86b20e06a3f64ee243168245198de7e07e4 FAIL

p5pRT commented 11 years ago

From @khwilliamson

On 07/07/2013 06​:58 PM\, James E Keenan via RT wrote​:

On Sun Jul 07 17​:17​:00 2013\, john.peacock@​havurah-software.org wrote​:

This is a bug report for perl from jpeacock@​cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.

----------------------------------------------------------------- [Please describe your issue here]

A recent CPAN smoke report​:

http​://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654

pointed to a regression in LC_NUMERIC that affects the version.pm tests.

I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).

John

Is it possible for you to test this against blead (or git bisect)?

I would suggest at this commit​: 3a910aa05dd92ca95ba9356954fc38a67acba3f0

And then again at this commit​: 9a2ee86b20e06a3f64ee243168245198de7e07e4

In between\, there were several commits where 'locale' appeared in the commit message.

Thank you very much. Jim Keenan

--- via perlbug​: queue​: perl5 status​: new https://rt-archive.perl.org/perl5/Ticket/Display.html?id=118795

If it is the same 07locale.t as is in the core\, 68e8f474bc686a86c064b695b9c7400313d7af65 broke that one. It was relying on buggy behavior. It was failing to 'use locale'. The updated version of that test file is attached; it includes new tests.

p5pRT commented 11 years ago

From @khwilliamson

#! /usr/local/perl -w # Before `make install' is performed this script should be runnable with # `make test'. After `make install' it should work as `perl test.pl'

#########################

use File​::Basename; use File​::Temp qw/tempfile/; use POSIX qw/locale_h/; use Test​::More tests => 9; use Config;

BEGIN {   use_ok('version'\, 0.9902); }

SKIP​: {   skip 'No locale testing for Perl \< 5.6.0'\, 8 if $] \< 5.006;   skip 'No locale testing without d_setlocale'\, 8 if(!$Config{d_setlocale});   # test locale handling   my $warning;

  use locale;

  local $SIG{__WARN__} = sub { $warning = $_[0] };

  my $ver = 1.23; # has to be floating point number   my $loc;   my $orig_loc = setlocale(LC_NUMERIC);   is ($ver\, '1.23'\, 'Not using locale yet');   while (\) {   chomp;   $loc = setlocale( LC_ALL\, $_);   last if localeconv()->{decimal_point} eq '\,';   }   skip 'Cannot test locale handling without a comma locale'\, 7   unless $loc and localeconv()->{decimal_point} eq '\,';

  diag ("Testing locale handling with $loc") unless $ENV{PERL_CORE};

  setlocale(LC_NUMERIC\, $loc);   ok ("$ver eq '1\,23'"\, "Using locale​: $loc");   $v = version->new($ver);   unlike($warning\, qr/Version string '1\,23' contains invalid data/\,   "Process locale-dependent floating point");   ok ($v == "1.23"\, "Locale doesn't apply to version objects");   ok ($v == $ver\, "Comparison to locale floating point");

  {   no locale;   ok ("$ver eq '1.23'"\, "Outside of scope of use locale");   }

  ok("\"$ver\"+1 gt 2.22" && \"$ver\"+1 lt 2.24"\,   "Can do math when radix is not a dot"); # [perl 115800]

  setlocale( LC_ALL\, $orig_loc); # reset this before possible skip   skip 'Cannot test RT#46921 with Perl \< 5.008'\, 1   if ($] \< 5.008);   skip 'Cannot test RT#46921 with pure Perl module'\, 1   if exists $INC{'version/vpp.pm'};   my ($fh\, $filename) = tempfile('tXXXXXXX'\, SUFFIX => '.pm'\, UNLINK => 1);   (my $package = basename($filename)) =~ s/\.pm$//;   print $fh \<\<"EOF"; package $package; use locale; use POSIX qw(locale_h); \$^W = 1; use version; setlocale (LC_ALL\, '$loc'); use version ; eval "use Socket 1.7"; setlocale( LC_ALL\, '$orig_loc'); 1; EOF   close $fh;

  eval "use lib '.'; use $package;";   unlike($warning\, qr"Version string '1\,7' contains invalid data"\,   'Handle locale action-at-a-distance');   }

__DATA__ af_ZA af_ZA.utf8 an_ES an_ES.utf8 az_AZ.utf8 be_BY be_BY.utf8 bg_BG bg_BG.utf8 br_FR br_FR@​euro br_FR.utf8 bs_BA bs_BA.utf8 ca_ES ca_ES@​euro ca_ES.utf8 cs_CZ cs_CZ.utf8 da_DK da_DK.utf8 de_AT de_AT@​euro de_AT.utf8 de_BE de_BE@​euro de_BE.utf8 de_DE de_DE@​euro de_DE.utf8 de_LU de_LU@​euro de_LU.utf8 el_GR el_GR.utf8 en_DK en_DK.utf8 es_AR es_AR.utf8 es_BO es_BO.utf8 es_CL es_CL.utf8 es_CO es_CO.utf8 es_EC es_EC.utf8 es_ES es_ES@​euro es_ES.utf8 es_PY es_PY.utf8 es_UY es_UY.utf8 es_VE es_VE.utf8 et_EE et_EE.iso885915 et_EE.utf8 eu_ES eu_ES@​euro eu_ES.utf8 fi_FI fi_FI@​euro fi_FI.utf8 fo_FO fo_FO.utf8 fr_BE fr_BE@​euro fr_BE.utf8 fr_CA fr_CA.utf8 fr_CH fr_CH.utf8 fr_FR fr_FR@​euro fr_FR.utf8 fr_LU fr_LU@​euro fr_LU.utf8 gl_ES gl_ES@​euro gl_ES.utf8 hr_HR hr_HR.utf8 hu_HU hu_HU.utf8 id_ID id_ID.utf8 is_IS is_IS.utf8 it_CH it_CH.utf8 it_IT it_IT@​euro it_IT.utf8 ka_GE ka_GE.utf8 kk_KZ kk_KZ.utf8 kl_GL kl_GL.utf8 lt_LT lt_LT.utf8 lv_LV lv_LV.utf8 mk_MK mk_MK.utf8 mn_MN mn_MN.utf8 nb_NO nb_NO.utf8 nl_BE nl_BE@​euro nl_BE.utf8 nl_NL nl_NL@​euro nl_NL.utf8 nn_NO nn_NO.utf8 no_NO no_NO.utf8 oc_FR oc_FR.utf8 pl_PL pl_PL.utf8 pt_BR pt_BR.utf8 pt_PT pt_PT@​euro pt_PT.utf8 ro_RO ro_RO.utf8 ru_RU ru_RU.koi8r ru_RU.utf8 ru_UA ru_UA.utf8 se_NO se_NO.utf8 sh_YU sh_YU.utf8 sk_SK sk_SK.utf8 sl_SI sl_SI.utf8 sq_AL sq_AL.utf8 sr_CS sr_CS.utf8 sv_FI sv_FI@​euro sv_FI.utf8 sv_SE sv_SE.iso885915 sv_SE.utf8 tg_TJ tg_TJ.utf8 tr_TR tr_TR.utf8 tt_RU.utf8 uk_UA uk_UA.utf8 vi_VN vi_VN.tcvn wa_BE wa_BE@​euro wa_BE.utf8

p5pRT commented 11 years ago

From @JohnPeacock

On 07/07/2013 09​:20 PM\, Karl Williamson wrote​:

If it is the same 07locale.t as is in the core\, 68e8f474bc686a86c064b695b9c7400313d7af65 broke that one. It was relying on buggy behavior. It was failing to 'use locale'. The updated version of that test file is attached; it includes new tests.

It was relying on extremely long-standing /buggy/ behaviour then (going back to the introduction of locale in 5.6.0 in fact). The POD for POSIX makes no mention of having to 'use locale' in order for setlocale to function properly.

In fact\, I would question whether the revised behaviour is correct (though it may be more convenient for _you_). If the only way that setlocale() will function now is inside of 'use locale' then that should be POSIX's responsibility\, not the caller's.

John

p5pRT commented 11 years ago

From @Leont

On Mon\, Jul 8\, 2013 at 4​:42 AM\, John Peacock \john\.peacock@&#8203;havurah\-software\.org wrote​:

It was relying on extremely long-standing /buggy/ behaviour then (going back to the introduction of locale in 5.6.0 in fact). The POD for POSIX makes no mention of having to 'use locale' in order for setlocale to function properly.

perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»

In fact\, I would question whether the revised behaviour is correct (though it may be more convenient for _you_). If the only way that setlocale() will function now is inside of 'use locale' then that should be POSIX's responsibility\, not the caller's.

The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.

Leon

p5pRT commented 11 years ago

From @JohnPeacock

On 07/08/2013 09​:20 AM\, Leon Timmermans wrote​:

perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»

That sentence doesn't mention setlocale at all\, which is a completely different topic from respecting the ENV locale. Indeed\, theexample code in perllocale for setlocale() itself does not include 'use locale'. Although\, to be fair\, the documentation for LC_NUMERIC does.

The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.

I am not arguing that the old behaviour was completely correct. I am arguing that POSIX​::setlocale() has been changed without warning. Either make POSIX enable 'use locale' globally or load it automatically when locale_h is imported (or better yet only just prior to setlocale/localconv being called).

I do not think that adding documentation to POSIX is more than a thin layer of tissue paper over the real issue. setlocale() has operated with the old (albeit broken) behaviour for 13 years; merely adding documentation is not sufficient\, IMNSHO. I would even consider it proper to emit a fatal error if setlocale() is called outside of a 'use locale' block. Silent changes in behaviour are not useful to anyone.

John

p5pRT commented 11 years ago

From @khwilliamson

On 07/08/2013 04​:03 PM\, John Peacock wrote​:

On 07/08/2013 09​:20 AM\, Leon Timmermans wrote​:

perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»

That sentence doesn't mention setlocale at all\, which is a completely different topic from respecting the ENV locale. Indeed\, theexample code in perllocale for setlocale() itself does not include 'use locale'. Although\, to be fair\, the documentation for LC_NUMERIC does.

perllocale also says\, "If you want a Perl application to process and present your data according to a particular locale\, the application code should include the use locale pragma"

The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.

I am not arguing that the old behaviour was completely correct. I am arguing that POSIX​::setlocale() has been changed without warning. Either make POSIX enable 'use locale' globally or load it automatically when locale_h is imported (or better yet only just prior to setlocale/localconv being called).

I do not think that adding documentation to POSIX is more than a thin layer of tissue paper over the real issue. setlocale() has operated with the old (albeit broken) behaviour for 13 years; merely adding documentation is not sufficient\, IMNSHO. I would even consider it proper to emit a fatal error if setlocale() is called outside of a 'use locale' block. Silent changes in behaviour are not useful to anyone.

John

First\, let me make sure there is no misunderstanding here.

The actual behavior of setlocale() has not changed. There were no code changes done with regard to this involving setlocale at all. You can still call setlocale() anywhere in your program\, within or outside the scope of a 'use locale'\, and the underlying locale that the program operates under is immediately changed to correspond. Any POSIX functions\, for example\, that interact with the OS that are called will return based on the changed locale.

What did change is that perl space code no longer pays attention to the LC_NUMERIC category outside 'use locale'. This is the way it has always worked\, AFAIK\, for LC_COLLATE and\, mostly\, LC_CTYPE\, and for some uses of LC_NUMERIC. LC_CTYPE was fixed to completely work this way in 5.14\, without any complaints.

It is also the way it has to work\, or else one gets creepy bugs at a distance. LC_NUMERIC now works like the other categories. It is no longer the outlier causing bugs. People do not expect their 'gt' and 'sort' ops to work differently\, nor do they expect their \w to match differently just because some module that gets loaded has done a setlocale. This is the only sane method of operation. It has long been settled that lexical scoping is a Good Thing™.

Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says

"Locale decimal point character no longer leaks outside of "use locale" scope

"This is actually a bug fix\, but some code has come to rely on the bug being present\, so this change is listed here. The current locale that the program is running under is not supposed to be visible to Perl code except within the scope of a "use locale". However\, until now under certain circumstances\, the character used for a decimal point (often a comma) leaked outside the scope. If your code is affected by this change\, simply add a "use locale"."

But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories. Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.

So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().

p5pRT commented 11 years ago

From @JohnPeacock

On 07/09/2013 12​:18 AM\, Karl Williamson wrote​:

Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says

Ah\, ha\, that's where it is. Putting LC_NUMERIC in that block would have made it much easier to spot using grep...

But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories.

There is no such documentation associated with the POSIX​::setlocale() function itself\, which is the only thing I thought to check. I had no idea I had to look in perllocale and locale and ultimately perl5191delta to understand what was going on. This is just as much action-at-a-distance as the old behaviour was\, and in some ways worse\, because there is very little way (short of raising a stink on p5p) that anyone would know about it. Most non-p5p'ers don't read perl*delta for fun and profit.

If you are going to go and fix an inconsistency with locale related code\, shouldn't you also document that in the one function in the Perl core where you can change locales in a running script?

Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.

If I call setlocale() with anything other than an empty string as the second parameter\, shouldn't I reasonably expect _something_ to happen? Or be warned that nothing is going to happen because I didn't add the undocumented 'use locale' line?

So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().

Thank you for the excellent summary of why you did what you did. I'm glad that this happened early in the development cycle so we can fix up the tests and improve the documentation before production code is involved.

John

p5pRT commented 11 years ago

From @khwilliamson

On 07/09/2013 05​:00 PM\, John Peacock wrote​:

On 07/09/2013 12​:18 AM\, Karl Williamson wrote​:

Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says

Ah\, ha\, that's where it is. Putting LC_NUMERIC in that block would have made it much easier to spot using grep...

But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories.

There is no such documentation associated with the POSIX​::setlocale() function itself\, which is the only thing I thought to check. I had no idea I had to look in perllocale and locale and ultimately perl5191delta to understand what was going on. This is just as much action-at-a-distance as the old behaviour was\, and in some ways worse\, because there is very little way (short of raising a stink on p5p) that anyone would know about it. Most non-p5p'ers don't read perl*delta for fun and profit.

That's unfortunate\, because it really is the only way we know of to communicate with our client base\, and we put quite a bit of effort into getting it right.

If you are going to go and fix an inconsistency with locale related code\, shouldn't you also document that in the one function in the Perl core where you can change locales in a running script?

If I was aware at all of the POSIX​::setlocale() documentation\, it was only dimly. I've always read perllocale\, which contains everything you need to know to use that function. So it did not occur to me to look at the POSIX pod\, as I've been cleaning up the locale documentation over the last couple of years. But you are certainly right\, and I have now pushed to blead (commit 6ea81ccfc852fdeb1f2b4a07ebfbb5ccf65f3239) some changes that I hope adequately address your concerns. The patch is attached to this email for your convenience. If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.

Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.

If I call setlocale() with anything other than an empty string as the second parameter\, shouldn't I reasonably expect _something_ to happen? Or be warned that nothing is going to happen because I didn't add the undocumented 'use locale' line?

But actually something did happen; perhaps not what someone is expecting given the long history of broken behavior. If the consensus on this list is that we should raise a warning when setlocale() is called outside the scope of "use locale"\, I would quickly implement it.

So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().

Thank you for the excellent summary of why you did what you did. I'm glad that this happened early in the development cycle so we can fix up the tests and improve the documentation before production code is involved.

John

p5pRT commented 11 years ago

From @khwilliamson

0001-Better-document-setlocale-use-locale-interactions.patch ```diff From 3c4124e7d0b2d93791e491966e851989e38c0aeb Mon Sep 17 00:00:00 2001 From: Karl Williamson Date: Tue, 9 Jul 2013 18:27:45 -0600 Subject: [PATCH] Better document setlocale, "use locale" interactions The POSIX::setlocale() documentation failed to mention that "use locale" is also usually necessary. --- ext/POSIX/lib/POSIX.pod | 12 +++++++++++- pod/perllocale.pod | 4 +++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/ext/POSIX/lib/POSIX.pod b/ext/POSIX/lib/POSIX.pod index d5e93d0..ff201b8 100644 --- a/ext/POSIX/lib/POSIX.pod +++ b/ext/POSIX/lib/POSIX.pod @@ -727,6 +727,9 @@ for creating hard links into files, see L. Get numeric formatting information. Returns a reference to a hash containing the current locale formatting values. +Users of this function should also read L, where there is a +discussion devoted to this function +(L). Here is how to query the database for the B (Deutsch or German) locale. @@ -1082,7 +1085,14 @@ see L. =item setlocale -Modifies and queries program's locale. The following examples assume +Modifies and queries the program's underlying locale. Users of this +function should also read L, where there is a discussion +devoted to this function (L). +Note that Perl itself is almost entirely unaffected by the locale +except within the scope of S>. (Exceptions are listed +in L.) + +The following examples assume use POSIX qw(setlocale LC_ALL LC_CTYPE); diff --git a/pod/perllocale.pod b/pod/perllocale.pod index 17796ca..d15d4f0 100644 --- a/pod/perllocale.pod +++ b/pod/perllocale.pod @@ -319,7 +319,9 @@ C function: # LC_CTYPE -- explained below # (Showing the testing for success/failure of operations is # omitted in these examples to avoid distracting from the main - # point) + # point; also, any real work would almost certainly have to be + # done within the scope of "use locale".) + use POSIX qw(locale_h); # query and save the old locale -- 1.8.1.3 ```
p5pRT commented 11 years ago

From tchrist@perl.com

Thanks\, Karl.

I do (think!) I understand how it works now\, but if you had asked me five or ten years ago\, I don't know that I would have known the right answer.

--tom

p5pRT commented 11 years ago

From @JohnPeacock

On 07/09/2013 09​:01 PM\, karl williamson via RT wrote​:

If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.

The patch looks good\, though I would have explicitly put 'use locale' in the example code and explained when or why you _wouldn't_ need it\, instead of the other way around. For that matter\, I think that the examples in perllocale should also have 'use locale' explicitly listed for precisely the same reasons. Yes\, the previous paragraph explains that Perl will ignore locales without it\, but if you are using setlocale() for non-trivial purposes\, you need to have 'use locale' in force. Perhaps adding a lexical scope after the initial setlocale() to emphasize that only within that block will Perl respect the locale.

I wonder whether the much better documentation from perllocale on using setlocale() should be in the POSIX POD instead and the references reversed. It is almost always better to maintain locality (no pun intended) between the documentation and the function's actual namespace\, rather than pointing at some other document for the important details.

John

p5pRT commented 11 years ago

From @khwilliamson

On 07/09/2013 07​:52 PM\, John Peacock wrote​:

On 07/09/2013 09​:01 PM\, karl williamson via RT wrote​:

If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.

Here is a patch to the patch.

The patch looks good\, though I would have explicitly put 'use locale' in the example code and explained when or why you _wouldn't_ need it\, instead of the other way around. For that matter\, I think that the examples in perllocale should also have 'use locale' explicitly listed for precisely the same reasons. Yes\, the previous paragraph explains that Perl will ignore locales without it\, but if you are using setlocale() for non-trivial purposes\, you need to have 'use locale' in force. Perhaps adding a lexical scope after the initial setlocale() to emphasize that only within that block will Perl respect the locale.

I changed the new text of the patch to just have a 'use locale'.' I don't understand what the other comments of yours are referring to. I've looked at the blead perllocale\, and I believe it has a 'use locale' everywhere it should have.

I wonder whether the much better documentation from perllocale on using setlocale() should be in the POSIX POD instead and the references reversed. It is almost always better to maintain locality (no pun intended) between the documentation and the function's actual namespace\, rather than pointing at some other document for the important details.

I strongly believe that there should be one pod that deals with all the intricacies of using locales. That pod then would be perllocale. POSIX.pod then should have a quick reference for setlocale\, referring to the other one for more in-depth coverage. I changed the text slightly to emphasize that.

p5pRT commented 11 years ago

From @khwilliamson

0001-More-changes-to-perllocale-and-POSIX.pod-setlocale.patch ```diff From e139caed68876eb28f587d4066a723a38e11fff7 Mon Sep 17 00:00:00 2001 From: Karl Williamson Date: Sun, 14 Jul 2013 20:38:17 -0600 Subject: [PATCH 1/2] More changes to perllocale and POSIX.pod setlocale These address some concerns from John Peacock. --- ext/POSIX/lib/POSIX.pod | 14 +++++++++----- pod/perllocale.pod | 22 ++++++++++++---------- 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/ext/POSIX/lib/POSIX.pod b/ext/POSIX/lib/POSIX.pod index 325d92a..36d999b 100644 --- a/ext/POSIX/lib/POSIX.pod +++ b/ext/POSIX/lib/POSIX.pod @@ -726,9 +726,10 @@ for creating hard links into files, see L. =item localeconv Get numeric formatting information. Returns a reference to a hash -containing the current locale formatting values. -Users of this function should also read L, where there is a -L. +containing the current locale formatting values. Users of this function +should also read L, which provides a comprehensive +discussion of Perl locale handling, including +L. Here is how to query the database for the B (Deutsch or German) locale. @@ -1085,8 +1086,11 @@ see L. =item setlocale Modifies and queries the program's underlying locale. Users of this -function should also read L, where there is a -L. +function should read L, whch provides a comprehensive +discussion of Perl locale handling, knowledge of which is necessary to +properly use this function. It contains +L. +The discussion here is merely a summary reference for C. Note that Perl itself is almost entirely unaffected by the locale except within the scope of S>. (Exceptions are listed in L.) diff --git a/pod/perllocale.pod b/pod/perllocale.pod index c29c9af..e4b86a4 100644 --- a/pod/perllocale.pod +++ b/pod/perllocale.pod @@ -183,8 +183,9 @@ current locale is that which was determined by the L in effect at the start of the program, except that C> is always initialized to the C locale (mentioned under L). -If there is no valid environment, the current locale is undefined. It -is likely, but not necessarily, the "C" locale. +If there is no valid environment, the current locale is whatever the +system default has been set to. It is likely, but not necessarily, the +"C" locale. The operations that are affected by locale are: @@ -319,10 +320,11 @@ C function: # LC_CTYPE -- explained below # (Showing the testing for success/failure of operations is # omitted in these examples to avoid distracting from the main - # point; also, any real work would almost certainly have to be - # done within the scope of "use locale".) + # point use POSIX qw(locale_h); + use locale; + my $old_locale; # query and save the old locale $old_locale = setlocale(LC_CTYPE); @@ -371,14 +373,13 @@ return to the default that was in force when Perl started up: changes to the environment made by the application after startup may or may not be noticed, depending on your system's C library. -If the second argument does not correspond to a valid locale, the locale -for the category is not changed, and the function returns C. - Note that Perl ignores the current C and C locales within the scope of a C. If C fails for some reason (for example an attempt to set -to a locale unknown to the system), C is returned. +to a locale unknown to the system), the locale for the category is not +changed, and the function returns C. + For further information about the categories, consult L. @@ -726,8 +727,9 @@ of your operating system to the next. In short, don't call C directly: let Perl do it for you. Note: C isn't shown in some of these examples because it isn't -needed: C and C exist only to generate locale-dependent -results, and so always obey the current C locale. +needed: C and C are POSIX functions +which use the standard system-supplied C functions that +always obey the current C locale. =head2 Category LC_CTYPE: Character Types -- 1.8.1.3 ```
p5pRT commented 10 years ago

From @khwilliamson

I left this ticket open to see if something would come up\, but now 8 months later\, nothing more has\, so am closing it

-- Karl Williamson

p5pRT commented 10 years ago

@khwilliamson - Status changed from 'open' to 'resolved'