Closed p5pRT closed 10 years ago
A recent CPAN smoke report:
http://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74-6f4c697f9654
pointed to a regression in LC_NUMERIC that affects the version.pm tests.
I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).
John
On Sun Jul 07 17:17:00 2013\, john.peacock@havurah-software.org wrote:
This is a bug report for perl from jpeacock@cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.
----------------------------------------------------------------- [Please describe your issue here]
A recent CPAN smoke report:
http://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654
pointed to a regression in LC_NUMERIC that affects the version.pm tests.
I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).
John
Is it possible for you to test this against blead (or git bisect)?
I would suggest at this commit: 3a910aa05dd92ca95ba9356954fc38a67acba3f0
And then again at this commit: 9a2ee86b20e06a3f64ee243168245198de7e07e4
In between\, there were several commits where 'locale' appeared in the commit message.
Thank you very much. Jim Keenan
The RT System itself - Status changed from 'new' to 'open'
On Sun Jul 07 17:58:53 2013\, jkeenan wrote:
On Sun Jul 07 17:17:00 2013\, john.peacock@havurah-software.org wrote:
This is a bug report for perl from jpeacock@cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.
----------------------------------------------------------------- [Please describe your issue here]
A recent CPAN smoke report:
http://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654
pointed to a regression in LC_NUMERIC that affects the version.pm tests.
I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).
John
Is it possible for you to test this against blead (or git bisect)?
I would suggest at this commit: 3a910aa05dd92ca95ba9356954fc38a67acba3f0
And then again at this commit: 9a2ee86b20e06a3f64ee243168245198de7e07e4
In between\, there were several commits where 'locale' appeared in the commit message.
Better still\, re-test at the break between PASS and FAIL below:
e92587081904f9cc586bf4b6898fec2187a5a95a PASS b127e37e51c21b0a36755dcd19811be931a03d83 PASS 68e8f474bc686a86c064b695b9c7400313d7af65 FAIL c1b30ad9d14ad1617d1aab860bdd5cf4ac66833a FAIL 3a910aa05dd92ca95ba9356954fc38a67acba3f0 FAIL 9a2ee86b20e06a3f64ee243168245198de7e07e4 FAIL
On 07/07/2013 06:58 PM\, James E Keenan via RT wrote:
On Sun Jul 07 17:17:00 2013\, john.peacock@havurah-software.org wrote:
This is a bug report for perl from jpeacock@cpan.org\, generated with the help of perlbug 1.39 running under perl 5.14.2.
----------------------------------------------------------------- [Please describe your issue here]
A recent CPAN smoke report:
http://www.cpantesters.org/cpan/report/6efd9ed4-dee4-11e2-8b74- 6f4c697f9654
pointed to a regression in LC_NUMERIC that affects the version.pm tests.
I have a cut down test from the version.pm distro (that doesn't rely on anything except setlocale) attached. I don't believe that I am doing anything unsupported (and this test has worked up until 5.19.1 was released).
John
Is it possible for you to test this against blead (or git bisect)?
I would suggest at this commit: 3a910aa05dd92ca95ba9356954fc38a67acba3f0
And then again at this commit: 9a2ee86b20e06a3f64ee243168245198de7e07e4
In between\, there were several commits where 'locale' appeared in the commit message.
Thank you very much. Jim Keenan
--- via perlbug: queue: perl5 status: new https://rt-archive.perl.org/perl5/Ticket/Display.html?id=118795
If it is the same 07locale.t as is in the core\, 68e8f474bc686a86c064b695b9c7400313d7af65 broke that one. It was relying on buggy behavior. It was failing to 'use locale'. The updated version of that test file is attached; it includes new tests.
#! /usr/local/perl -w # Before `make install' is performed this script should be runnable with # `make test'. After `make install' it should work as `perl test.pl'
#########################
use File::Basename; use File::Temp qw/tempfile/; use POSIX qw/locale_h/; use Test::More tests => 9; use Config;
BEGIN { use_ok('version'\, 0.9902); }
SKIP: { skip 'No locale testing for Perl \< 5.6.0'\, 8 if $] \< 5.006; skip 'No locale testing without d_setlocale'\, 8 if(!$Config{d_setlocale}); # test locale handling my $warning;
use locale;
local $SIG{__WARN__} = sub { $warning = $_[0] };
my $ver = 1.23; # has to be floating point number my $loc; my $orig_loc = setlocale(LC_NUMERIC); is ($ver\, '1.23'\, 'Not using locale yet'); while (\) { chomp; $loc = setlocale( LC_ALL\, $_); last if localeconv()->{decimal_point} eq '\,'; } skip 'Cannot test locale handling without a comma locale'\, 7 unless $loc and localeconv()->{decimal_point} eq '\,';
diag ("Testing locale handling with $loc") unless $ENV{PERL_CORE};
setlocale(LC_NUMERIC\, $loc); ok ("$ver eq '1\,23'"\, "Using locale: $loc"); $v = version->new($ver); unlike($warning\, qr/Version string '1\,23' contains invalid data/\, "Process locale-dependent floating point"); ok ($v == "1.23"\, "Locale doesn't apply to version objects"); ok ($v == $ver\, "Comparison to locale floating point");
{ no locale; ok ("$ver eq '1.23'"\, "Outside of scope of use locale"); }
ok("\"$ver\"+1 gt 2.22" && \"$ver\"+1 lt 2.24"\, "Can do math when radix is not a dot"); # [perl 115800]
setlocale( LC_ALL\, $orig_loc); # reset this before possible skip skip 'Cannot test RT#46921 with Perl \< 5.008'\, 1 if ($] \< 5.008); skip 'Cannot test RT#46921 with pure Perl module'\, 1 if exists $INC{'version/vpp.pm'}; my ($fh\, $filename) = tempfile('tXXXXXXX'\, SUFFIX => '.pm'\, UNLINK => 1); (my $package = basename($filename)) =~ s/\.pm$//; print $fh \<\<"EOF"; package $package; use locale; use POSIX qw(locale_h); \$^W = 1; use version; setlocale (LC_ALL\, '$loc'); use version ; eval "use Socket 1.7"; setlocale( LC_ALL\, '$orig_loc'); 1; EOF close $fh;
eval "use lib '.'; use $package;"; unlike($warning\, qr"Version string '1\,7' contains invalid data"\, 'Handle locale action-at-a-distance'); }
__DATA__ af_ZA af_ZA.utf8 an_ES an_ES.utf8 az_AZ.utf8 be_BY be_BY.utf8 bg_BG bg_BG.utf8 br_FR br_FR@euro br_FR.utf8 bs_BA bs_BA.utf8 ca_ES ca_ES@euro ca_ES.utf8 cs_CZ cs_CZ.utf8 da_DK da_DK.utf8 de_AT de_AT@euro de_AT.utf8 de_BE de_BE@euro de_BE.utf8 de_DE de_DE@euro de_DE.utf8 de_LU de_LU@euro de_LU.utf8 el_GR el_GR.utf8 en_DK en_DK.utf8 es_AR es_AR.utf8 es_BO es_BO.utf8 es_CL es_CL.utf8 es_CO es_CO.utf8 es_EC es_EC.utf8 es_ES es_ES@euro es_ES.utf8 es_PY es_PY.utf8 es_UY es_UY.utf8 es_VE es_VE.utf8 et_EE et_EE.iso885915 et_EE.utf8 eu_ES eu_ES@euro eu_ES.utf8 fi_FI fi_FI@euro fi_FI.utf8 fo_FO fo_FO.utf8 fr_BE fr_BE@euro fr_BE.utf8 fr_CA fr_CA.utf8 fr_CH fr_CH.utf8 fr_FR fr_FR@euro fr_FR.utf8 fr_LU fr_LU@euro fr_LU.utf8 gl_ES gl_ES@euro gl_ES.utf8 hr_HR hr_HR.utf8 hu_HU hu_HU.utf8 id_ID id_ID.utf8 is_IS is_IS.utf8 it_CH it_CH.utf8 it_IT it_IT@euro it_IT.utf8 ka_GE ka_GE.utf8 kk_KZ kk_KZ.utf8 kl_GL kl_GL.utf8 lt_LT lt_LT.utf8 lv_LV lv_LV.utf8 mk_MK mk_MK.utf8 mn_MN mn_MN.utf8 nb_NO nb_NO.utf8 nl_BE nl_BE@euro nl_BE.utf8 nl_NL nl_NL@euro nl_NL.utf8 nn_NO nn_NO.utf8 no_NO no_NO.utf8 oc_FR oc_FR.utf8 pl_PL pl_PL.utf8 pt_BR pt_BR.utf8 pt_PT pt_PT@euro pt_PT.utf8 ro_RO ro_RO.utf8 ru_RU ru_RU.koi8r ru_RU.utf8 ru_UA ru_UA.utf8 se_NO se_NO.utf8 sh_YU sh_YU.utf8 sk_SK sk_SK.utf8 sl_SI sl_SI.utf8 sq_AL sq_AL.utf8 sr_CS sr_CS.utf8 sv_FI sv_FI@euro sv_FI.utf8 sv_SE sv_SE.iso885915 sv_SE.utf8 tg_TJ tg_TJ.utf8 tr_TR tr_TR.utf8 tt_RU.utf8 uk_UA uk_UA.utf8 vi_VN vi_VN.tcvn wa_BE wa_BE@euro wa_BE.utf8
On 07/07/2013 09:20 PM\, Karl Williamson wrote:
If it is the same 07locale.t as is in the core\, 68e8f474bc686a86c064b695b9c7400313d7af65 broke that one. It was relying on buggy behavior. It was failing to 'use locale'. The updated version of that test file is attached; it includes new tests.
It was relying on extremely long-standing /buggy/ behaviour then (going back to the introduction of locale in 5.6.0 in fact). The POD for POSIX makes no mention of having to 'use locale' in order for setlocale to function properly.
In fact\, I would question whether the revised behaviour is correct (though it may be more convenient for _you_). If the only way that setlocale() will function now is inside of 'use locale' then that should be POSIX's responsibility\, not the caller's.
John
On Mon\, Jul 8\, 2013 at 4:42 AM\, John Peacock \john\.peacock@​havurah\-software\.org wrote:
It was relying on extremely long-standing /buggy/ behaviour then (going back to the introduction of locale in 5.6.0 in fact). The POD for POSIX makes no mention of having to 'use locale' in order for setlocale to function properly.
perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»
In fact\, I would question whether the revised behaviour is correct (though it may be more convenient for _you_). If the only way that setlocale() will function now is inside of 'use locale' then that should be POSIX's responsibility\, not the caller's.
The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.
Leon
On 07/08/2013 09:20 AM\, Leon Timmermans wrote:
perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»
That sentence doesn't mention setlocale at all\, which is a completely different topic from respecting the ENV locale. Indeed\, theexample code in perllocale for setlocale() itself does not include 'use locale'. Although\, to be fair\, the documentation for LC_NUMERIC does.
The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.
I am not arguing that the old behaviour was completely correct. I am arguing that POSIX::setlocale() has been changed without warning. Either make POSIX enable 'use locale' globally or load it automatically when locale_h is imported (or better yet only just prior to setlocale/localconv being called).
I do not think that adding documentation to POSIX is more than a thin layer of tissue paper over the real issue. setlocale() has operated with the old (albeit broken) behaviour for 13 years; merely adding documentation is not sufficient\, IMNSHO. I would even consider it proper to emit a fatal error if setlocale() is called outside of a 'use locale' block. Silent changes in behaviour are not useful to anyone.
John
On 07/08/2013 04:03 PM\, John Peacock wrote:
On 07/08/2013 09:20 AM\, Leon Timmermans wrote:
perllocale does\, though «By default\, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations.»
That sentence doesn't mention setlocale at all\, which is a completely different topic from respecting the ENV locale. Indeed\, theexample code in perllocale for setlocale() itself does not include 'use locale'. Although\, to be fair\, the documentation for LC_NUMERIC does.
perllocale also says\, "If you want a Perl application to process and present your data according to a particular locale\, the application code should include the use locale pragma"
The old behavior is crazy\, especially the way the stringification sticks to the number even after the locale is out of scope. This has been discussed on this list previously.
I am not arguing that the old behaviour was completely correct. I am arguing that POSIX::setlocale() has been changed without warning. Either make POSIX enable 'use locale' globally or load it automatically when locale_h is imported (or better yet only just prior to setlocale/localconv being called).
I do not think that adding documentation to POSIX is more than a thin layer of tissue paper over the real issue. setlocale() has operated with the old (albeit broken) behaviour for 13 years; merely adding documentation is not sufficient\, IMNSHO. I would even consider it proper to emit a fatal error if setlocale() is called outside of a 'use locale' block. Silent changes in behaviour are not useful to anyone.
John
First\, let me make sure there is no misunderstanding here.
The actual behavior of setlocale() has not changed. There were no code changes done with regard to this involving setlocale at all. You can still call setlocale() anywhere in your program\, within or outside the scope of a 'use locale'\, and the underlying locale that the program operates under is immediately changed to correspond. Any POSIX functions\, for example\, that interact with the OS that are called will return based on the changed locale.
What did change is that perl space code no longer pays attention to the LC_NUMERIC category outside 'use locale'. This is the way it has always worked\, AFAIK\, for LC_COLLATE and\, mostly\, LC_CTYPE\, and for some uses of LC_NUMERIC. LC_CTYPE was fixed to completely work this way in 5.14\, without any complaints.
It is also the way it has to work\, or else one gets creepy bugs at a distance. LC_NUMERIC now works like the other categories. It is no longer the outlier causing bugs. People do not expect their 'gt' and 'sort' ops to work differently\, nor do they expect their \w to match differently just because some module that gets loaded has done a setlocale. This is the only sane method of operation. It has long been settled that lexical scoping is a Good Thing™.
Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says
"Locale decimal point character no longer leaks outside of "use locale" scope
"This is actually a bug fix\, but some code has come to rely on the bug being present\, so this change is listed here. The current locale that the program is running under is not supposed to be visible to Perl code except within the scope of a "use locale". However\, until now under certain circumstances\, the character used for a decimal point (often a comma) leaked outside the scope. If your code is affected by this change\, simply add a "use locale"."
But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories. Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.
So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().
On 07/09/2013 12:18 AM\, Karl Williamson wrote:
Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says
Ah\, ha\, that's where it is. Putting LC_NUMERIC in that block would have made it much easier to spot using grep...
But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories.
There is no such documentation associated with the POSIX::setlocale() function itself\, which is the only thing I thought to check. I had no idea I had to look in perllocale and locale and ultimately perl5191delta to understand what was going on. This is just as much action-at-a-distance as the old behaviour was\, and in some ways worse\, because there is very little way (short of raising a stink on p5p) that anyone would know about it. Most non-p5p'ers don't read perl*delta for fun and profit.
If you are going to go and fix an inconsistency with locale related code\, shouldn't you also document that in the one function in the Perl core where you can change locales in a running script?
Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.
If I call setlocale() with anything other than an empty string as the second parameter\, shouldn't I reasonably expect _something_ to happen? Or be warned that nothing is going to happen because I didn't add the undocumented 'use locale' line?
So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().
Thank you for the excellent summary of why you did what you did. I'm glad that this happened early in the development cycle so we can fix up the tests and improve the documentation before production code is involved.
John
On 07/09/2013 05:00 PM\, John Peacock wrote:
On 07/09/2013 12:18 AM\, Karl Williamson wrote:
Nor was this done without warning to programmers. Near the very beginning of perl5191delta\, under "Incompatible Changes"\, it says
Ah\, ha\, that's where it is. Putting LC_NUMERIC in that block would have made it much easier to spot using grep...
But perhaps you meant that there is no run-time warning or error about calling setlocale outside the scope of 'use locale'. Such a warning would only be appropriate if setlocale were called with LC_NUMERIC\, as there is no change in any behavior here for the other categories.
There is no such documentation associated with the POSIX::setlocale() function itself\, which is the only thing I thought to check. I had no idea I had to look in perllocale and locale and ultimately perl5191delta to understand what was going on. This is just as much action-at-a-distance as the old behaviour was\, and in some ways worse\, because there is very little way (short of raising a stink on p5p) that anyone would know about it. Most non-p5p'ers don't read perl*delta for fun and profit.
That's unfortunate\, because it really is the only way we know of to communicate with our client base\, and we put quite a bit of effort into getting it right.
If you are going to go and fix an inconsistency with locale related code\, shouldn't you also document that in the one function in the Perl core where you can change locales in a running script?
If I was aware at all of the POSIX::setlocale() documentation\, it was only dimly. I've always read perllocale\, which contains everything you need to know to use that function. So it did not occur to me to look at the POSIX pod\, as I've been cleaning up the locale documentation over the last couple of years. But you are certainly right\, and I have now pushed to blead (commit 6ea81ccfc852fdeb1f2b4a07ebfbb5ccf65f3239) some changes that I hope adequately address your concerns. The patch is attached to this email for your convenience. If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.
Further\, some uses of LC_NUMERIC prior to this change did obey scoping rules\, so the warning would be wrong in some cases even here. Also\, it is legitimate to call setlocale with no intention of ever doing 'use locale' if one is just using POSIX functions\, though that is unlikely.
If I call setlocale() with anything other than an empty string as the second parameter\, shouldn't I reasonably expect _something_ to happen? Or be warned that nothing is going to happen because I didn't add the undocumented 'use locale' line?
But actually something did happen; perhaps not what someone is expecting given the long history of broken behavior. If the consensus on this list is that we should raise a warning when setlocale() is called outside the scope of "use locale"\, I would quickly implement it.
So\, it's not that the behavior of the system has changed monolithically around setlocale\, 'use locale'. It's that things now behave more consistently across the possible parameters to setlocale().
Thank you for the excellent summary of why you did what you did. I'm glad that this happened early in the development cycle so we can fix up the tests and improve the documentation before production code is involved.
John
Thanks\, Karl.
I do (think!) I understand how it works now\, but if you had asked me five or ten years ago\, I don't know that I would have known the right answer.
--tom
On 07/09/2013 09:01 PM\, karl williamson via RT wrote:
If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.
The patch looks good\, though I would have explicitly put 'use locale' in the example code and explained when or why you _wouldn't_ need it\, instead of the other way around. For that matter\, I think that the examples in perllocale should also have 'use locale' explicitly listed for precisely the same reasons. Yes\, the previous paragraph explains that Perl will ignore locales without it\, but if you are using setlocale() for non-trivial purposes\, you need to have 'use locale' in force. Perhaps adding a lexical scope after the initial setlocale() to emphasize that only within that block will Perl respect the locale.
I wonder whether the much better documentation from perllocale on using setlocale() should be in the POSIX POD instead and the references reversed. It is almost always better to maintain locality (no pun intended) between the documentation and the function's actual namespace\, rather than pointing at some other document for the important details.
John
On 07/09/2013 07:52 PM\, John Peacock wrote:
On 07/09/2013 09:01 PM\, karl williamson via RT wrote:
If there is something you would like changed\, let me know\, or better\, submit a patch\, which I would promptly consider.
Here is a patch to the patch.
The patch looks good\, though I would have explicitly put 'use locale' in the example code and explained when or why you _wouldn't_ need it\, instead of the other way around. For that matter\, I think that the examples in perllocale should also have 'use locale' explicitly listed for precisely the same reasons. Yes\, the previous paragraph explains that Perl will ignore locales without it\, but if you are using setlocale() for non-trivial purposes\, you need to have 'use locale' in force. Perhaps adding a lexical scope after the initial setlocale() to emphasize that only within that block will Perl respect the locale.
I changed the new text of the patch to just have a 'use locale'.' I don't understand what the other comments of yours are referring to. I've looked at the blead perllocale\, and I believe it has a 'use locale' everywhere it should have.
I wonder whether the much better documentation from perllocale on using setlocale() should be in the POSIX POD instead and the references reversed. It is almost always better to maintain locality (no pun intended) between the documentation and the function's actual namespace\, rather than pointing at some other document for the important details.
I strongly believe that there should be one pod that deals with all the intricacies of using locales. That pod then would be perllocale. POSIX.pod then should have a quick reference for setlocale\, referring to the other one for more in-depth coverage. I changed the text slightly to emphasize that.
I left this ticket open to see if something would come up\, but now 8 months later\, nothing more has\, so am closing it
-- Karl Williamson
@khwilliamson - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#118795 (status was 'resolved')
Searchable as RT118795$