Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

lc() or uc() inside sort affect the return value. #8927

Closed p5pRT closed 17 years ago

p5pRT commented 17 years ago

Migrated from rt.perl.org#43207 (status was 'resolved')

Searchable as RT43207$

p5pRT commented 17 years ago

From @shlomif

Created by @shlomif

If you run​:

    perl -wle 'my $h = "Hello"; print for sort { lc($a) ? 0 : 0 } "$h", "";'

You'll get the output of​:

{{{{{ hello }}}}}

This indicates that the string "$h" passed to sort was changed by the lc($a) operation. It also happens with uc($a).

It doesn't seem to affect a plain $h, without the quotes.
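
For anyone who wants to check a given perl, here is a minimal sketch of the one-liner above as a script. The variable names are illustrative, and the check only assumes that a comparator which always returns 0 should leave the sorted values untouched.

```perl
use strict;
use warnings;

my $h = "Hello";

# The comparator calls lc($a) but always returns 0, so it should not
# influence the values that sort hands back.
my @sorted = sort { lc($a) ? 0 : 0 } "$h", "";

# Pick out the non-empty element without relying on the order in which
# sort returns elements that compare as equal.
my ($word) = grep { length } @sorted;

if ($word eq $h) {
    print "ok: sort returned '$word' unchanged\n";
}
else {
    print "not ok: sort returned '$word', expected '$h'\n";
}
```

On an affected perl such as the 5.8.8 build described below, the second branch would be expected to fire, since the report shows `hello` coming back.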

Perl Info

```
Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.8:

Configured by Mandriva at Mon Apr 30 11:27:53 EDT 2007.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.17-5mdv, archname=i386-linux
    uname='linux n5.mandriva.com 2.6.17-5mdv #1 smp wed sep 13 14:32:31 edt 2006 i686 intel(r) xeon(tm) cpu 2.80ghz gnulinux '
    config_args='-des -Dinc_version_list=5.8.7 5.8.7/i386-linux 5.8.6 5.8.6/i386-linux 5.8.5 5.8.4 5.8.3 5.8.2 5.8.1 5.8.0 5.6.1 5.6.0 -Darchname=i386-linux -Dcc=gcc -Doptimize=-O2 -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fomit-frame-pointer -march=i586 -mtune=generic -fasynchronous-unwind-tables -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr -Dsitebin=/usr/local/bin -Dsiteman1dir=/usr/local/share/man/man1 -Dsiteman3dir=/usr/local/share/man/man3 -Dman3ext=3pm -Dcf_by=Mandriva -Dmyhostname=localhost -Dperladmin=root@localhost -Dcf_email=root@localhost -Dd_dosuid -Ud_csh -Duseshrplib'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fomit-frame-pointer -march=i586 -mtune=generic -fasynchronous-unwind-tables',
    cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.8/i386-linux/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    Mandriva Linux patches

@INC for perl v5.8.8:
    /home/shlomi/apps/perl/modules/lib/perl5/site_perl/5.8.8//i386-linux
    /home/shlomi/apps/perl/modules/lib/perl5/site_perl/5.8.8/
    /home/shlomi/apps/perl/modules/lib/perl5/5.8.8/i386-linux
    /home/shlomi/apps/perl/modules/lib/perl5/5.8.8
    /usr/lib/perl5/5.8.8/i386-linux
    /usr/lib/perl5/5.8.8
    /usr/lib/perl5/site_perl/5.8.8/i386-linux
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.8/i386-linux
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl/5.8.7
    /usr/lib/perl5/vendor_perl/5.8.7/i386-linux
    /usr/lib/perl5/vendor_perl/5.8.6
    /usr/lib/perl5/vendor_perl/5.8.6/i386-linux
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl
    .

Environment for perl v5.8.8:
    HOME=/home/shlomi
    LANG=en_GB.UTF-8
    LANGUAGE=en_GB:en
    LC_ADDRESS=en_US.UTF-8
    LC_COLLATE=en_US.UTF-8
    LC_CTYPE=en_US.UTF-8
    LC_IDENTIFICATION=en_GB.UTF-8
    LC_MEASUREMENT=en_GB.UTF-8
    LC_MESSAGES=en_US.UTF-8
    LC_MONETARY=en_US.UTF-8
    LC_NAME=en_GB.UTF-8
    LC_NUMERIC=en_GB.UTF-8
    LC_PAPER=en_US.UTF-8
    LC_SOURCED=1
    LC_TELEPHONE=en_US.UTF-8
    LC_TIME=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/java/jdk1.5.0_09/bin:/home/shlomi/Download/unpack/graphics/fop/fop-0.93:/home/shlomi/apps/perl/modules/bin:/home/shlomi/apps/latemp/bin:/home/shlomi/apps/file/gringotts/bin:/home/shlomi/apps/gimageview/bin:/home/shlomi/apps/test/quadpres/bin:/home/shlomi/apps/docbook-builder/local/bin:/usr/local/bin:/bin:/usr/bin:/usr/games:/usr/lib/qt3//bin:/home/shlomi/bin:/usr/lib/ssh:/usr/lib/qt3//bin
    PERL5LIB=/home/shlomi/apps/perl/modules/lib/perl5/site_perl/5.8.8/:/home/shlomi/apps/perl/modules/lib/perl5/5.8.8
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 17 years ago

From @rgs

On 13/06/07, via RT Shlomi Fish <perlbug-followup@perl.org> wrote:

> If you run:
>
>     perl -wle 'my $h = "Hello"; print for sort { lc($a) ? 0 : 0 } "$h", "";'
>
> You'll get the output of:
>
> {{{{{ hello }}}}}
>
> This indicates that the string "$h" passed to sort was changed by the lc($a) operation. It also happens with uc($a).

I don't understand your report, but it looks like you want a stable sort. For that, look at the sort pragma.

p5pRT commented 17 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 17 years ago

From @Juerd

Rafael Garcia-Suarez wrote on 2007-06-13 15:52 (+0200):

> >     perl -wle 'my $h = "Hello"; print for sort { lc($a) ? 0 : 0 } "$h", "";'
> >     hello
>
> I don't understand your report, but it looks like you want a stable sort. For that, look at the sort pragma.

"$h" should be Hello, but hello is printed. Apparently lc($a) mutated $h, although that effect is not there if $h is used without quotes.

--

Kind regards,

  juerd waalboer: perl hacker <juerd@juerd.nl> <http://juerd.nl/sig>
  convolution: ict solutions and consultancy <sales@convolution.nl>

p5pRT commented 17 years ago

From @rgs

On 13/06/07, Juerd Waalboer <juerd@convolution.nl> wrote:

> Rafael Garcia-Suarez wrote on 2007-06-13 15:52 (+0200):
>
> > >     perl -wle 'my $h = "Hello"; print for sort { lc($a) ? 0 : 0 } "$h", "";'
> > >     hello
> >
> > I don't understand your report, but it looks like you want a stable sort. For that, look at the sort pragma.
>
> "$h" should be Hello, but hello is printed. Apparently lc($a) mutated $h, although that effect is not there if $h is used without quotes.

Aah, yes. Very odd.

p5pRT commented 17 years ago

From a.r.ferreira@gmail.com

On 6/13/07, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> On 13/06/07, via RT Shlomi Fish <perlbug-followup@perl.org> wrote:
>
> > If you run:
> >
> >     perl -wle 'my $h = "Hello"; print for sort { lc($a) ? 0 : 0 } "$h", "";'
> >
> > You'll get the output of:
> >
> > {{{{{ hello }}}}}
> >
> > This indicates that the string "$h" passed to sort was changed by the lc($a) operation. It also happens with uc($a).
>
> I don't understand your report, but it looks like you want a stable sort. For that, look at the sort pragma.

Nope. The issue was lc() changing its argument. The example with sort was only a way to demonstrate it.

Here's another​:

    $ perl -e ' $a = "Hello"; for ("$a") { lc $_; print }'
    hello

When the "$a" gets aliased, it is being changed in place.

It does not happen without quotes or with constants:

    $ perl -e ' $a = "Hello"; for ($a) { lc $_; print }'
    Hello

    $ perl -e ' for ("Hello") { lc $_; print }'
    Hello
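
The three cases above can be put side by side in one script. This is only a sketch: the `%seen` hash and its labels are illustrative, and the lc() result is assigned to a lexical so that the script stays quiet under `use warnings`.

```perl
use strict;
use warnings;

my $s = "Hello";
my %seen;    # what each loop observes in $_ after calling lc()

# Interpolated copy: "$s" is a temporary, and $_ is aliased to it.
for ("$s") { my $lower = lc $_; $seen{'"$s"'} = $_ }

# Plain variable: $_ is aliased to $s itself.
for ($s) { my $lower = lc $_; $seen{'$s'} = $_ }

# String constant.
for ("Hello") { my $lower = lc $_; $seen{'"Hello"'} = $_ }

for my $case (sort keys %seen) {
    my $status = $seen{$case} eq "Hello" ? "ok" : "not ok";
    print "$status: for ($case) saw '$seen{$case}' after lc\n";
}
```

Going by the one-liners above, only the `"$s"` case would be expected to report `not ok` on an affected perl.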

p5pRT commented 17 years ago

From @rgs

On 13/06/07, Adriano Ferreira <a.r.ferreira@gmail.com> wrote:

> Nope. The issue was lc() changing its argument. The example with sort was only a way to demonstrate it.
>
> Here's another:
>
>     $ perl -e ' $a = "Hello"; for ("$a") { lc $_; print }'
>     hello

Yes, it seems that the logic in pp_lc is flawed:

    if (SvPADTMP(source) && !SvREADONLY(source) && !SvAMAGIC(source)
        && !DO_UTF8(source)) {
        /* We can convert in place.  */
        dest = source;

Maybe we should add an SvTEMP check here too...

p5pRT commented 17 years ago

@rgs - Status changed from 'open' to 'resolved'

p5pRT commented 17 years ago

From @rgs

On 13/06/07, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> On 13/06/07, Adriano Ferreira <a.r.ferreira@gmail.com> wrote:
>
> > Nope. The issue was lc() changing its argument. The example with sort was only a way to demonstrate it.
> >
> > Here's another:
> >
> >     $ perl -e ' $a = "Hello"; for ("$a") { lc $_; print }'
> >     hello
>
> Yes, it seems that the logic in pp_lc is flawed:
>
>     if (SvPADTMP(source) && !SvREADONLY(source) && !SvAMAGIC(source)
>         && !DO_UTF8(source)) {
>         /* We can convert in place.  */
>         dest = source;
>
> Maybe we should add an SvTEMP check here too...

Now added (and to uc/lcfirst/ucfirst too) as change #31377.
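
Since the change covers all four case operators, a regression check along the following lines may be useful. It is a sketch only: the `@cases` table and `@mutated` list are hypothetical names, and calling each operator through a closure is a convenience that may not exercise exactly the same code path as the one-liners earlier in the thread.

```perl
use strict;
use warnings;

# Each entry: operator name, an input the operator would change, and a
# closure that applies the operator to the current $_.
my @cases = (
    [ 'lc',      'Hello', sub { lc $_ } ],
    [ 'uc',      'Hello', sub { uc $_ } ],
    [ 'lcfirst', 'Hello', sub { lcfirst $_ } ],
    [ 'ucfirst', 'hello', sub { ucfirst $_ } ],
);

my @mutated;    # names of operators that changed their argument

for my $case (@cases) {
    my ($name, $input, $op) = @$case;

    # "$input" is a temporary; $_ is aliased to it inside the loop.
    for ("$input") {
        my $result = $op->();
        push @mutated, $name if $_ ne $input;
    }
}

print @mutated
    ? "not ok: argument changed by @mutated\n"
    : "ok: lc, uc, lcfirst and ucfirst left their argument alone\n";
```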

p5pRT commented 12 years ago

From @cpansprout

On Thu Jun 14 04:26:54 2007, rafael wrote:

> On 13/06/07, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:
>
> > On 13/06/07, Adriano Ferreira <a.r.ferreira@gmail.com> wrote:
> >
> > > Nope. The issue was lc() changing its argument. The example with sort was only a way to demonstrate it.
> > >
> > > Here's another:
> > >
> > >     $ perl -e ' $a = "Hello"; for ("$a") { lc $_; print }'
> > >     hello
> >
> > Yes, it seems that the logic in pp_lc is flawed:
> >
> >     if (SvPADTMP(source) && !SvREADONLY(source) && !SvAMAGIC(source)
> >         && !DO_UTF8(source)) {
> >         /* We can convert in place.  */
> >         dest = source;
> >
> > Maybe we should add an SvTEMP check here too...
>
> Now added (and to uc/lcfirst/ucfirst too) as change #31377.

(aka 17fa077605)

But PADTMPs are never SvTEMP, so now this optimisation never happens. I tried adding assert(0), and all tests passed.

This bug is actually related to #78194. If we make for() and ()x... (and other operations) copy the PADTMP ahead of time, preventing it from ever appearing on the stack twice, this bug would be fixed as a result, and 17fa077605 could be reverted.

--

Father Chrysostomos

p5pRT commented 10 years ago

From @cpansprout

On Sun Jul 29 11:04:32 2012, sprout wrote:

> On Thu Jun 14 04:26:54 2007, rafael wrote:
>
> > On 13/06/07, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:
> >
> > > On 13/06/07, Adriano Ferreira <a.r.ferreira@gmail.com> wrote:
> > >
> > > > Nope. The issue was lc() changing its argument. The example with sort was only a way to demonstrate it.
> > > >
> > > > Here's another:
> > > >
> > > >     $ perl -e ' $a = "Hello"; for ("$a") { lc $_; print }'
> > > >     hello
> > >
> > > Yes, it seems that the logic in pp_lc is flawed:
> > >
> > >     if (SvPADTMP(source) && !SvREADONLY(source) && !SvAMAGIC(source)
> > >         && !DO_UTF8(source)) {
> > >         /* We can convert in place.  */
> > >         dest = source;
> > >
> > > Maybe we should add an SvTEMP check here too...
> >
> > Now added (and to uc/lcfirst/ucfirst too) as change #31377.
>
> (aka 17fa077605)
>
> But PADTMPs are never SvTEMP, so now this optimisation never happens. I tried adding assert(0), and all tests passed.
>
> This bug is actually related to #78194. If we make for() and ()x... (and other operations) copy the PADTMP ahead of time, preventing it from ever appearing on the stack twice, this bug would be fixed as a result, and 17fa077605 could be reverted.

I have reënabled in-place uc/lc in commit 5cd5e2d630.

--

Father Chrysostomos