Perl / perl5

đŸȘ The Perl programming language
https://dev.perl.org/perl5/
Other
1.98k stars 559 forks source link

5.17.7 breaks rules of assignment #12663

Open p5pRT opened 11 years ago

p5pRT commented 11 years ago

Migrated from rt.perl.org#116158 (status was 'open')

Searchable as RT116158$

p5pRT commented 11 years ago

From @sisyphus

Hi\, Perlbug report is attached.

In a nutshell\, with 5.17.7​:

$buffer = ‘some string’; $buf = $buffer;

Then call an XSub that writes into $buf (by using sprintf\, say). I then find that $buffer (which should not have changed) has also been written into\, and is the same as $buf.

Cheers\, Rob

p5pRT commented 11 years ago

From @sisyphus

Created by sisyphus@Owner-PC311012

The demo​:

######################################## use strict; use warnings;

#use Inline C => Config => # BUILD_NOISY => 1;

use Inline C => \<\<'EOC';

SV * foo(char * buffer) {   sprintf(buffer\, "%d"\, -123);   return newSVpv(buffer\, 0); }

EOC

my $buffer = 'XOXO' x 3; my $buf = $buffer; print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n";

foo($buf);

print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n"; ########################################

On 5.16 (and earlier) this correctly outputs​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: XOXOXOXOXOXO ########################################

But with 5.17.7\, I'm getting​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: -123 OXOXOXO ########################################

How on earth did $buffer get altered by 5.17.7 ??

Cheers\, Rob

Perl Info ``` Flags: category=core severity=critical Site configuration information for perl 5.16.0: Configured by sisyphus at Fri Nov 2 21:03:22 2012. Summary of my perl5 (revision 5 version 16 subversion 0) configuration: Platform: osname=MSWin32, osvers=6.1, archname=MSWin32-x86-multi-thread uname='' config_args='undef' hint=recommended, useposix=true, d_sigaction=undef useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=undef, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags =' -s -O2 -DWIN32 -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -fno-strict-aliasing -mms-bitfields', optimize='-s -O2', cppflags='-DWIN32' ccversion='', gccversion='4.7.0', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='long long', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='g++', ldflags ='-s -L"c:\Mingw\perl516\lib\CORE" -L"C:\MinGW\lib"' libpth=C:\MinGW\lib C:\MinGW\msys\1.0\local\lib C:\MinGW\msys\1.0\local\ssl\lib libs=-lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 perllibs=-lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 libc=, so=dll, useshrplib=true, libperl=libperl516.a gnulibc_version='' Dynamic Linking: dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags='-mdll -s -L"c:\Mingw\perl516\lib\CORE" -L"C:\MinGW\lib"' Locally applied patches: @INC for perl 5.16.0: C:/MinGW/perl516/site/lib C:/MinGW/perl516/lib . Environment for perl 5.16.0: HOME (unset) LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=C:\MinGW\perl516\bin;C:\MinGW\perl516\site\bin;C:\MinGW\bin;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\batch;C:\Program Files (x86)\Windows Live\Shared;C:\_32\dmake;C:\MinGW\msys\1.0\bin;C:\Program Files\Windows NT\Accessories;C:\zip PERL_BADLANG (unset) SHELL (unset) ```
p5pRT commented 11 years ago

From @doy

On Thu\, Dec 20\, 2012 at 04​:33​:23PM -0800\, Sisyphus wrote​:

# New Ticket Created by "Sisyphus" # Please include the string​: [perl #116158] # in the subject line of all future correspondence about this issue. # \<URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=116158 >

This is a bug report for perl from sisyphus@​Owner-PC311012\, generated with the help of perlbug 1.39 running under perl 5.16.0.

----------------------------------------------------------------- [Please describe your issue here]

The demo​:

######################################## use strict; use warnings;

#use Inline C => Config => # BUILD_NOISY => 1;

use Inline C => \<\<'EOC';

SV * foo(char * buffer) { sprintf(buffer\, "%d"\, -123); return newSVpv(buffer\, 0); }

EOC

my $buffer = 'XOXO' x 3; my $buf = $buffer; print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n";

foo($buf);

print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n"; ########################################

On 5.16 (and earlier) this correctly outputs​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: XOXOXOXOXOXO ########################################

But with 5.17.7\, I'm getting​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: -123 OXOXOXO ########################################

How on earth did $buffer get altered by 5.17.7 ??

Cheers\, Rob

5.17.7 expanded perl's use of copy on write quite a bit. Inline​::C probably is not respecting copy on write semantics properly in its translation layer for function arguments. If C function arguments are meant to be mutable\, Inline​::C will need to explicitly force a real copy.

-doy

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @sisyphus

-----Original Message----- From​: Jesse Luehrs

The demo​:

######################################## use strict; use warnings;

#use Inline C => Config => # BUILD_NOISY => 1;

use Inline C => \<\<'EOC';

SV * foo(char * buffer) { sprintf(buffer\, "%d"\, -123); return newSVpv(buffer\, 0); }

EOC

my $buffer = 'XOXO' x 3; my $buf = $buffer; print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n";

foo($buf);

print "\$buf​: $buf\n"; print "\$buffer​: $buffer\n"; ########################################

On 5.16 (and earlier) this correctly outputs​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: XOXOXOXOXOXO ########################################

But with 5.17.7\, I'm getting​:

######################################## $buf​: XOXOXOXOXOXO $buffer​: XOXOXOXOXOXO $buf​: -123 OXOXOXO $buffer​: -123 OXOXOXO ########################################

How on earth did $buffer get altered by 5.17.7 ??

5.17.7 expanded perl's use of copy on write quite a bit. Inline​::C probably is not respecting copy on write semantics properly in its translation layer for function arguments. If C function arguments are meant to be mutable\, Inline​::C will need to explicitly force a real copy.

Inline​::C is just the easiest means by which I could demonstrate the 'orrible behaviour.

Attached is a simple extension (Broken-0.01) which also demonstrates the issue. It has no Inline​::C involvement - just plain and simple XS.

Extract\, then run 'perl Makefile.PL'\, followed by 'make test' The test passes on 5.16.0\, but fails on 5.17.7.

Cheers\, Rob

p5pRT commented 11 years ago

From @sisyphus

Broken-0.01.tar.gz

p5pRT commented 11 years ago

From @nwc10

On Thu\, Dec 20\, 2012 at 06​:37​:19PM -0600\, Jesse Luehrs wrote​:

On Thu\, Dec 20\, 2012 at 04​:33​:23PM -0800\, Sisyphus wrote​:

SV * foo(char * buffer) { sprintf(buffer\, "%d"\, -123); return newSVpv(buffer\, 0); }

5.17.7 expanded perl's use of copy on write quite a bit. Inline​::C probably is not respecting copy on write semantics properly in its translation layer for function arguments. If C function arguments are meant to be mutable\, Inline​::C will need to explicitly force a real copy.

Woah.

As will potentially many many other XS modules.

char * is a mutable C function argument. As char * is mutable\, then I suspect that the char * typemap needs to force a real copy.

(A const char * typemap does not)

Nicholas Clark

p5pRT commented 11 years ago

From @tsee

On 12/21/2012 08​:29 AM\, Nicholas Clark wrote​:

On Thu\, Dec 20\, 2012 at 06​:37​:19PM -0600\, Jesse Luehrs wrote​:

On Thu\, Dec 20\, 2012 at 04​:33​:23PM -0800\, Sisyphus wrote​:

SV * foo(char * buffer) { sprintf(buffer\, "%d"\, -123); return newSVpv(buffer\, 0); }

5.17.7 expanded perl's use of copy on write quite a bit. Inline​::C probably is not respecting copy on write semantics properly in its translation layer for function arguments. If C function arguments are meant to be mutable\, Inline​::C will need to explicitly force a real copy.

Woah.

As will potentially many many other XS modules.

char * is a mutable C function argument. As char * is mutable\, then I suspect that the char * typemap needs to force a real copy.

(A const char * typemap does not)

Generally speaking\, I would consider the use of any char* typemap that doesn't also pass string lengths some way as utterly broken. I know it's a very separate problem\, but if we're about to rethink the behaviour of such typemaps\, can we please try to come up with a solution to the even more common problem?

--Steffen

p5pRT commented 11 years ago

From @bulk88

On Fri Dec 21 01​:06​:40 2012\, smueller@​cpan.org wrote​:

Generally speaking\, I would consider the use of any char* typemap that doesn't also pass string lengths some way as utterly broken. I know it's a very separate problem\, but if we're about to rethink the behaviour of such typemaps\, can we please try to come up with a solution to the even more common problem?

--Steffen

https://github.com/bulk88/Win32API-File/blob/641b818af8828a687440020a97aa53c40b977d8e/typemap#L32

This is one way I picked up of doing it. Change that and instead of adding a SV * to preinit\, add a STRLEN. Note compatibility problems with ParseXS's internal API. You can also add vars if in the typemap entry you want to call some C func that wants "int *"s to return additional status info above the return val of the func. Just add more things to PREINIT and use $var as a suffix or prefix. Since the "{}"s will defer the block type INPUT entries until after PREINIT and INPUT declaration statement entries appear in the .c file\, all the PREINIT vars that were injected were already declared before the new scope is opened.

Regarding that a T_PV uses SvPV_nolen and not SvPV and the XS coder doesn't care about the len\, overruns be damned\, I dont think anything should be done about that since no idea of mine seems sane and responsibility falls to the XS coder\, not the interp\, for any bad things the XS code does. Perl's script eval doesn't trap a SEGV. I dont know if anyone would want a Perl where an eval catches a SEGV and resumes execution of the script (libsigsegv\, other similar debugging tools on other OSes). Since bad ideas are better than no ideas\, here are my 2 bad ideas

1. "#error" in T_PV\, force everyone to rewrite their XS code and also define SvPV_nolen to stop compilation if anyone tries to use it.

2. have parse XS add a sentinal to the end of the PV buffer\, if it is changed\, untrappably terminate the process. I did something similar here https://github.com/bulk88/perl5-win32-api/blob/2c640f6e7f27cd6c34ed0be5c214171d88f6abc6/API.xs#L858 . But if I write XS\, I expect performance\, and continuous runtime checking for buffer overflows is a bad idea\, even if its only for XSUBs. I would guess an entry is added to the save stack for check for the overflow\, since not all SVs an XSUB can use come from the Perl stack\, so every SvPV* and sv_grow would check an interp flag if its inside the optree\, or in an XSUB\, if XSUB\, add buffer check to save stack. I don't have any ideas on how to stop overruns on SvPVX.

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @cpansprout

On Fri Dec 21 01​:06​:40 2012\, smueller@​cpan.org wrote​:

On 12/21/2012 08​:29 AM\, Nicholas Clark wrote​:

On Thu\, Dec 20\, 2012 at 06​:37​:19PM -0600\, Jesse Luehrs wrote​:

On Thu\, Dec 20\, 2012 at 04​:33​:23PM -0800\, Sisyphus wrote​:

SV * foo(char * buffer) { sprintf(buffer\, "%d"\, -123); return newSVpv(buffer\, 0); }

5.17.7 expanded perl's use of copy on write quite a bit. Inline​::C probably is not respecting copy on write semantics properly in its translation layer for function arguments. If C function arguments are meant to be mutable\, Inline​::C will need to explicitly force a real copy.

Woah.

As will potentially many many other XS modules.

char * is a mutable C function argument. As char * is mutable\, then I suspect that the char * typemap needs to force a real copy.

(A const char * typemap does not)

Generally speaking\, I would consider the use of any char* typemap that doesn't also pass string lengths some way as utterly broken.

Not necessarily. Some code doesn’t have to deal with embedded nulls. The real brokenness occurs when utf8ness is lost.

Can we fix char* typemaps to work the way they did in 5.005 (by using SvPVbyte)? Currently any use of char* typemaps is broken because the callee has no idea how to interpret\, say "\xc4\x80"\, because that could be 1 or 2 characters. (Or was that fixed and I didn’t notice?)

I do find the idea of a mutable char* pointer questionable\, though. What if the original was a reference? Then the mutable string is only temporary.

I know it's a very separate problem\, but if we're about to rethink the behaviour of such typemaps\, can we please try to come up with a solution to the even more common problem?

Well\, fixing this particular copy-on-write case should not depend on that\, since it is a regression.

I suspect it is actually a regression in 5.10\, and can probably be triggered with hash keys\, but I have not tested it.

--

Father Chrysostomos

p5pRT commented 11 years ago

From @jkeenan

It appears that we had *two* tickets in RT titled "5.17* breaks rules of assignment". We closed https://rt-archive.perl.org/perl5/Ticket/Display.html?id=116569 just now.

Could those who participated in the discussion in *this* RT re-evaluate the discussion and make a recommendation as to whether this ticket should be kept open; which issues should have new tickets of their own opened; etc.

Thank you very much. Jim Keenan

p5pRT commented 11 years ago

From @jkeenan

On Sun May 19 19​:32​:46 2013\, jkeenan wrote​:

It appears that we had *two* tickets in RT titled "5.17* breaks rules of assignment". We closed https://rt-archive.perl.org/perl5/Ticket/Display.html?id=116569 just now.

Slight correction​: Sisyphus recommended that ticket be closed\, and I took it in anticipation of closing it within 7 days.