Closed p5pRT closed 12 years ago
Perl 5.14 dumps core during perl -c for me with the following scripts. With perl 5.10 and 5.12, perl -c shows only syntax errors, as it should. I haven't checked versions later than 5.14.2.
Try to run either of the two scripts; one of them should crash perl.
# --- script #1
#!/usr/bin/perl
use strict; use warnings;
sub meow (&);
my %h; my $k;
meow { my $t : need_this; $t = { size => $h{$k}{size}; used => $h{$k}(used} }; };
# --- end of script #1

# --- script #2
#!/usr/bin/perl
use strict; use warnings;
sub meow (&);
my %h; my $k;
meow { my $t : need_this; $t = { size => $h{$k}{size}; used => $h{$k}(used} }; };
sub testo { my $value = shift; print; print; print; 1; }
# --- end of script #2

or links:
script #1: https://gist.github.com/2318879
script #2: https://gist.github.com/2319125

Results look like this:
# perl -c script(1|2).pl
Segmentation fault (core dumped)
On Fri Apr 06 10:59:31 2012, azus wrote:
This is a bug report for perl from andrej.zverev@gmail.com, generated with the help of perlbug 1.39 running under perl 5.14.2.
----------------------------------------------------------------- [Please describe your issue here]
It appears there are two syntax errors here. If $t is a hash reference, then there should be a comma after {size} -- not a semicolon. And '(used}' probably should be '{used},'.
Thank you very much. Jim Keenan
The RT System itself - Status changed from 'new' to 'open'
It appears there are two syntax errors here. If $t is a hash reference, then there should be a comma after {size} -- not a semicolon. And '(used}' probably should be '{used},'.
Yes, there are two syntax errors, but that is no reason for a segfault: 5.10 and 5.12 handle this fine.
On 06/04/2012 22:27, James E Keenan via RT wrote:
It appears there are two syntax errors here. If $t is a hash reference, then there should be a comma after {size} -- not a semicolon. And '(used}' probably should be '{used},'.
Thank you very much. Jim Keenan
--- via perlbug: queue: perl5 status: new https://rt-archive.perl.org/perl5/Ticket/Display.html?id=112312
perl shouldn't crash, regardless of whether the code is valid or not.
I can confirm the segfault with a perl built with PERL_POISON defined (otherwise my system's libc isn't sensitive enough to catch it). 5.12.4 doesn't crash, but 5.14.2, 5.15.3 and 5.15.6 do. Here's a stacktrace for perl 5.14.2:

$ gdb --args perl5.14.2-dbg-psn-thr-64 x.pl
GNU gdb (Gentoo 7.4 p1) 7.4
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see: <http://bugs.gentoo.org/>...
Reading symbols from /home/vince/perl/builds/bin/perl5.14.2-dbg-psn-thr-64...done.
(gdb) r
Starting program: /home/vince/perl/builds/bin/perl5.14.2-dbg-psn-thr-64 x.pl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00000000004d7e84 in Perl_pad_free (my_perl=0xa86010, po=11354992) at pad.c:1498
1498 if (PL_curpad[po] && PL_curpad[po] != &PL_sv_undef) {
(gdb) bt
#0 0x00000000004d7e84 in Perl_pad_free (my_perl=0xa86010, po=11354992) at pad.c:1498
#1 0x000000000041dff2 in Perl_op_clear (my_perl=0xa86010, o=0xab8aa0) at op.c:713
#2 0x000000000041d9d9 in Perl_op_free (my_perl=0xa86010, o=0xab8aa0) at op.c:528
#3 0x00000000004d02a1 in Perl_yyparse (my_perl=0xa86010, gramtype=258) at perly.c:678
#4 0x00000000004529aa in S_parse_body (my_perl=0xa86010, env=0x0, xsinit=0x41cf02 <xs_init>) at perl.c:2194
#5 0x0000000000450a30 in perl_parse (my_perl=0xa86010, xsinit=0x41cf02 <xs_init>, argc=2, argv=0x7fffffffde88, env=0x0) at perl.c:1613
#6 0x000000000041ce45 in main (argc=2, argv=0x7fffffffde88, env=0x7fffffffdea0) at perlmain.c:118
Vincent.
On Fri Apr 06 13:43:56 2012, perl@profvince.com wrote:
perl shouldn't crash, regardless of whether the code is valid or not.
I can confirm the segfault with a perl built with PERL_POISON defined (otherwise my system's libc isn't sensitive enough to catch it). 5.12.4 doesn't crash, but 5.14.2, 5.15.3 and 5.15.6 do. Here's a stacktrace for perl 5.14.2:
Can we make this a 5.16 blocker?
--
Father Chrysostomos
For me, with the ‘my $t : need_this;’ line deleted, this command:
$ PERL_DESTRUCT_LEVEL=1 ../perl.git-copy/Porting/bisect.pl --target=miniperl -DDEBUGGING -Duseithreads -e '`$^X -Ilib ../foo`; warn $?; die if ($?>>8) != 255'
points to this commit:
f12005599f648e675af22dfef1047191e260bc48 is the first bad commit
commit f12005599f648e675af22dfef1047191e260bc48
Author: Wolfram Humann <w.c.humann@arcor.de>
Date: Fri Aug 13 17:20:26 2010 -0700

make string-append on win32 100 times faster

When a string grows (e.g. gets appended to), perl calls sv_grow. When
sv_grow decides that the memory currently allocated to the string is
insufficient, it calls saferealloc. Depending on whether or not perl
was compiled with 'usemymalloc' this calls realloc in either perls
internal version or on the operating system. Perl requests from
realloc just the amount of memory required for the current
operation. With 'usemymalloc' this is not a problem because it rounds
up memory allocation to a certain geometric progression anyway. When
the operating system's realloc is called, this may or may not lead to
desaster, depending on how it's implemented. On win32 it does lead to
desaster: when I loop 1000 times and each time append 1000 chars to an
initial string size of 10 million, the memory grows from 10.000e6 to
10.001e6 to 10.002e6 and so on 1000 times till it ends at 11.000e6.
This is on darwin. I couldn’t reproduce it on dromedary, hence:
That took 1710 seconds
--
Father Chrysostomos
On Fri, Apr 06, 2012 at 05:44:56PM -0700, Father Chrysostomos via RT wrote:
Can we make this a 5.16 blocker?
valgrind shows that the fault goes back as far as 5.10.0 and has been present ever since; whether it happens to segfault is just down to circumstance.
Given how long this bug has been present, I don't think it needs to be a 5.16 blocker.
-- Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
On Sat, Apr 07, 2012 at 10:47:44PM +0100, Dave Mitchell wrote:
valgrind shows that the fault goes back as far as 5.10.0 and has been present ever since; whether it happens to segfault is just down to circumstance.
Given how long this bug has been present, I don't think it needs to be a 5.16 blocker.
I bisected with this:
$ cat ../112312.sh
#!/bin/sh

valgrind --error-exitcode=1 ./perl -Ilib <<'EOT'
use strict; use warnings;
sub meow (&);
my %h; my $k;
meow { my $t : need_this; $t = { size => $h{$k}{size}; used => $h{$k}(used} }; };
EOT

ret=$?
test $ret -eq 255 && exit 0
exit $ret
and got to this commit:
HEAD is now at 9a51af3 Fix a typo in Dynaloader_pm.PL.
good - zero exit from ../112312.sh
0aded6e1de0ffebe70e2ec9f995c5ca8a55617d4 is the first bad commit
commit 0aded6e1de0ffebe70e2ec9f995c5ca8a55617d4
Author: Dave Mitchell <davem@fdisolutions.com>
Date: Thu Jan 18 02:14:48 2007 +0000

disable parser stack cleanup on reduce croak (too fragile)

p4raw-id: //depot/perl@29866

:100644 100644 a9e569d9c9ccd42ad9241f0d6881f30607ac2c57 c8ee62ffc62dfcd4f5a7079f97775fa70562b6e8 M perly.c
bisect run success
That took 2216 seconds
IIRC this was the reversion of some work to deal with leaking ops, so I went looking for whether it previously was a regression. I *think* this is the earliest commit relating to OP leaking:

commit 0539ab63267d5a989c8b513c410c39b33c15aa25
Author: Dave Mitchell <davem@fdisolutions.com>
Date: Sat May 27 00:31:33 2006 +0000

stop OPs leaking in eval "syntax error"

When bison pops states during error recovery, any states holding
an OP would leak the OP. Create an extra YY table that tells us
which states are of type opval, and when popping one of those,
free the op.

p4raw-id: //depot/perl@28315

so I built its parent, and for that valgrind shows no errors. So, sadly, I think that the commit 0aded6e1de0ffebe is the immediate cause of this.
But, I'm suspecting, that the only *real* fix to all of this mess is to garbage collect the OPs, in some fashion.
Nicholas Clark
On Sun, Apr 08, 2012 at 11:31:37AM +0100, Nicholas Clark wrote:
But, I'm suspecting, that the only *real* fix to all of this mess is to garbage collect the OPs, in some fashion.
Ah yes, *that* quagmire. Anyway, thanks for bisecting this. It may be that my disabling of the experimental anti-leaking code just didn't quite disable enough.
-- "Do not dabble in paradox, Edward, it puts you in danger of fortuitous wit." -- Lady Croom, "Arcadia"
On Sun, Apr 08, 2012 at 11:42:06AM +0100, Dave Mitchell wrote:
Ah yes, *that* quagmire. Anyway, thanks for bisecting this.
No problem. I'm waiting for the HP-UX box to build things.
I was also wondering if it would be simple enough to add a --valgrind option to the bisect thingy to make this fall-off-a-log easy for anyone to do in future (i.e. valgrind --error-exitcode=1 ./perl ...). *But* the use case here was syntax checking, which seems to be something we're going to need to test again, and as one can see from the structure of the shell script, it's not as simple as I'd hoped. A failure exit code from valgrind is a failure, whereas a failure exit code passed through from the perl interpreter (because valgrind found no errors) is a pass.

$ cat ../112312.sh
#!/bin/sh

valgrind --error-exitcode=1 ./perl -Ilib <<'EOT'
use strict; use warnings;
sub meow (&);
my %h; my $k;
meow { my $t : need_this; $t = { size => $h{$k}{size}; used => $h{$k}(used} }; };
EOT

ret=$?
test $ret -eq 255 && exit 0
exit $ret
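The pass/fail rule in the tail of that script can be restated as a tiny function. This is only a sketch in C to make the decision explicit; `bisect_exit_code` and `revision_is_good` are hypothetical names, and the value 255 is simply the status the script itself treats as "perl reported the syntax errors and valgrind found nothing".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the last lines of ../112312.sh:
 *     ret=$?
 *     test $ret -eq 255 && exit 0
 *     exit $ret
 * `ret` is the exit status of `valgrind --error-exitcode=1 ./perl ...`. */
int bisect_exit_code(int ret)
{
    assert(ret >= 0 && ret <= 255);   /* shell exit statuses fit in one byte */

    /* 255: valgrind saw no errors and passed perl's own failure status
     * through, so perl rejected the broken script cleanly. Report success. */
    if (ret == 255)
        return 0;

    /* Anything else is passed on unchanged: 1 means valgrind reported an
     * error, other nonzero values mean some other failure (e.g. a crash). */
    return ret;
}

/* A revision counts as "good" for the bisect when the wrapper exits 0. */
bool revision_is_good(int ret)
{
    return bisect_exit_code(ret) == 0;
}
```

The inversion is what makes a generic --valgrind option awkward: for a syntax-checking test, a nonzero status from perl is the expected outcome, so the wrapper has to know which nonzero value belongs to perl and which to valgrind.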
So I'll do something else for a bit\, to see if inspiration attacks. (Or maybe lunch will attack first.)
Nicholas Clark
On Sun Apr 08 03:32:21 2012, nicholas wrote:
But, I'm suspecting, that the only *real* fix to all of this mess is to garbage collect the OPs, in some fashion.
The simplest way might be to create something like the mortals’ stack, but for OPs. Or maybe a mortalop hash.
Code that could croak can do the equivalent of SAVEFREEOP, and then delete the op from the mortalop stack when everything is safe.
Would that be as fast as a tortoise, or slower?
Or maybe a suggestion I had earlier: a variant of SAVEFREEOP that uses the savestack but returns a token (probably a stack offset) that can be used to disarm the item on the savestack and turn it into a no-op:

I32 token = SAVEFREEOP_token(o);
... do something unsafe that might croak ...
DISARM_SAVESTACK(token);
op_free(o);
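To make the token idea concrete, here is a minimal sketch of a disarmable cleanup stack. It is a toy, not perl's savestack: `save_cleanup_token`, `disarm_save_entry`, `unwind_to` and the fixed-size array are all assumptions made for illustration, standing in for the proposed SAVEFREEOP_token and DISARM_SAVESTACK.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT perl's real savestack; every name is hypothetical. */
typedef void (*cleanup_fn)(void *);

struct save_entry {
    cleanup_fn fn;    /* NULL means the entry has been disarmed */
    void      *arg;
};

#define SAVE_MAX 64
static struct save_entry save_stack[SAVE_MAX];
static int save_top = 0;

/* Push a cleanup and return a token (the stack offset) for disarming it. */
int save_cleanup_token(cleanup_fn fn, void *arg)
{
    assert(save_top < SAVE_MAX);
    save_stack[save_top].fn  = fn;
    save_stack[save_top].arg = arg;
    return save_top++;
}

/* Turn a pushed entry into a no-op once the risky section has passed. */
void disarm_save_entry(int token)
{
    assert(token >= 0 && token < save_top);
    save_stack[token].fn = NULL;
}

/* Pop entries down to `floor`, running only the cleanups still armed.
 * This is what would happen on scope exit or while unwinding a croak. */
void unwind_to(int floor)
{
    while (save_top > floor) {
        struct save_entry *e = &save_stack[--save_top];
        if (e->fn != NULL)
            e->fn(e->arg);
    }
}

/* Example cleanup: counts how many times it ran. */
static void count_free(void *arg)
{
    ++*(int *)arg;
}
```

Disarming by offset is cheap because nothing is removed from the middle of the stack; the entry just stays in place as a no-op until the normal unwind pops it.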
--
Father Chrysostomos
On Tue, Apr 24, 2012 at 02:01:45PM -0700, Father Chrysostomos via RT wrote:
The simplest way might be to create something like the mortals’ stack, but for OPs. Or maybe a mortalop hash.
Code that could croak can do the equivalent of SAVEFREEOP, and then delete the op from the mortalop stack when everything is safe.
Would that be as fast as a tortoise, or slower?
Or maybe a suggestion I had earlier: a variant of SAVEFREEOP that uses the savestack but returns a token (probably a stack offset) that can be used to disarm the item on the savestack and turn it into a no-op:

I32 token = SAVEFREEOP_token(o);
... do something unsafe that might croak ...
DISARM_SAVESTACK(token);
op_free(o);

I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab, with a new pool/slab started each time we start compiling a new sub, and the pool in some way marked as complete at the end of compiling the sub. On croaking, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
-- "Procrastination grows to fill the available time" -- Mitchell's corollary to Parkinson's Law
On Wed Apr 25 03:38:30 2012, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab, with a new pool/slab started each time we start compiling a new sub, and the pool in some way marked as complete at the end of compiling the sub. On croaking, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
I sort of understand that in theory, but I don’t understand it well enough to feel confident about implementing it.
--
Father Chrysostomos
On Wed Apr 25 03:38:30 2012, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab, with a new pool/slab started each time we start compiling a new sub, and the pool in some way marked as complete at the end of compiling the sub. On croaking, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
What exactly is that code at the top of op.c that is compiled only when PL_OP_SLAB_ALLOC is defined?
--
Father Chrysostomos
On Thu, May 17, 2012 at 10:02:39AM -0700, Father Chrysostomos via RT wrote:
What exactly is that code at the top of op.c that is compiled only when PL_OP_SLAB_ALLOC is defined?
It's Nick Ing-Simmons's "Experimental" slab allocator for ops from 1999. It's never normally used, apart from, apparently, when PERL_IMPLICIT_SYS is defined.
I suspect it would need heavy reworking to make it into a 'one pool per CV and throw the whole thing away on error' system.

commit b7dc083c47d05133e90d62e8b587c747dab89267
Author: Nick Ing-Simmons <nik@tiuk.ti.com>
AuthorDate: Fri May 14 21:04:22 1999 +0000
Commit: Nick Ing-Simmons <nik@tiuk.ti.com>
CommitDate: Fri May 14 21:04:22 1999 +0000

Experimental "slab" allocator for ops. To try it -DPL_OP_SLAB_ALLOC for op.c This is for proof of concept only, it leaks memory (ops are not free'd) so don't use in embedded apps. If this minimalist version does not show performance gain then whole idea is worthless. Nick see's approx 12% speed up vs perlmalloc running perl -Ilib -MCPAN -e '' Solaris2.6, gcc-2.8.1 but numbers are not repeatable.

-- Nothing ventured, nothing lost.
On Sun May 20 01:34:06 2012, davem wrote:
It's Nick Ing-Simmons's "Experimental" slab allocator for ops from 1999. It's never normally used, apart from, apparently, when PERL_IMPLICIT_SYS is defined.
I suspect it would need heavy reworking to make it into a 'one pool per CV and throw the whole thing away on error' system.
So basically I can just throw the whole thing away and start from scratch? :-)
Anyway this “slab allocation” is not something I’ve ever done before (my C experience being limited to what I’ve done with perl).
I *think* you mean something like this:
Every CV can point to a slab, which is allocated much like HvARRAY, except it can never be reallocated, because there are pointers into it.
The beginning of the slab contains a pointer to the next slab, and so on, so we never run out.
Freeing a CV consists of calling op_free on every element of each slab and calling Safefree or PerlMemShared_free (what is the difference between these two sets of memory functions?) on each slab at the end.
Is that how slabs work, more or less?
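The scheme described above can be sketched roughly as follows. This is a toy under stated assumptions, not a proposal for op.c: `toy_op`, `toy_cv`, `slab_alloc_op` and `slab_free_all` are made-up names, ops are treated as fixed-size, plain calloc/free stand in for perl's memory functions, and the per-op op_free pass before releasing each slab is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define SLAB_SLOTS 64   /* the "64 ops" figure floated in this thread */

/* Hypothetical fixed-size op; real perl ops come in several sizes. */
struct toy_op {
    int            type;
    struct toy_op *next;
};

struct slab {
    struct slab  *next;               /* header: link to the next slab */
    size_t        used;               /* slots handed out so far */
    struct toy_op slots[SLAB_SLOTS];
};

/* Owner of a slab chain; stands in for the CV being compiled. */
struct toy_cv {
    struct slab *head;
};

/* Hand out one op. A full slab is never resized, because ops already
 * handed out are pointed to from elsewhere; a new slab is chained instead. */
struct toy_op *slab_alloc_op(struct toy_cv *cv)
{
    assert(cv != NULL);
    struct slab *s = cv->head;
    if (s == NULL || s->used == SLAB_SLOTS) {
        s = calloc(1, sizeof *s);
        if (s == NULL)
            return NULL;
        s->next  = cv->head;          /* newest slab goes to the front */
        cv->head = s;
    }
    return &s->slots[s->used++];
}

/* Release every slab owned by the CV, e.g. after a compilation error.
 * Returns the number of slabs freed. */
size_t slab_free_all(struct toy_cv *cv)
{
    assert(cv != NULL);
    size_t freed = 0;
    while (cv->head != NULL) {
        struct slab *next = cv->head->next;
        free(cv->head);
        cv->head = next;
        freed++;
    }
    return freed;
}
```

Because the error path walks only the slab chain, code that builds ops does not need to register each one for cleanup, which is the property Dave Mitchell's pool suggestion relies on.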
What do we do about different op types? Do we allocate separate slabs for each op type? Do we just use the largest and hope the extra padding that small op structs get isn’t too much of a waste? Do we allocate a slab with different parts of the slab set aside for different op sizes (and flags at the beginning of the slab to indicate how many of each there are)?
One way to do separate slabs would be to put a flag at the beginning of each slab to say what it holds, and then just chain them all together.
What should be the default slab size? 64 ops? That seems a bit small for, say, DBI, JE, or Parse::RecDescent, but big for people who like lots of tiny subroutines. However, it’s probably a good compromise.
Does sizeof(struct op) in C return the padded or unpadded size of the struct in octets?
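On the sizeof question: in standard C, sizeof yields the padded size in bytes, including any trailing padding, so consecutive array elements sit exactly sizeof apart. The sketch below shows this with a made-up struct (`padded_demo` is not perl's struct op); the helper names are hypothetical too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration; not perl's struct op. */
struct padded_demo {
    void *ptr;
    char  flag;
};

/* Bytes counted by sizeof that no member occupies (i.e. padding). */
size_t padding_bytes(void)
{
    assert(sizeof(struct padded_demo) >= sizeof(void *) + sizeof(char));
    return sizeof(struct padded_demo) - (sizeof(void *) + sizeof(char));
}

/* Distance in bytes between two consecutive array elements. */
size_t array_stride(void)
{
    struct padded_demo pair[2];
    return (size_t)((char *)&pair[1] - (char *)&pair[0]);
}
```

On a typical 64-bit ABI the struct occupies 16 bytes, 7 of them padding after `flag`, but the exact figures depend on the platform. What holds everywhere is that `array_stride()` equals `sizeof(struct padded_demo)`, so slab slots sized with sizeof are correctly spaced.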
To avoid making the xpvcv struct any bigger for XSUBs, we could point xcv_root to the first slab. Would that break anything?
Alternatively, we could make sure that the root is the first op in the first slab, and then use pointer arithmetic on xcv_root to get to the beginning of the slab.
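The pointer-arithmetic variant could look like this sketch, which steps back from the first op to the slab header using offsetof. The types and `slab_from_first_op` are hypothetical and only illustrate the layout trick; nothing here reflects how xcv_root is actually used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; not perl's OP or any real slab layout. */
struct root_op {
    int type;
};

struct root_slab {
    struct root_slab *next;     /* slab header */
    struct root_op    ops[8];   /* ops[0] is where the root op would live */
};

/* Given a pointer to the FIRST op stored in a slab, recover the slab
 * itself by stepping back over the header. Only valid for ops[0]. */
struct root_slab *slab_from_first_op(struct root_op *first)
{
    assert(first != NULL);
    return (struct root_slab *)((char *)first - offsetof(struct root_slab, ops));
}
```

The trade-off is that the trick works only if the root really is the first op in the first slab, which ties the allocator's layout to the order in which ops are created.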
--
Father Chrysostomos
On Fri Jun 08 22:39:12 2012, sprout wrote:
What should be the default slab size? 64 ops? That seems a bit small for, say, DBI, JE, or Parse::RecDescent, but big for people who like lots of tiny subroutines. However, it’s probably a good compromise.
I am jumping into this ticket blindly. You bring up the issue of what is "typical" perl usage and what ops are most important (I know pp_hot is an attempt at sorting them). That question is still unanswered in perltodo. Each malloc block has a header; that's 2 to 6 pointers of memory depending on OS/C lib. From looking at op.h, all of the op structs end in pointers except for BASEOP. I think BASEOP is a multiple of 32 bits, and gets padding on 64 bits. So I presume all the op structs are a multiple of a pointer in size due to compiler alignment. I GUESS (I'm jumping in here) the ops are made by the parser as the perl text is processed.
The op structs can be placed sequentially in memory I guess.
To deal with how to free the op struct blocks, 1st choice is a doubly linked list header on each op struct block for the current compiling context or CV or eval scope or whatever. The linked list is walked to free the blocks.
Another choice: 1 block per CV/whatever; when overfilled, realloc and move the ops and fix up the pointers. Whether to make the realloc amount a % of existing size or a fixed amount, IDK.
Or get rid of OP *s and use relative offsets for related op structs that an op struct must link to, so reallocs are cheaper. Where to store the base pointer, IDK.
Another idea is a small multibit bitfield that specifies the offset or index from the current op struct to its mem block header. Another way to free the blocks is the save stack.
Someone will argue for perl to implement its own memory allocator; it must request entire whole pages from the OS to be memory efficient, not large malloc blocks that contain malloc headers and speculative realloc space after them.
On Fri Jun 08 23:29:45 2012, bulk88. wrote:
I am jumping into this ticket blindly. You bring up the issue of what is "typical" perl usage and what ops are most important (I know pp_hot is an attempt at sorting them). That question is still unanswered in perltodo. Each malloc block has a header; that's 2 to 6 pointers of memory depending on OS/C lib. From looking at op.h, all of the op structs end in pointers except for BASEOP. I think BASEOP is a multiple of 32 bits, and gets padding on 64 bits. So I presume all the op structs are a multiple of a pointer in size due to compiler alignment. I GUESS (I'm jumping in here) the ops are made by the parser as the perl text is processed.
Yes, that’s true, more or less.
The op structs can be placed sequentially in memory I guess.
That’s what I was suggesting when I mentioned HvARRAY, but I wasn’t clear at all. And HvARRAY is a little different, too.
To deal with how to free the op struct blocks, 1st choice is a doubly linked list header on each op struct block for the current compiling context or CV or eval scope or whatever. The linked list is walked to free the blocks.
That’s what I had in mind.
Another choice: 1 block per CV/whatever; when overfilled, realloc and move the ops and fix up the pointers. Whether to make the realloc amount a % of existing size or a fixed amount, IDK.
The complexity makes me shudder. That would be hard to get right.
Or get rid of OP *s and use relative offsets for related op structs that an op struct must link to, so reallocs are cheaper. Where to store the base pointer, IDK.
That would require rewriting a lot of code, and breaking some CPAN modules.
Another idea is a small multibit bitfield that specifies the offset or index from the current op struct to its mem block header. Another way to free the blocks is the save stack.
I suggested using the savestack to free individual ops, but Dave Mitchell pointed out that less code would have to change with slab/block allocation.
As for freeing slabs/blocks via the savestack, I’m not sure how that would work. If the slabs are attached to the CV, then they will be freed indirectly via the savestack when there are compilation errors.
Someone will argue for perl to implement its own memory allocator; it must request entire whole pages from the OS to be memory efficient, not large malloc blocks that contain malloc headers and speculative realloc space after them.
That’s a separate issue altogether. On Unix, heavy use of malloc doesn’t suffer any performance penalty. On Windows, my understanding is that realloc is something to be avoided. Nicholas Clark mentioned using malloc.c (perl’s own malloc implementation, which can be enabled via -Dusemymalloc) but having it use Windows malloc instead of sbrk, which would solve the efficiency problems. I have no intention of doing Windows-specific stuff, though.
--
Father Chrysostomos
On Sat Jun 09 18:58:17 2012, sprout wrote:
Another idea is a small multi-bit bitfield that specifies the offset or index from the current op struct to its memory block header. Another way to free the blocks is the save stack.
I suggested using the savestack to free individual ops, but Dave Mitchell pointed out that less code would have to change with slab/block allocation.
There are free bits in BASEOP.
Someone will argue for perl to implement its own memory allocator. It must request whole pages from the OS to be memory efficient, not large malloc blocks that contain malloc headers and speculative realloc space after them.
That’s a separate issue altogether. On Unix, heavy use of malloc doesn’t suffer any performance penalty. On Windows, my understanding is that realloc is something to be avoided. Nicholas Clark mentioned using malloc.c (perl’s own malloc implementation, which can be enabled via -Dusemymalloc) but having it use Windows malloc instead of sbrk, which would solve the efficiency problems. I have no intention of doing Windows-specific stuff, though.
From reading how sbrk works, on Unix all user-mode non-executable space is one linear, contiguous block that only grows or shrinks; there is no concept of allocations, or of pointers to allocations, from the OS paging system, right? It also seems to me that on Unix it would be nearly impossible to shrink the process's data segment because of fragmentation. So is creating a cross-platform memory allocator for Perl's memory allocation API impossible, or just not useful?
From reading Cygwin's docs, sbrk on Cygwin apparently has a system-wide limit of 384 MB per process (http://www.perlmonks.org/?node_id=541750). A system-wide setting can increase that. I assume Cygwin "reserves" but doesn't "allocate" that 384 MB range, using the Windows VM system to emulate sbrk.
If you include mmap, then from its man page it sounds identical to Windows's virtual memory allocator, and a cross-platform "allocator for allocators" internal API in perl would be very easy, possibly as easy as a large macro. I don't know how VM allocation works on all the other platforms Perl runs on; as a last resort, the allocator for allocators can be redirected to malloc. malloc.c seems to have been written around sbrk, and I couldn't find any code in it that will ever release memory to the OS using sbrk or brk.
On Sun Jun 10 08:30:28 2012\, bulk88. wrote:
On Sat Jun 09 18:58:17 2012, sprout wrote:
Another idea is a small multi-bit bitfield that specifies the offset or index from the current op struct to its memory block header. Another way to free the blocks is the save stack.
I suggested using the savestack to free individual ops, but Dave Mitchell pointed out that less code would have to change with slab/block allocation.
There are free bits in BASEOP.
Someone will argue for perl to implement its own memory allocator. It must request whole pages from the OS to be memory efficient, not large malloc blocks that contain malloc headers and speculative realloc space after them.
That’s a separate issue altogether. On Unix, heavy use of malloc doesn’t suffer any performance penalty. On Windows, my understanding is that realloc is something to be avoided. Nicholas Clark mentioned using malloc.c (perl’s own malloc implementation, which can be enabled via -Dusemymalloc) but having it use Windows malloc instead of sbrk, which would solve the efficiency problems. I have no intention of doing Windows-specific stuff, though.
From reading how sbrk works, on Unix all user-mode non-executable space is one linear, contiguous block that only grows or shrinks; there is no concept of allocations, or of pointers to allocations, from the OS paging system, right? It also seems to me that on Unix it would be nearly impossible to shrink the process's data segment because of fragmentation. So is creating a cross-platform memory allocator for Perl's memory allocation API impossible, or just not useful?
From reading Cygwin's docs, sbrk on Cygwin apparently has a system-wide limit of 384 MB per process (http://www.perlmonks.org/?node_id=541750). A system-wide setting can increase that. I assume Cygwin "reserves" but doesn't "allocate" that 384 MB range, using the Windows VM system to emulate sbrk.
If you include mmap, then from its man page it sounds identical to Windows's virtual memory allocator, and a cross-platform "allocator for allocators" internal API in perl would be very easy, possibly as easy as a large macro. I don't know how VM allocation works on all the other platforms Perl runs on; as a last resort, the allocator for allocators can be redirected to malloc. malloc.c seems to have been written around sbrk, and I couldn't find any code in it that will ever release memory to the OS using sbrk or brk.
This is getting way out of my comfort zone. I don’t know enough about this to contribute any more to this aspect of the thread.
--
Father Chrysostomos
On Sun, Apr 8, 2012 at 6:05 AM, Nicholas Clark <nick@ccl4.org> wrote:
On Sun\, Apr 08\, 2012 at 11:42:06AM +0100\, Dave Mitchell wrote:
On Sun\, Apr 08\, 2012 at 11:31:37AM +0100\, Nicholas Clark wrote:
But\, I'm suspecting\, that the only *real* fix to all of this mess is to garbage collect the OPs\, in some fashion.
Ah yes\, *that* quagmire. Anyway\, thanks for bisecting this.
No problem. I'm waiting for the HP-UX box to build things.
I was also wondering if it would be simple enough to add a --valgrind option to the bisect thingy to make this fall-off-a-log easy for anyone to do in future (ie valgrind --error-exitcode=1 ./perl ...).
I suggest using clang -faddress-sanitizer instead, as it is much faster, does not need such a hack and detects many more such errors than valgrind.
Similar errors are in various CPAN modules also.
--
Reini Urban
http://cpanel.net/ http://www.perl-compiler.org/
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
And the slab/pool belonging to the sub is freed when the sub is freed.
What happens to the ops attached to the regexp returned by sub { qr/(?{})/ }?
What is the value of PL_compcv when regular expressions are compiled? Does each qr// or m// with code blocks get its own compcv?
Do run-time code blocks get their own PL_compcv?
--
Father Chrysostomos
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
You mean something like this attachment?
--
Father Chrysostomos
From 14e817cdd7be799d37dc309a74b7c0da97fefba2 Mon Sep 17 00:00:00 2001
From: Father Chrysostomos <sprout@cpan.org>
Date: Fri, 22 Jun 2012 18:30:48 -0700
Subject: [PATCH] CV-based slab allocation for ops
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This addresses bugs #111462 and #112312 and part of #107000.
When a longjmp occurs during lexing\, parsing or compilation\, any ops in C autos that are not referenced anywhere are leaked.
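The leak described here can be reproduced in miniature. The following is only a sketch under stated assumptions: `toy_op`, `toy_parse` and the global `live_ops` counter are hypothetical stand-ins, and a bare `setjmp`/`longjmp` pair plays the role of perl's croak machinery.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf top_env;   /* hypothetical stand-in for the jump target of a croak */
static int live_ops = 0;  /* toy ops allocated but not yet freed */

typedef struct toy_op { int type; } toy_op;

static toy_op *toy_new_op(int type) {
    toy_op *o = malloc(sizeof *o);
    if (!o) abort();
    o->type = type;
    live_ops++;
    return o;
}

static void toy_free_op(toy_op *o) {
    free(o);
    live_ops--;
}

/* Builds an op, then optionally "croaks" before anything else can see it. */
static void toy_parse(int fail) {
    toy_op *o = toy_new_op(1);  /* referenced only by this C auto */
    if (fail)
        longjmp(top_env, 1);    /* o is now unreachable: leaked */
    toy_free_op(o);
}

/* Returns the number of ops still allocated after one parse attempt. */
static int toy_compile(int fail) {
    live_ops = 0;
    if (setjmp(top_env) == 0)
        toy_parse(fail);
    return live_ops;
}
```

`toy_compile(0)` leaves nothing behind, while `toy_compile(1)` leaves one op that no pointer refers to any more. Attaching every allocation to some owner that outlives the jump is what lets the leftovers be found and freed.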
This commit introduces op slabs that are attached to the currently-compiling CV. New ops are allocated on the slab. When an error occurs and the CV is freed, any ops remaining are freed.
This is based on Nick Ing-Simmons’ old experimental op slab implementation, but it had to be rewritten to work this way.
The old slab allocator has a pointer before each op that points to a reference count stored at the beginning of the slab. Freed ops are never reused. When the last op on a slab is freed\, the slab itself is freed. When a slab fills up\, a new one is created.
To allow iteration through the slab to free everything, I had to have two pointers; one points to the next item (op slot); the other points to the slab, for accessing the reference count. Ops come in different sizes, so adding sizeof(OP) to a pointer won’t work.
The old slab allocator puts the ops at the end of the slab first, the idea being that the leaves are allocated first, so the order will be cache-friendly as a result. I have preserved that order for a different reason: We don’t need to store the size of the slab (slabs vary in size; see below) if we can simply follow pointers to find the last op.
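A minimal sketch of that layout, assuming a slab measured in pointer-sized words and a header-only sentinel slot at the very end. The names (`toy_slot`, `toy_slab`, `toy_slab_alloc`) and the sentinel are illustrative assumptions, not the structures in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_slab toy_slab;

/* Header in front of every op: the two pointers the message describes. */
typedef struct toy_slot {
    struct toy_slot *next;  /* previously allocated slot (higher address) */
    toy_slab        *slab;  /* owning slab, where the refcount lives */
} toy_slot;

struct toy_slab {
    toy_slot *first;    /* most recently allocated slot */
    size_t    refcnt;   /* number of ops handed out from this slab */
    void     *words[];  /* storage, in pointer-sized words */
};

#define TOY_HDR_WORDS (sizeof(toy_slot) / sizeof(void *))

/* Create a slab of nwords words; a header-only sentinel marks its end. */
static toy_slab *toy_slab_new(size_t nwords) {
    toy_slab *slab;
    toy_slot *sentinel;
    assert(nwords >= TOY_HDR_WORDS);
    slab = malloc(sizeof *slab + nwords * sizeof(void *));
    if (!slab) abort();
    sentinel = (toy_slot *)(slab->words + nwords - TOY_HDR_WORDS);
    sentinel->next = NULL;
    sentinel->slab = slab;
    slab->first = sentinel;
    slab->refcnt = 0;
    return slab;
}

/* Carve a slot with nwords words of payload just below the newest slot. */
static void *toy_slab_alloc(toy_slab *slab, size_t nwords) {
    size_t need = TOY_HDR_WORDS + nwords;
    void **top = (void **)slab->first;
    toy_slot *slot;
    if ((size_t)(top - slab->words) < need)
        return NULL;              /* slab is full */
    slot = (toy_slot *)(top - need);
    slot->next = slab->first;
    slot->slab = slab;
    slab->first = slot;
    slab->refcnt++;
    return slot + 1;              /* payload follows the header */
}

/* Walk every slot, newest first; the sentinel (next == NULL) ends the walk. */
static size_t toy_slab_count(const toy_slab *slab) {
    size_t n = 0;
    const toy_slot *s;
    for (s = slab->first; s->next; s = s->next)
        n++;
    return n;
}
```

Because each slot records where the next one starts, slots of different sizes can be walked without knowing `sizeof` anything, and the end of the slab is found by following `next` rather than by storing a size.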
I tried eliminating reference counts altogether, by having all ops implicitly attached to PL_compcv when allocated and freed when the CV is freed. That also allowed op_free to skip FreeOp altogether, freeing ops faster. But that doesn’t work in those cases where ops need to survive beyond their CVs; e.g., re-evals.
The CV also has to have a reference count on the slab. Sometimes the first op created is immediately freed. If the reference count of the slab reaches 0, then it will be freed with the CV still pointing to it.
CVs use the new CVf_SLABBED flag to indicate that the CV has a reference count on the slab. When this flag is set, the slab is accessible via CvSTART when CvROOT is not set, or by subtracting two pointers (2*sizeof(I32 *)) from CvROOT when it is set. I decided to sneak the slab into CvSTART during compilation, because enlarging the xpvcv struct by another pointer would make all CVs larger, even though this patch benefits only a few (programs using string eval).
When the CVf_SLABBED flag is set\, the CV takes responsibility for freeing the slab. If CvROOT is not set when the CV is freed or undeffed\, it is assumed that a compilation error has occurred\, so the op slab is traversed and all the ops are freed.
Under normal circumstances, the CV forgets about its slab (decrementing the reference count) when the root is attached. So the slab reference counting that happens when ops are freed takes care of freeing the slab. In some cases, the CV is told to forget about the slab (cv_forget_slab) precisely so that the ops can survive after the CV is done away with.
Forgetting the slab when the root is attached is not strictly necessary, but avoids potential problems with CvROOT being written over. There is code all over the place, both in core and on CPAN, that does things with CvROOT, so forgetting the slab makes things more robust and avoids potential problems.
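The reference counting described in the last few paragraphs can be modelled with counters alone. This is a sketch, not the patch: `toy_cv`, `toy_slab` and the functions are hypothetical, `slabbed` stands in for CVf_SLABBED, and "freeing the slab" is reduced to setting a `released` flag.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_slab {
    size_t refcnt;    /* one per live op, plus one while a CV holds the slab */
    int    released;  /* set when the count reaches zero */
} toy_slab;

typedef struct toy_cv {
    toy_slab *slab;
    unsigned  slabbed : 1;  /* stands in for CVf_SLABBED */
} toy_cv;

static void toy_slab_decref(toy_slab *slab) {
    assert(slab->refcnt > 0 && !slab->released);
    if (--slab->refcnt == 0)
        slab->released = 1;
}

/* The compiling CV takes its own reference on the slab. */
static void toy_cv_take_slab(toy_cv *cv, toy_slab *slab) {
    cv->slab = slab;
    cv->slabbed = 1;
    slab->refcnt++;
}

static void toy_op_alloc(toy_slab *slab) { slab->refcnt++; }
static void toy_op_free(toy_slab *slab)  { toy_slab_decref(slab); }

/* Drop the CV's reference; from here on only the ops keep the slab alive. */
static void toy_cv_forget_slab(toy_cv *cv) {
    toy_slab *slab = cv->slab;
    if (!cv->slabbed)
        return;
    cv->slabbed = 0;
    cv->slab = NULL;
    toy_slab_decref(slab);
}
```

In this model, allocating and immediately freeing the first op leaves the count at 1 because of the CV's reference, so the slab survives. After `toy_cv_forget_slab`, freeing the last op is what releases it.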
Since the CV takes ownership of its slab when flagged\, that flag is never copied when a CV is cloned\, as one CV could free a slab that another CV still points to\, since forced freeing of ops ignores the reference count (but asserts that it looks right).
To avoid slab fragmentation, freed ops are marked as freed and attached to the slab’s freed chain (an idea stolen from DBM::Deep). Those freed ops are reused when possible. I did consider not reusing freed ops, but realised that would result in significantly higher memory usage for programs with large “if (DEBUG) {...}” blocks.
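A minimal sketch of a freed chain with first-fit reuse. The names are hypothetical, and `malloc` stands in for carving a fresh slot from the slab so that the example stays self-contained; the `fresh` counter exists only to make reuse observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_op {
    struct toy_op *next_free;  /* link in the freed chain; used only while freed */
    size_t         size;       /* capacity of this slot, in arbitrary units */
    unsigned       freed : 1;  /* set while the slot sits on the freed chain */
} toy_op;

typedef struct toy_pool {
    toy_op *freed;  /* head of the freed chain */
    size_t  fresh;  /* number of newly carved slots */
} toy_pool;

/* Reuse the first freed slot that is large enough, else carve a new one. */
static toy_op *toy_alloc(toy_pool *pool, size_t size) {
    toy_op **link;
    toy_op *o;
    for (link = &pool->freed; *link; link = &(*link)->next_free) {
        if ((*link)->size >= size) {
            o = *link;
            *link = o->next_free;  /* unlink from the chain */
            o->next_free = NULL;
            o->freed = 0;
            return o;
        }
    }
    o = malloc(sizeof *o);         /* stand-in for taking slab space */
    if (!o) abort();
    o->next_free = NULL;
    o->size = size;
    o->freed = 0;
    pool->fresh++;
    return o;
}

/* Mark the slot as freed and push it onto the chain. */
static void toy_free(toy_pool *pool, toy_op *o) {
    assert(!o->freed);
    o->freed = 1;
    o->next_free = pool->freed;
    pool->freed = o;
}
```

A request is served from the chain whenever a large enough freed slot exists, so code that is compiled and then discarded does not push the slab's high-water mark up.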
SAVEFREEOP was slightly problematic. Sometimes it can cause an op to be freed after its CV. If the CV has forcibly freed the ops on its slab and the slab itself, then we will be fiddling with a freed slab. Making SAVEFREEOP a no-op won’t help, as sometimes an op can be savefreed when there is no compilation error, so the op would never be freed. It holds a reference count on the slab, so the whole slab would leak. So SAVEFREEOP now sets a special flag on the op (->op_savefree). The forced freeing of ops after a compilation error won’t free any ops thus marked.
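One way to picture the flag, as a sketch only: the forced sweep skips marked ops, so the slab's count stays above zero until the deferred free runs. The types and functions are hypothetical, and the assumption that the slab stays alive through the marked op's reference is a reading of the message, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_NOPS = 4 };

typedef struct toy_op {
    unsigned live : 1;
    unsigned savefree : 1;  /* stands in for ->op_savefree */
} toy_op;

typedef struct toy_slab {
    size_t refcnt;    /* one per live op */
    int    released;  /* set when the last op goes away */
    toy_op ops[TOY_NOPS];
} toy_slab;

static void toy_slab_init(toy_slab *slab) {
    size_t i;
    slab->refcnt = TOY_NOPS;
    slab->released = 0;
    for (i = 0; i < TOY_NOPS; i++) {
        slab->ops[i].live = 1;
        slab->ops[i].savefree = 0;
    }
}

static void toy_op_free(toy_slab *slab, toy_op *o) {
    assert(o->live && !slab->released);
    o->live = 0;
    if (--slab->refcnt == 0)
        slab->released = 1;
}

/* Mark an op whose free has been queued for later. */
static void toy_save_free_op(toy_op *o) { o->savefree = 1; }

/* Forced sweep after a compilation error; returns how many ops it freed. */
static size_t toy_force_free(toy_slab *slab) {
    size_t i, n = 0;
    for (i = 0; i < TOY_NOPS; i++) {
        toy_op *o = &slab->ops[i];
        if (o->live && !o->savefree) {
            toy_op_free(slab, o);
            n++;
        }
    }
    return n;
}
```

With one op marked, the sweep frees the other three and the slab is not released. The later, deferred free of the marked op then touches a slab that still exists, and releases it.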
Since many pieces of code create tiny subroutines consisting of only a few ops\, and since a huge slab would be quite a bit of baggage for those to carry around\, the first slab is always very small. To avoid allocating too many slabs for a single CV\, each subsequent slab is twice the size of the previous.
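The effect of doubling is easy to check with arithmetic. In this sketch the first-slab size of 64 words is an arbitrary assumption (the message gives no number), and per-slot headers and leftover space at the end of each slab are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FIRST_SLAB_WORDS 64  /* hypothetical; the message gives no number */

/* Size of the slab that follows one of prev_words (0 means "no slab yet"). */
static size_t toy_next_slab_words(size_t prev_words) {
    return prev_words == 0 ? TOY_FIRST_SLAB_WORDS : prev_words * 2;
}

/* How many slabs a CV needs to hold total_words of ops.
 * No overflow guard: fine for a sketch, not for real sizes near SIZE_MAX. */
static size_t toy_slabs_needed(size_t total_words) {
    size_t slabs = 0, capacity = 0, size = 0;
    while (capacity < total_words) {
        size = toy_next_slab_words(size);
        capacity += size;
        slabs++;
    }
    return slabs;
}
```

A tiny sub fits in the single small slab, while the slab count grows only logarithmically with the amount of op memory a large sub needs.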
Smartmatch expects to be able to allocate an op at run time, run it, and then throw it away. For that to work the op is simply mallocked when PL_compcv hasn’t been set up. So all slab-allocated ops are marked as such (->op_slabbed), to distinguish them from mallocked ops.
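A minimal sketch of that dispatch, under assumed names: `toy_compiling` stands in for "PL_compcv has been set up", a static array plays the slab, and the `slabbed` bit mirrors the role of ->op_slabbed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { TOY_SLAB_OPS = 8 };

typedef struct toy_op {
    int      type;
    unsigned slabbed : 1;  /* 1 if the op lives on the slab, 0 if malloc'd */
} toy_op;

static toy_op toy_slab[TOY_SLAB_OPS];
static size_t toy_slab_used;
static int    toy_compiling;  /* stands in for "PL_compcv is set up" */

static toy_op *toy_new_op(int type) {
    toy_op *o;
    if (toy_compiling) {
        /* A real allocator would start another slab instead of asserting. */
        assert(toy_slab_used < TOY_SLAB_OPS);
        o = &toy_slab[toy_slab_used++];
        o->slabbed = 1;
    } else {
        o = malloc(sizeof *o);
        if (!o) abort();
        o->slabbed = 0;
    }
    o->type = type;
    return o;
}

/* Returns 1 if the op belonged to the slab, 0 if it was free()d. */
static int toy_free_op(toy_op *o) {
    if (o->slabbed)
        return 1;  /* slab bookkeeping would go here */
    free(o);
    return 0;
}
```

The flag is what makes freeing safe: without it, the free path could not tell whether to hand a pointer to `free` or to the slab bookkeeping.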
All of this is kept under lock and key via #ifdef PERL_CORE, as it should be completely transparent. If it isn’t, I would consider that a bug.
I have left the old slab allocator (PL_OP_SLAB_ALLOC) in place\, as it is used by PERL_DEBUG_READONLY_OPS\, which I am not about to rewrite. :-)
On Fri Jun 22 18:31:51 2012\, sprout wrote:
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
You mean something like this attachment?
I’ve broken it into a few commits and pushed it to the smoke-me/slop branch. It still contains a megapatch though, because most of it is interdependent.
--
Father Chrysostomos
On Fri\, Jun 22\, 2012 at 06:31:52PM -0700\, Father Chrysostomos via RT wrote:
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
You mean something like this attachment?
yes\, thanks :-)
From a cursory read of the commit message\, it looks good. The only thing that stood out for me was:
I tried eliminating reference counts altogether, by having all ops implicitly attached to PL_compcv when allocated and freed when the CV is freed. That also allowed op_free to skip FreeOp altogether, freeing ops faster. But that doesn’t work in those cases where ops need to survive beyond their CVs; e.g., re-evals.
IIRC, all OPs allocated for /(?{})/ code blocks are now firmly owned by a CV:
1) for literal matches, /(?{})/, they are in the CV containing the match;
2) for literal qr, qr/(?{})/, they are stored in an anon CV which is attached to the regex, and cloned each time the qr// is run;
3) for run-time code, the pattern is wrapped in a qr// and reparsed, so (2) applies;
4) when a qr// is interpolated into another pattern, e.g. $r = qr/(?{})/; /a-$r/, then the new regex contains not only pointers to the ops within the (?{}), but also a pointer to the CV those ops are embedded in: so they won't outlive the CV.
-- More than any other time in history\, mankind faces a crossroads. One path leads to despair and utter hopelessness. The other\, to total extinction. Let us pray we have the wisdom to choose correctly. -- Woody Allen
On Mon Jun 25 04:56:58 2012\, davem wrote:
On Fri\, Jun 22\, 2012 at 06:31:52PM -0700\, Father Chrysostomos via RT wrote:
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
You mean something like this attachment?
yes\, thanks :-)
From a cursory read of the commit message\, it looks good. The only thing that stood out for me was:
I tried eliminating reference counts altogether, by having all ops implicitly attached to PL_compcv when allocated and freed when the CV is freed. That also allowed op_free to skip FreeOp altogether, freeing ops faster. But that doesn’t work in those cases where ops need to survive beyond their CVs; e.g., re-evals.
IIRC, all OPs allocated for /(?{})/ code blocks are now firmly owned by a CV:
1) for literal matches, /(?{})/, they are in the CV containing the match;
2) for literal qr, qr/(?{})/, they are stored in an anon CV which is attached to the regex, and cloned each time the qr// is run;
3) for run-time code, the pattern is wrapped in a qr// and reparsed, so (2) applies;
4) when a qr// is interpolated into another pattern, e.g. $r = qr/(?{})/; /a-$r/, then the new regex contains not only pointers to the ops within the (?{}), but also a pointer to the CV those ops are embedded in: so they won't outlive the CV.
The ops may all be attached to CVs, but I know that sometimes the CV that the op is finally attached to is not the same one that was PL_compcv when the op was created.
Stepping through the debugger while working on it, I found out this:
The PMFUNC branch of the term rule in perly.y calls start_subparse. Then a const op is created in toke.c to hold the pattern (I don’t remember exactly where), and then op.c:pmruntime is called, hence this hunk:
@@ -4373,6 +4579,10 @@ Perl_pmruntime(pTHX_ OP *o, OP *expr, bool isreg, I32 floor)
      * confident that nothing used that CV's pad while the
      * regex was parsed */
     assert(AvFILLp(PL_comppad) == 0); /* just @_ */
+#ifndef PL_OP_SLAB_ALLOC
+    /* But we know that one op is using this CV's slab. */
+    cv_forget_slab(PL_compcv);
+#endif
     LEAVE_SCOPE(floor);
     pm->op_pmflags &= ~PMf_HAS_CV;
 }
--
Father Chrysostomos
On Mon\, Jun 25\, 2012 at 08:20:27AM -0700\, Father Chrysostomos via RT wrote:
The ops may all be attached to CVs, but I know that sometimes the CV that the op is finally attached to is not the same one that was PL_compcv when the op was created.
Stepping through the debugger while working on it, I found out this:
The PMFUNC branch of the term rule in perly.y calls start_subparse. Then a const op is created in toke.c to hold the pattern (I don’t remember exactly where), and then op.c:pmruntime is called, hence this hunk:
@@ -4373,6 +4579,10 @@ Perl_pmruntime(pTHX_ OP *o, OP *expr, bool isreg, I32 floor)
      * confident that nothing used that CV's pad while the
      * regex was parsed */
     assert(AvFILLp(PL_comppad) == 0); /* just @_ */
+#ifndef PL_OP_SLAB_ALLOC
+    /* But we know that one op is using this CV's slab. */
+    cv_forget_slab(PL_compcv);
+#endif
     LEAVE_SCOPE(floor);
     pm->op_pmflags &= ~PMf_HAS_CV;
 }
I'm confused. My understanding of that code path is that toke.c creates a PMOP (using the "main" PL_compcv); *then* start_subparse() is called (changing PL_compcv), *then* pmruntime() runs the "whoops, guessed wrong" code and frees the inner PL_compcv. I don't see any ops being created between the start_subparse and the pmruntime ???
-- Never do today what you can put off till tomorrow.
On Mon Jun 25 09:31:07 2012\, davem wrote:
On Mon\, Jun 25\, 2012 at 08:20:27AM -0700\, Father Chrysostomos via RT wrote:
The ops may all be attached to CVs, but I know that sometimes the CV that the op is finally attached to is not the same one that was PL_compcv when the op was created.
Stepping through the debugger while working on it, I found out this:
The PMFUNC branch of the term rule in perly.y calls start_subparse. Then a const op is created in toke.c to hold the pattern (I don’t remember exactly where), and then op.c:pmruntime is called, hence this hunk:
@@ -4373,6 +4579,10 @@ Perl_pmruntime(pTHX_ OP *o, OP *expr, bool isreg, I32 floor)
      * confident that nothing used that CV's pad while the
      * regex was parsed */
     assert(AvFILLp(PL_comppad) == 0); /* just @_ */
+#ifndef PL_OP_SLAB_ALLOC
+    /* But we know that one op is using this CV's slab. */
+    cv_forget_slab(PL_compcv);
+#endif
     LEAVE_SCOPE(floor);
     pm->op_pmflags &= ~PMf_HAS_CV;
 }
I'm confused. My understanding of that code path is that toke.c creates a PMOP (using the "main" PL_compcv); *then* start_subparse() is called (changing PL_compcv), *then* pmruntime() runs the "whoops, guessed wrong" code and frees the inner PL_compcv. I don't see any ops being created between the start_subparse and the pmruntime ???
Yacc confuses me\, too. I can never figure out what order things are going to happen. But look at this gdb session (using the smoke-me/slop branch). An op is allocated between the calls to start_subparse and pmruntime. In particular\, this message comes from the op allocated in between (-DS output):
allocating op at 305b64\, slab 305a80 at -e line 1.
The CV discarded in pmruntime has the same slab address (it’s stored in CvSTART, aka ((XPVCV*)PL_compcv->sv_any)->xcv_start_u.xcv_start).
$ gdb --args ./miniperl -DS -e 'qr/(?#(?{)/'
GNU gdb 6.3.50-20050815 (Apple version gdb-1469) (Wed May 5 04:30:06 UTC 2010)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries .... done

(gdb) break Perl_start_subparse
Breakpoint 1 at 0x42d4f: file toke.c, line 10759.
(gdb) break Perl_pmruntime
Breakpoint 2 at 0x2ed48: file op.c, line 4474.
(gdb) break Perl_Slab_Alloc
Breakpoint 3 at 0x132b5: file op.c, line 331.
(gdb) run
Starting program: /Users/sprout/Perl/perl.git-copy/miniperl -DS -e qr/\(\?\#\(\?\{\)/
Reading symbols for shared libraries +++. done

Breakpoint 3, Perl_Slab_Alloc (sz=48) at op.c:331
331 if (!PL_compcv || CvROOT(PL_compcv)
(gdb) c
Continuing.
Current language: auto; currently c++
allocating op at 30595c, slab 305890 at -e line 1.

Breakpoint 1, Perl_start_subparse (is_format=0, flags=128) at toke.c:10759
10759 const I32 oldsavestack_ix = PL_savestack_ix;
(gdb) up
#1 0x00073de7 in Perl_yyparse (gramtype=258) at perly.y:1266
1266 $\

Breakpoint 3, Perl_Slab_Alloc (sz=24) at op.c:331
331 if (!PL_compcv || CvROOT(PL_compcv)
(gdb) bt
#0 Perl_Slab_Alloc (sz=24) at op.c:331
#1 0x0001a167 in Perl_newSVOP (type=5, flags=0, sv=0x8222f0) at op.c:4847
#2 0x000560d5 in S_scan_const (start=0x305840 "(?#(?{)") at toke.c:3578
#3 0x0005b572 in Perl_yylex () at toke.c:4743
#4 0x00070f05 in Perl_yyparse (gramtype=258) at perly.c:430
#5 0x0000d3a1 in S_parse_body (env=0x0, xsinit=0x30740 <_ZL7xs_initv>) at perl.c:2256
#6 0x0000e479 in perl_parse (my_perl=0x300190, xsinit=0x30740 <_ZL7xs_initv>, argc=4, argv=0xbffff830, env=0x0) at perl.c:1643
#7 0x000307e7 in main (argc=4, argv=0xbffff830, env=0xbffff844) at miniperlmain.c:117
(gdb) c
Continuing.
allocating op at 305b64, slab 305a80 at -e line 1.

Breakpoint 2, Perl_pmruntime (o=0x30595c, expr=0x305b64, isreg=true, floor=38) at op.c:4474
4474 bool is_trans = (o->op_type == OP_TRANS || o->op_type == OP_TRANSR);
(gdb) clear Perl_Slab_Alloc
Deleted breakpoint 3
(gdb) n
4482 if (is_trans || o->op_type == OP_SUBST) {
(gdb)
4504 return pmtrans(o, expr, repl);
(gdb)
4482 if (is_trans || o->op_type == OP_SUBST) {
(gdb)
4515 if (expr->op_type == OP_LIST) {
(gdb)
4527 else if (expr->op_type != OP_CONST)
(gdb)
4530 LINKLIST(expr);
(gdb) s
4534 if (expr->op_type == OP_LIST) {
(gdb)
4571 PL_hints |= HINT_BLOCK_SCOPE;
(gdb)
4573 assert(floor==0 || (pm->op_pmflags & PMf_HAS_CV));
(gdb)
4575 if (is_compiletime) {
(gdb)
4576 U32 rx_flags = pm->op_pmflags & RXf_PMf_COMPILETIME;
(gdb)
4577 regexp_engine const *eng = current_re_engine();
(gdb) n
4580 rx_flags |= RXf_SPLIT;
(gdb)
4582 if (!has_code || !eng->op_comp) {
(gdb)
4585 if ((pm->op_pmflags & PMf_HAS_CV) && !has_code) {
(gdb)
4591 assert(AvFILLp(PL_comppad) == 0); /* just @_ */
(gdb)
4594 cv_forget_slab(PL_compcv);
(gdb) p ((XPVCV*)PL_compcv->sv_any)->xcv_start_u.xcv_start
$2 = (OP *) 0x305a80
--
Father Chrysostomos
On Mon\, Jun 25\, 2012 at 11:09:50AM -0700\, Father Chrysostomos via RT wrote:
Breakpoint 1, Perl_start_subparse (is_format=0, flags=128) at toke.c:10759
10759 const I32 oldsavestack_ix = PL_savestack_ix;
(gdb) up

Breakpoint 3, Perl_Slab_Alloc (sz=24) at op.c:331
331 if (!PL_compcv || CvROOT(PL_compcv)
(gdb) bt
#0 Perl_Slab_Alloc (sz=24) at op.c:331
#1 0x0001a167 in Perl_newSVOP (type=5, flags=0, sv=0x8222f0) at op.c:4847
#2 0x000560d5 in S_scan_const (start=0x305840 "(?#(?{)") at toke.c:3578
Ah\, *that* const op ;-) Somehow I missed triggering an op alloc breakpoint when I tried it earlier.
In which case, as regards my code, yuck! That "we guessed we had a code block but it turns out we didn't" bit of code was always a bit of a hack, and now that I realise it leaves an op allocated in the wrong CV, I like it even less.
I'm tempted to eliminate it altogether. Would doing this enable you to simplify the slab code?
-- But Pity stayed his hand. "It's a pity I've run out of bullets"\, he thought. -- "Bored of the Rings"
On Mon Jun 25 14:41:06 2012\, davem wrote:
On Mon\, Jun 25\, 2012 at 11:09:50AM -0700\, Father Chrysostomos via RT wrote:
Breakpoint 1, Perl_start_subparse (is_format=0, flags=128) at toke.c:10759
10759 const I32 oldsavestack_ix = PL_savestack_ix;
(gdb) up

Breakpoint 3, Perl_Slab_Alloc (sz=24) at op.c:331
331 if (!PL_compcv || CvROOT(PL_compcv)
(gdb) bt
#0 Perl_Slab_Alloc (sz=24) at op.c:331
#1 0x0001a167 in Perl_newSVOP (type=5, flags=0, sv=0x8222f0) at op.c:4847
#2 0x000560d5 in S_scan_const (start=0x305840 "(?#(?{)") at toke.c:3578
Ah, *that* const op ;-) Somehow I missed triggering an op alloc breakpoint when I tried it earlier.
In which case, as regards my code, yuck! That "we guessed we had a code block but it turns out we didn't" bit of code was always a bit of a hack, and now that I realise it leaves an op allocated in the wrong CV, I like it even less.
I'm tempted to eliminate it altogether. Would doing this enable you to simplify the slab code?
No, because I still have to take SAVEFREEOP into account. :-) I could fiddle to get savestack items in the right order, but what I have currently is far more robust than the alternative.
The three things I didn’t have working with my earlier (non-refcounted) system were:
• smartmatch
• SAVEFREEOP - I just made it a no-op to get tests passing, which leaked ops when there were no errors
• re-evals
smartmatch is solved by using malloc.
SAVEFREEOP is solved using the refcounting system. That solves re-evals “for free”, except for the one cv_forget_slab call in pmruntime.
--
Father Chrysostomos
On Sat Jun 23 16:32:20 2012\, sprout wrote:
On Fri Jun 22 18:31:51 2012\, sprout wrote:
On Wed Apr 25 03:38:30 2012\, davem wrote:
I think another suggestion that was mooted a while ago would be to allocate OPs from a pool or slab\, with a new pool/slab started each time we start compiling a new sub\, and the pool in some way marked as complete at the end of compiling the sub. On croaking\, all the OPs in the unfinished pools are freed. That way most code doesn't need to be modified.
You mean something like this attachment?
I’ve broken it into a few commits and pushed it to the smoke-me/slop branch. It still contains a megapatch though, because most of it is interdependent.
After two weeks writing the initial patch and another week tweaking and testing it, I’ve finally merged it as c5fb998.
I just had another look at 8be227ab5e, which is the main part of it, and I think that’s the longest commit message I’ve written!
It’s probably also my greenest patch.
--
Father Chrysostomos
@cpansprout - Status changed from 'open' to 'resolved'
On Mon Jun 25 14:50:38 2012\, sprout wrote:
On Mon Jun 25 14:41:06 2012\, davem wrote:
That "we guessed we had a code block but it turns out we didn't" bit of code was always a bit of a hack, and now that I realise it leaves an op allocated in the wrong CV, I like it even less.
I'm tempted to eliminate it altogether. Would doing this enable you to simplify the slab code?
No, because I still have to take SAVEFREEOP into account. :-) I could fiddle to get savestack items in the right order, but what I have currently is far more robust than the alternative.
The three things I didn’t have working with my earlier (non-refcounted) system were:
• smartmatch
• SAVEFREEOP - I just made it a no-op to get tests passing, which leaked ops when there were no errors
• re-evals
Attached, for posterity, is an early diff containing the alternative mentioned above.
This was before the re-eval rewrite was merged\, before newSTUB\, and before I had thought of the CVf_SLABBED flag. The corresponding workarounds are a twisted maze. The only advantage was that freeing a slab was faster\, but probably less robust\, in that some ops might not be cleared and no check was done.
--
Father Chrysostomos
Migrated from rt.perl.org#112312 (status was 'resolved')