This is a bug report for perl from robin.hill@biowisdom.com, generated with the help of perlbug 1.36 running under perl 5.10.0.
-----------------------------------------------------------------
I've been having problems with a script consuming all memory on the system and have tracked this down to the regex. The problem only seems to occur with a combination of quoted variables and singular character classes.
The following example script steadily increases in memory usage while running:
#########################################################
#!/usr/bin/perl -w

use strict;
use warnings;
use Time::HiRes qw(usleep);

my $text = 'Test string';

for my $str (1..10000) {
    my ($res) = $text =~ /\Q$str\E[a][b][c][d][e][f]/;
    usleep(5);
}
#########################################################
Changing the character classes to include more than one character appears to eliminate the leak. (In the actual script I'm trying to match brackets, and I'm following the recommendation to use single-character classes instead of escaping the metacharacter.)
On Wed, Oct 01, 2008 at 05:51:41AM -0700, robin.hill@biowisdom.com (via RT) wrote:
The following example script steadily increases in memory usage while running:
#########################################################
#!/usr/bin/perl -w

use strict;
use warnings;
use Time::HiRes qw(usleep);

my $text = 'Test string';

for my $str (1..10000) {
    my ($res) = $text =~ /\Q$str\E[a][b][c][d][e][f]/;
    usleep(5);
}
#########################################################
Changing the character classes to include more than one character appears to eliminate the leak.
The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:
while (1) { qr/[a]/; }
-- The Enterprise successfully ferries an alien VIP from one place to another without serious incident. -- Things That Never Happen in "Star Trek" #7
The RT System itself - Status changed from 'new' to 'open'
On Sat, Oct 04, 2008 at 12:55:04PM +0100, Dave Mitchell wrote:
The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:
while (1) { qr/[a]/; }
Are you sure that it's character classes?
./perl -le 'while (1) { qr// }'
merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:
$ PERL_DESTRUCT_LEVEL=2 valgrind ./perl -le 'for (1..100000) { qr/[a]/ }'
==31194== Memcheck, a memory error detector.
==31194== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==31194== Using LibVEX rev 1658, a library for dynamic binary translation.
==31194== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==31194== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
==31194== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==31194== For more details, rerun with: -v
==31194==
==31194==
==31194== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==31194== malloc/free: in use at exit: 0 bytes in 0 blocks.
==31194== malloc/free: 101,653 allocs, 101,653 frees, 4,948,035 bytes allocated.
==31194== For counts of detected errors, rerun with: -v
==31194== All heap blocks were freed -- no leaks are possible.
Which makes me think that the problem is somewhere in how regexps are allocated (which differs between 5.10.x and 5.11, though both seem to exhibit the same problem). This seems to be consistent with a bug report that the reference count of regexps is 1 too high in 5.11 (where they are now first-class SVs).
Nicholas Clark
On Sat, Oct 04, 2008 at 02:51:09PM +0100, Nicholas Clark wrote:
Are you sure that it's character classes?
./perl -le 'while (1) { qr// }'
merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:
Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.
This doesn't leak:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /$n/;
}
This does:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /[a]${n}/;
}
(The non-constant $n is to defeat regex compilation caching).
-- Fire extinguisher (n) a device for holding open fire doors.
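The remark about defeating regex compilation caching can be illustrated with a toy model. The C sketch below is an analogy under stated assumptions, not perl's actual implementation: `MatchOp`, `use_pattern`, and `count_compiles` are hypothetical names, and the "cache" is simply "remember the last pattern string seen at this match site and recompile only when it changes". With a constant `$n` the pattern would be compiled once, so a leak that happens per compilation would stay invisible; toggling `$n` between 0 and 1 forces a compilation on every iteration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical match site: remembers the last interpolated pattern string. */
typedef struct {
    char last[64];   /* pattern text used for the previous compilation */
    int  valid;      /* non-zero once something has been compiled */
    int  compiles;   /* how many times the pattern was (re)compiled */
} MatchOp;

/* Recompile only when the interpolated pattern text differs from last time. */
void use_pattern(MatchOp *op, const char *pattern) {
    if (op->valid && strcmp(op->last, pattern) == 0) {
        return;                      /* cache hit: no compilation */
    }
    assert(strlen(pattern) < sizeof op->last);
    strcpy(op->last, pattern);
    op->valid = 1;
    op->compiles++;                  /* cache miss: compile again */
}

/* Mimic the loop from the thread; `alternate` toggles $n = 1 - $n. */
int count_compiles(int iterations, int alternate) {
    MatchOp op = {0};
    char pattern[64];
    int n = 1;

    for (int i = 0; i < iterations; i++) {
        if (alternate) {
            n = 1 - n;
        }
        snprintf(pattern, sizeof pattern, "[a]%d", n);
        use_pattern(&op, pattern);
    }
    return op.compiles;
}
```

In this model, `count_compiles(1000, 0)` compiles once, while `count_compiles(1000, 1)` compiles on every pass, which is the condition needed to make a per-compilation leak show up as steady growth.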
On Sat, Oct 04, 2008 at 03:02:32PM +0100, Dave Mitchell wrote:
Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.
This doesn't leak:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /$n/;
}
This does:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /[a]${n}/;
}
(The non-constant $n is to defeat regex compilation caching).
Hmm, but that one still cleans up after itself:
$ cat 59516
my $n = 1;
for (1..10000) {
    $n = 1 - $n;
    "abc" =~ /[a]${n}/;
}
$ PERL_DESTRUCT_LEVEL=1 valgrind ./perl 59516
==31497== Memcheck, a memory error detector.
==31497== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==31497== Using LibVEX rev 1658, a library for dynamic binary translation.
==31497== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==31497== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
==31497== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==31497== For more details, rerun with: -v
==31497==
==31497==
==31497== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==31497== malloc/free: in use at exit: 0 bytes in 0 blocks.
==31497== malloc/free: 100,770 allocs, 100,770 frees, 4,650,887 bytes allocated.
==31497== For counts of detected errors, rerun with: -v
==31497== All heap blocks were freed -- no leaks are possible.
(the while loop version still munches away though)
Nicholas Clark
On 2008-10-04, at 14:51:09 +0100, Nicholas Clark wrote:
Are you sure that it's character classes?
./perl -le 'while (1) { qr// }'
Fixed with the following change:
Change 34506 by mhx@mhx-r2d2 on 2008/10/18 18:04:40
Fix memory leak in qr// operator. This was most probably introduced with #30849.
Affected files ...
... //depot/perl/pp_hot.c#578 edit
Differences ...
==== //depot/perl/pp_hot.c#578 (text) ====
@@ -1212,6 +1212,7 @@
 	if (pkg) {
 	    HV* const stash = gv_stashpv(SvPV_nolen(pkg), GV_ADD);
+	    SvREFCNT_dec(pkg);
 	    (void)sv_bless(rv, stash);
 	}
-- The world is moving so fast these days that the man who says it can't be done is generally interrupted by someone doing it. -- E. Hubbard
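Change 34506 adds a single `SvREFCNT_dec(pkg)`, which suggests that `pkg` carries a reference the calling code owns and must drop once it has been used. The C sketch below models that ownership rule with a toy reference-counted object. Everything in it (`ToySV`, `toy_new`, `toy_dec`, `toy_package_name`, `bless_leaky`, `bless_fixed`, `live_after`) is hypothetical and only stands in for the real perl internals.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy reference-counted string. A stand-in for an SV, not perl's real type. */
typedef struct {
    int   refcnt;
    char *buf;
} ToySV;

int live_objects = 0;   /* ToySVs created and not yet freed */

/* Create an object holding a copy of s; the caller owns the one reference. */
ToySV *toy_new(const char *s) {
    ToySV *sv = malloc(sizeof *sv);
    char *copy = malloc(strlen(s) + 1);
    if (sv == NULL || copy == NULL) {
        abort();
    }
    strcpy(copy, s);
    sv->refcnt = 1;
    sv->buf = copy;
    live_objects++;
    return sv;
}

/* Drop one reference and free the object when the count reaches zero. */
void toy_dec(ToySV *sv) {
    assert(sv->refcnt > 0);
    if (--sv->refcnt == 0) {
        free(sv->buf);
        free(sv);
        live_objects--;
    }
}

/* Assumed behaviour: each call hands back a fresh object the caller owns. */
ToySV *toy_package_name(void) {
    return toy_new("Regexp");
}

/* Before the fix (in this model): the name is used and then forgotten. */
size_t bless_leaky(void) {
    ToySV *pkg = toy_package_name();
    return strlen(pkg->buf);         /* pkg is never released */
}

/* After the fix: the reference is dropped once the name has been used. */
size_t bless_fixed(void) {
    ToySV *pkg = toy_package_name();
    size_t len = strlen(pkg->buf);
    toy_dec(pkg);
    return len;
}

/* Run fn repeatedly and report how many objects it left behind. */
int live_after(size_t (*fn)(void), int iterations) {
    int before = live_objects;
    for (int i = 0; i < iterations; i++) {
        (void)fn();
    }
    return live_objects - before;
}
```

In this model `live_after(bless_leaky, 1000)` leaves 1000 objects behind, one per call, while `live_after(bless_fixed, 1000)` leaves none. That is the same shape as a `qr//` loop whose memory grows by a small, fixed amount per iteration.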
On 2008-10-04, at 15:02:32 +0100, Dave Mitchell wrote:
Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.
This doesn't leak:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /$n/;
}
This does:
my $n = 1;
while (1) {
    $n = 1 - $n;
    "abc" =~ /[a]${n}/;
}
(The non-constant $n is to defeat regex compilation caching).
Fixed with the following change:
Change 34507 by mhx@mhx-r2d2 on 2008/10/18 18:11:57
Fix memory leak in // caused by single-char character class optimization. This was most probably introduced with #28262. This change fixes perl #59516.
Affected files ...
... //depot/perl/regcomp.c#660 edit
Differences ...
==== //depot/perl/regcomp.c#660 (text) ====
@@ -8350,6 +8350,9 @@
         *STRING(ret)= (char)value;
         STR_LEN(ret)= 1;
         RExC_emit += STR_SZ(1);
+        if (listsv) {
+            SvREFCNT_dec(listsv);
+        }
         return ret;
     }
     /* optimize case-insensitive simple patterns (e.g. /[a-z]/i) */
-- Langsam's Laws: (1) Everything depends. (2) Nothing is always. (3) Everything is sometimes.
Fixed in bleadperl by change #34507.
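Change 34507 releases `listsv` on the early-return path taken when a character class holds a single character. The sketch below shows why only that path would leak, which matches the report that multi-character classes did not. It is a toy model under stated assumptions: `scratch_new`, `scratch_free`, `compile_class`, and `NodeKind` are hypothetical names, and the general path is assumed to take ownership of the list (modelled here by freeing it immediately).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { NODE_EXACT, NODE_CLASS } NodeKind;

int scratch_live = 0;   /* scratch lists allocated and not yet freed */

/* Allocate the per-class scratch list (a stand-in for listsv). */
char *scratch_new(const char *members) {
    char *list = malloc(strlen(members) + 1);
    if (list == NULL) {
        abort();
    }
    strcpy(list, members);
    scratch_live++;
    return list;
}

void scratch_free(char *list) {
    assert(scratch_live > 0);
    free(list);
    scratch_live--;
}

/*
 * Compile a character class given as a string of its members.
 * release_on_fast_path == 0 models the code before the patch,
 * release_on_fast_path != 0 models the code after it.
 */
NodeKind compile_class(const char *members, int release_on_fast_path) {
    char *list = scratch_new(members);

    if (strlen(members) == 1) {
        /* Optimization: a one-member class becomes an exact-match node,
           so the scratch list is not needed and nobody else will free it. */
        if (release_on_fast_path) {
            scratch_free(list);
        }
        return NODE_EXACT;
    }

    /* General path: assumed to hand the list on to the compiled node.
       The model frees it right away to stand for that hand-over. */
    scratch_free(list);
    return NODE_CLASS;
}
```

Calling `compile_class("a", 0)` leaves one scratch list behind per call, whereas `compile_class("a", 1)` and `compile_class("ab", 0)` leave none. In the model, as in the report, widening the class avoids the leak simply because it avoids the optimized path.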
@mhx - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#59516 (status was 'resolved')
Searchable as RT59516$