Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Memory leak with regex in 5.10.0 #9504

Closed p5pRT closed 16 years ago

p5pRT commented 16 years ago

Migrated from rt.perl.org#59516 (status was 'resolved')

Searchable as RT59516$

p5pRT commented 16 years ago

From robin.hill@biowisdom.com

Created by robin.hill@biowisdom.com

This is a bug report for perl from robin.hill@biowisdom.com, generated with the help of perlbug 1.36 running under perl 5.10.0.

-----------------------------------------------------------------

I've been having problems with a script consuming all memory on the system and have tracked this down to the regex. The problem only seems to occur with a combination of quoted variables and singular character classes.

The following example script steadily increases in memory usage while running:

    #########################################################
    #!/usr/bin/perl -w

    use strict;
    use warnings;
    use Time::HiRes qw(usleep);

    my $text = 'Test string';

    for my $str (1..10000) {
      my ($res) = $text =~ /\Q$str\E[a][b][c][d][e][f]/;
      usleep(5);
    }
    #########################################################

Changing the character classes to include more than one character appears to eliminate the leak. (In the actual script I'm trying to check for brackets and following the recommendation of using singular character classes instead of escaping the metacharacter).
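For readers comparing the alternatives mentioned here, the following is a minimal sketch of three ways to match a literal bracket after a quoted variable. The sub name and sample strings are made up for illustration. Only the single-character class form is the one the report ties to the leak; that the other two forms avoid it on 5.10.0 is an assumption, not something tested in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three ways to match a literal "(" right after an interpolated, quoted prefix.
# has_open_paren_after() is a hypothetical helper, not part of the original script.
sub has_open_paren_after {
    my ($text, $prefix) = @_;
    my $by_class   = $text =~ /\Q$prefix\E[(]/;  # single-char class (form used in the report)
    my $by_escape  = $text =~ /\Q$prefix\E\(/;   # backslash-escaped metacharacter
    my $by_quoting = $text =~ /\Q$prefix(\E/;    # let \Q...\E quote the bracket as well
    return ($by_class ? 1 : 0, $by_escape ? 1 : 0, $by_quoting ? 1 : 0);
}

my @hit  = has_open_paren_after('call foo(1)', 'foo');
my @miss = has_open_paren_after('call foo 1',  'foo');
print "hit:  @hit\n";    # all three forms agree on a match
print "miss: @miss\n";   # and on a non-match
```

All three forms accept and reject the same strings; they differ only in how the pattern is compiled.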

Perl Info

```
Flags:
    category=core
    severity=medium

This perlbug was built using Perl 5.10.0 - Tue Jul 15 14:37:49 UTC 2008
It is being executed now by Perl 5.10.0 - Tue Jul 15 14:31:57 UTC 2008.

Site configuration information for perl 5.10.0:

Configured by abuild at Tue Jul 15 14:31:57 UTC 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.25, archname=x86_64-linux-thread-multi
    uname='linux stravinsky 2.6.25 #1 smp 20080210 20:01:04 utc x86_64 x86_64 x86_64 gnulinux '
    config_args='-ds -e -Dprefix=/usr -Dvendorprefix=/usr -Dinstallusrbinperl -Dusethreads -Di_db -Di_dbm -Di_ndbm -Di_gdbm -Duseshrplib=true -Doptimize=-O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -g -Wall -pipe -Accflags=-DPERL_USE_SAFE_PUTENV'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DPERL_USE_SAFE_PUTENV -DDEBUGGING -fno-strict-aliasing -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -g -Wall -pipe',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DPERL_USE_SAFE_PUTENV -DDEBUGGING -fno-strict-aliasing -pipe'
    ccversion='', gccversion='4.3.1 20080507 (prerelease) [gcc-4_3-branch revision 135036]', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib64'
    libpth=/lib64 /usr/lib64 /usr/local/lib64
    libs=-lm -ldl -lcrypt -lpthread
    perllibs=-lm -ldl -lcrypt -lpthread
    libc=/lib64/libc-2.8.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.8'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.10.0/x86_64-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib64'

Locally applied patches:

@INC for perl 5.10.0:
    /home/hillrobi/svn/perl_scripts
    /home/hillrobi/svn/perl_scripts
    /home/hillrobi/svn/perl_scripts
    /usr/lib/perl5/5.10.0/x86_64-linux-thread-multi
    /usr/lib/perl5/5.10.0
    /usr/lib/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
    /usr/lib/perl5/site_perl/5.10.0
    /usr/lib/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.10.0
    /usr/lib/perl5/vendor_perl
    .

Environment for perl 5.10.0:
    HOME=/home/hillrobi
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/opt/oracle/OraHome1/lib:/opt/oracle/OraHome1/ctx/lib:/opt/oracle/OraHome1/lib32
    LOGDIR (unset)
    PATH=/home/hillrobi/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin:/opt/oracle/OraHome1/bin:/usr/local/bin:/home/hillrobi/bin
    PERL5LIB=/home/hillrobi/svn/perl_scripts:/home/hillrobi/svn/perl_scripts:/home/hillrobi/svn/perl_scripts
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 16 years ago

From @iabyn

On Wed, Oct 01, 2008 at 05:51:41AM -0700, robin.hill@biowisdom.com (via RT) wrote:

The following example script steadily increases in memory usage while running:

    #########################################################
    #!/usr/bin/perl -w

    use strict;
    use warnings;
    use Time::HiRes qw(usleep);

    my $text = 'Test string';

    for my $str (1..10000) {
      my ($res) = $text =~ /\Q$str\E[a][b][c][d][e][f]/;
      usleep(5);
    }
    #########################################################

Changing the character classes to include more than one character appears to eliminate the leak.

The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:

    while (1) {
        qr/[a]/;
    }

-- The Enterprise successfully ferries an alien VIP from one place to another without serious incident.   -- Things That Never Happen in "Star Trek" #7

p5pRT commented 16 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 16 years ago

From @nwc10

On Sat, Oct 04, 2008 at 12:55:04PM +0100, Dave Mitchell wrote:

The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:

    while (1) {
        qr/[a]/;
    }

Are you sure that it's character classes?

./perl -le 'while (1) { qr// }'

merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:

    $ PERL_DESTRUCT_LEVEL=2 valgrind ./perl -le 'for (1..100000) { qr/[a]/ }'
    ==31194== Memcheck, a memory error detector.
    ==31194== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
    ==31194== Using LibVEX rev 1658, a library for dynamic binary translation.
    ==31194== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
    ==31194== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
    ==31194== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
    ==31194== For more details, rerun with: -v
    ==31194==
    ==31194==
    ==31194== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
    ==31194== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==31194== malloc/free: 101,653 allocs, 101,653 frees, 4,948,035 bytes allocated.
    ==31194== For counts of detected errors, rerun with: -v
    ==31194== All heap blocks were freed -- no leaks are possible.

Which makes me think that the problem is somewhere in how regexps are allocated (which differs between 5.10.x and 5.11, but both seem to exhibit the same problem). This seems to be consistent with a bug report that the reference count of regexps is 1 too high in 5.11 (where they are now first class SVs).
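One way to look at such a reference count from Perl space is the core B module, which exposes `REFCNT` on the SV behind a reference. This is only a sketch: the sub name `regexp_refcnt` is made up here, and which value a given perl reports is not something this thread establishes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B ();

# Report the reference count of the SV that a qr// reference points to.
# $_[0] is used directly so that no extra copy of the reference is made,
# which would otherwise raise the count being inspected.
sub regexp_refcnt {
    return B::svref_2object($_[0])->REFCNT;
}

my $re = qr/[a]/;
printf "reference count of the compiled regexp: %d\n", regexp_refcnt($re);
```

A count that is higher than the number of references the script actually holds would point in the direction described above.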

Nicholas Clark

p5pRT commented 16 years ago

From @iabyn

On Sat, Oct 04, 2008 at 02:51:09PM +0100, Nicholas Clark wrote:

On Sat, Oct 04, 2008 at 12:55:04PM +0100, Dave Mitchell wrote:

The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:

    while (1) {
        qr/[a]/;
    }

Are you sure that it's character classes?

./perl -le 'while (1) { qr// }'

merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:

Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.

This doesn't leak:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /$n/;
    }

This does:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /[a]${n}/;
    }

(The non-constant $n is to defeat regex compilation caching).

-- Fire extinguisher (n) a device for holding open fire doors.

p5pRT commented 16 years ago

From @nwc10

On Sat, Oct 04, 2008 at 03:02:32PM +0100, Dave Mitchell wrote:

On Sat, Oct 04, 2008 at 02:51:09PM +0100, Nicholas Clark wrote:

Are you sure that it's character classes?

./perl -le 'while (1) { qr// }'

merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:

Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.

This doesn't leak:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /$n/;
    }

This does:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /[a]${n}/;
    }

(The non-constant $n is to defeat regex compilation caching).

Hmm, but that one still cleans up after itself:

    $ cat 59516
    my $n = 1;
    for (1..10000) {
        $n = 1 - $n;
        "abc" =~ /[a]${n}/;
    }
    $ PERL_DESTRUCT_LEVEL=1 valgrind ./perl 59516
    ==31497== Memcheck, a memory error detector.
    ==31497== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
    ==31497== Using LibVEX rev 1658, a library for dynamic binary translation.
    ==31497== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
    ==31497== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
    ==31497== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
    ==31497== For more details, rerun with: -v
    ==31497==
    ==31497==
    ==31497== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
    ==31497== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==31497== malloc/free: 100,770 allocs, 100,770 frees, 4,650,887 bytes allocated.
    ==31497== For counts of detected errors, rerun with: -v
    ==31497== All heap blocks were freed -- no leaks are possible.

(the while loop version still munches away though)
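Since valgrind's exit-time report comes back clean, growth has to be observed while the loop is still running. The following is a rough sketch of that, assuming a Linux system where `/proc/self/status` carries a `VmRSS` line; the helper names `rss_kb` and `rss_growth_kb` are invented for this example. On a perl without the leak the reported growth should stay small, but no particular number is implied.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the resident set size in kB, read from /proc/self/status.
# Linux-only assumption; returns undef where the file or field is missing.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Run the leaking pattern from this thread $iterations times and
# return how much the resident set size changed, in kB.
sub rss_growth_kb {
    my ($iterations) = @_;
    my $before = rss_kb();
    my $n = 1;
    for (1 .. $iterations) {
        $n = 1 - $n;              # alternate 0/1 so the pattern is recompiled
        "abc" =~ /[a]${n}/;
    }
    my $after = rss_kb();
    return unless defined $before && defined $after;
    return $after - $before;
}

my $growth = rss_growth_kb(100_000);
print defined $growth
    ? "RSS changed by $growth kB\n"
    : "RSS not available on this platform\n";
```

Running it with increasing iteration counts and comparing the results separates steady growth from one-off allocation noise.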

Nicholas Clark

p5pRT commented 16 years ago

From @mhx

On 2008-10-04, at 14:51:09 +0100, Nicholas Clark wrote:

On Sat, Oct 04, 2008 at 12:55:04PM +0100, Dave Mitchell wrote:

The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:

    while (1) {
        qr/[a]/;
    }

Are you sure that it's character classes?

./perl -le 'while (1) { qr// }'

Fixed with the following change:

Change 34506 by mhx@mhx-r2d2 on 2008/10/18 18:04:40

    Fix memory leak in qr// operator. This was most probably
    introduced with #30849.

Affected files ...

... //depot/perl/pp_hot.c#578 edit

Differences ...

==== //depot/perl/pp_hot.c#578 (text) ====

    @@ -1212,6 +1212,7 @@

         if (pkg) {
             HV* const stash = gv_stashpv(SvPV_nolen(pkg), GV_ADD);
    +        SvREFCNT_dec(pkg);
             (void)sv_bless(rv, stash);
         }


-- The world is moving so fast these days that the man who says it can't be done is generally interrupted by someone doing it.   -- E. Hubbard

p5pRT commented 16 years ago

From @mhx

On 2008-10-04, at 15:02:32 +0100, Dave Mitchell wrote:

On Sat, Oct 04, 2008 at 02:51:09PM +0100, Nicholas Clark wrote:

On Sat, Oct 04, 2008 at 12:55:04PM +0100, Dave Mitchell wrote:

The leak appears to be somewhere in the compilation of character classes. The following code leaks like a sieve on 5.10.0 and blead, but not 5.8.8:

    while (1) {
        qr/[a]/;
    }

Are you sure that it's character classes?

./perl -le 'while (1) { qr// }'

merrily chews through memory like it's going out of fashion. However, all memory is freed at the end of the program:

Hmm, maybe I reduced the original code too much, and threw out the original bug but gained a new one.

This doesn't leak:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /$n/;
    }

This does:

    my $n = 1;
    while (1) {
        $n = 1 - $n;
        "abc" =~ /[a]${n}/;
    }

(The non-constant $n is to defeat regex compilation caching).

Fixed with the following change:

Change 34507 by mhx@mhx-r2d2 on 2008/10/18 18:11:57

    Fix memory leak in // caused by single-char character class
    optimization. This was most probably introduced with #28262.
    This change fixes perl #59516.

Affected files ...

... //depot/perl/regcomp.c#660 edit

Differences ...

==== //depot/perl/regcomp.c#660 (text) ====

    @@ -8350,6 +8350,9 @@
                 *STRING(ret)= (char)value;
                 STR_LEN(ret)= 1;
                 RExC_emit += STR_SZ(1);
    +            if (listsv) {
    +                SvREFCNT_dec(listsv);
    +            }
                 return ret;
             }
             /* optimize case-insensitive simple patterns (e.g. /[a-z]/i) */

-- Langsam's Laws​:   (1) Everything depends.   (2) Nothing is always.   (3) Everything is sometimes.

p5pRT commented 16 years ago

From @mhx

Fixed in bleadperl by change #34507.

p5pRT commented 16 years ago

@mhx - Status changed from 'open' to 'resolved'