Perl / perl5

šŸŖ The Perl programming language
https://dev.perl.org/perl5/
Other
1.93k stars 551 forks source link

compcv refcount (again) #14553

Open p5pRT opened 9 years ago

p5pRT commented 9 years ago

Migrated from rt.perl.org#123963 (status was 'open')

Searchable as RT123963$

p5pRT commented 9 years ago

From @hvds

They're getting harder to find\, or at least to minimize. :)

AFL (\<http​://lcamtuf.coredump.cx/afl/>) finds this​:

% perl -e 'print "m\x{0}\x{1}\x{0}0000@​\x{0}\x{13}\x{ff}000000\x{1}\x{0}"' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

This looks like another compcv refcount issue​:

(gdb) where #0 0x00007ffff72edbb9 in __GI_raise (sig=sig@​entry=6)   at ../nptl/sysdeps/unix/sysv/linux/raise.c​:56 #1 0x00007ffff72f0fc8 in __GI_abort () at abort.c​:89 #2 0x00007ffff72e6a76 in __assert_fail_base (   fmt=0x7ffff7438370 "%s%s%s​:%u​: %s%sAssertion `%s' failed.\n%n"\,   assertion=assertion@​entry=0x7b4980 "((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff"\, file=file@​entry=0x7a1134 "sv.c"\, line=line@​entry=6537\,   function=function@​entry=0x7bdb15 \<__PRETTY_FUNCTION__.17333> "Perl_sv_clear") at assert.c​:92 #3 0x00007ffff72e6b22 in __GI___assert_fail (   assertion=0x7b4980 "((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff"\,   file=0x7a1134 "sv.c"\, line=6537\,   function=0x7bdb15 \<__PRETTY_FUNCTION__.17333> "Perl_sv_clear")   at assert.c​:101 #4 0x00000000005c2c4d in Perl_sv_clear (orig_sv=0xa262c8) at sv.c​:6537 #5 0x00000000005c5af6 in Perl_sv_free2 (sv=0xa262c8\, rc=1) at sv.c​:7031 #6 0x00000000004b343c in S_SvREFCNT_dec (sv=0xa262c8) at inline.h​:166 #7 0x00000000004b82f3 in Perl_yyparse (gramtype=258) at perly.c​:423 #8 0x0000000000408c19 in S_parse_body (env=0x0\, xsinit=0x4509d7 \<xs_init>)   at perl.c​:2277 #9 0x00000000004078b5 in perl_parse (my_perl=0xa24010\,   xsinit=0x4509d7 \<xs_init>\, argc=3\, argv=0x7fffffffe638\, env=0x0)   at perl.c​:1611 #10 0x0000000000450939 in main (argc=3\, argv=0x7fffffffe638\,   env=0x7fffffffe658) at miniperlmain.c​:120 (gdb)

Hugo

p5pRT commented 9 years ago

From @cpansprout

On Mon Mar 02 00​:50​:54 2015\, hv wrote​:

They're getting harder to find\, or at least to minimize. :)

AFL (\<http​://lcamtuf.coredump.cx/afl/>) finds this​:

% perl -e 'print "m\x{0}\x{1}\x{0}0000@​\x{0}\x{13}\x{ff}000000\x{1}\x{0}"' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

$ perl -CS -e 'print "use utf8; m|\@​\x{ff13}|"' | ./miniperl -Ilib -c Assertion failed​: (SvTYPE(sv) != (svtype)SVTYPEMASK)\, function Perl_sv_clear\, file sv.c\, line 6537. Abort trap​: 6

--

Father Chrysostomos

p5pRT commented 9 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 9 years ago

From @hvds

I've restarted AFL with a new build of blead at 923ed5809c\, which is running way cleaner since (I guess) the [perl #123802] issues were nailed.

It has reported only one crash so far\, and I suspect it's another instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40 80 65 a0 a1 0a}' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

.. with a similar backtrace.

Hugo

p5pRT commented 9 years ago

From @cpansprout

On Mon Mar 02 16​:05​:03 2015\, hv wrote​:

I've restarted AFL with a new build of blead at 923ed5809c\, which is running way cleaner since (I guess) the [perl #123802] issues were nailed.

It has reported only one crash so far\, and I suspect it's another instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40 80 65 a0 a1 0a}' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by a non-ASCII digit.

A bisect points to f25ce8440.

f25ce84407dda38dcbb46145067fe57d29d1ef7c is the first bad commit commit f25ce84407dda38dcbb46145067fe57d29d1ef7c Author​: Karl Williamson \public@&#8203;khwilliamson\.com Date​: Mon Jan 6 12​:22​:02 2014 -0700

  isWORDCHAR_uni()\, isDIGIT_utf8() etc no longer go out to disk  
  Previous commits in this series have caused all the POSIX classes to be   completely specified at C compile time. This allows us to revise the   base function used by all these macros to use these definitions\,   avoiding reading them in from disk.

-DT output shows that the lexer emits @​ \, THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART this time)\, though I donā€™t know that there is a way to trigger it that does not depend on this Unicode bug.

--

Father Chrysostomos

p5pRT commented 9 years ago

From @cpansprout

On Mon Mar 02 17​:50​:47 2015\, sprout wrote​:

On Mon Mar 02 16​:05​:03 2015\, hv wrote​:

I've restarted AFL with a new build of blead at 923ed5809c\, which is running way cleaner since (I guess) the [perl #123802] issues were nailed.

It has reported only one crash so far\, and I suspect it's another instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40 80 65 a0 a1 0a}' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by a non-ASCII digit.

A bisect points to f25ce8440.

f25ce84407dda38dcbb46145067fe57d29d1ef7c is the first bad commit commit f25ce84407dda38dcbb46145067fe57d29d1ef7c Author​: Karl Williamson \public@&#8203;khwilliamson\.com Date​: Mon Jan 6 12​:22​:02 2014 -0700

isWORDCHAR_uni()\, isDIGIT_utf8() etc no longer go out to disk

Previous commits in this series have caused all the POSIX classes to be completely specified at C compile time. This allows us to revise the base function used by all these macros to use these definitions\, avoiding reading them in from disk.

-DT output shows that the lexer emits @​ \, THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART this time)\, though I donā€™t know that there is a way to trigger it that does not depend on this Unicode bug.

scan_const sees what looks like an @​ followed by a valid identifier\, so it stops before the @​. Shortly thereafter\, parse_ident rejects the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an identifier. So the lexer is not consistent with itself.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using isIDFIRST_utf8. Did f25ce84407d indirectly change the behaviour of either of those?

Since this is a regression\, I think we ought to address it before 5.22.

--

Father Chrysostomos

p5pRT commented 9 years ago

From @khwilliamson

On 03/02/2015 10​:50 PM\, Father Chrysostomos via RT wrote​:

On Mon Mar 02 17​:50​:47 2015\, sprout wrote​:

On Mon Mar 02 16​:05​:03 2015\, hv wrote​:

I've restarted AFL with a new build of blead at 923ed5809c\, which is running way cleaner since (I guess) the [perl #123802] issues were nailed.

It has reported only one crash so far\, and I suspect it's another instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40 80 65 a0 a1 0a}' | ./miniperl -c miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed. Aborted (core dumped) %

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by a non-ASCII digit.

A bisect points to f25ce8440.

f25ce84407dda38dcbb46145067fe57d29d1ef7c is the first bad commit commit f25ce84407dda38dcbb46145067fe57d29d1ef7c Author​: Karl Williamson \public@&#8203;khwilliamson\.com Date​: Mon Jan 6 12​:22​:02 2014 -0700

isWORDCHAR_uni()\, isDIGIT_utf8() etc no longer go out to disk

Previous commits in this series have caused all the POSIX classes to be completely specified at C compile time. This allows us to revise the base function used by all these macros to use these definitions\, avoiding reading them in from disk.

-DT output shows that the lexer emits @​ \, THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART this time)\, though I donā€™t know that there is a way to trigger it that does not depend on this Unicode bug.

scan_const sees what looks like an @​ followed by a valid identifier\, so it stops before the @​. Shortly thereafter\, parse_ident rejects the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an identifier. So the lexer is not consistent with itself.

I think we should do an audit of the uses of these. In this case\, WORD is inappropriate. My guess is that there are other instances of the same\, which only bite us on unaccustomed Unicode characters.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using isIDFIRST_utf8. Did f25ce84407d indirectly change the behaviour of either of those?

Looking at the commit\, the behavior change was inadvertent

Since this is a regression\, I think we ought to address it before 5.22.

Agreed\, and I see you have already added it to the blockers.

p5pRT commented 9 years ago

From @cpansprout

On Fri Mar 06 10​:38​:31 2015\, public@​khwilliamson.com wrote​:

On 03/02/2015 10​:50 PM\, Father Chrysostomos via RT wrote​:

scan_const sees what looks like an @​ followed by a valid identifier\, so it stops before the @​. Shortly thereafter\, parse_ident rejects the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an identifier. So the lexer is not consistent with itself.

I think we should do an audit of the uses of these. In this case\, WORD is inappropriate. My guess is that there are other instances of the same\, which only bite us on unaccustomed Unicode characters.

I have fixed this particular case in 9d58dbc. I have not done the audit\, so this ticket should stay open\, but I am removing it from the blockersā€™ list.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using isIDFIRST_utf8. Did f25ce84407d indirectly change the behaviour of either of those?

Looking at the commit\, the behavior change was inadvertent

Since this is a regression\, I think we ought to address it before 5.22.

Agreed\, and I see you have already added it to the blockers.

--

Father Chrysostomos