BBC: Blead breaks MAUKE/Quote-Code and others

andk commented 2 years ago

Description

Starting with v5.37.3-256-geb54d46f72 tests started to fail for MAUKE/Quote-Code-1.0102.tar.gz

Sample fail report: http://www.cpantesters.org/cpan/report/2afa4b80-39d7-11ed-a607-5f869d66406b

Git bisect for this failure led me to

commit eb54d46f7264ff7af62c409d8a6ab984a5a34f57
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Aug 26 18:26:14 2022 +0200

    Stop parsing on first syntax error.

Steps to Reproduce

cpan -i MAUKE/Quote-Code-1.0102.tar.gz

Expected behavior

Should build, test and install the pkg

Perl configuration

Summary of my perl5 (revision 5 version 37 subversion 4) configuration:
  Commit id: f2582f5b18658f945a763f2edc110cdc7c5220e7
  Platform:
    osname=linux
    osvers=5.4.0-125-generic
    archname=x86_64-linux-thread-multi
    uname='linux k93focal 5.4.0-125-generic #141-ubuntu smp wed aug 10 13:42:03 utc 2022 x86_64 x86_64 x86_64 gnulinux '
    config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/host/k93focal/v5.37.4/d119 -Dmyhostname=k93focal -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Dlibswanted=cl pthread socket inet nsl gdbm dbm malloc dl ld sun m crypt sec util c cposix posix ucb BSD gdbm_compat -Duseithreads -Uuselongdouble -DEBUGGING=both'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2 -g'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='9.4.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.31.so
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.31'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    DEBUGGING
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_TRACK_MEMPOOL
    PERL_USE_DEVEL
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under linux
  Compiled at Sep 21 2022 03:04:18
  %ENV:
    PERL5LIB="/tmp/loop_over_bdir-2945421-pdrHb6/ProjectBuilder-0.14.1-0/blib/arch:/tmp/loop_over_bdir-2945421-pdrHb6/ProjectBuilder-0.14.1-0/blib/lib"
    PERL5OPT=""
    PERL5_CPANPLUS_IS_RUNNING="2945431"
    PERL5_CPAN_IS_RUNNING="2945431"
    PERL_CANARY_STABILITY_NOPROMPT="1"
    PERL_MM_USE_DEFAULT="1"
    PERL_USE_UNSAFE_INC="1"
  @INC:
    /tmp/loop_over_bdir-2945421-pdrHb6/ProjectBuilder-0.14.1-0/blib/arch
    /tmp/loop_over_bdir-2945421-pdrHb6/ProjectBuilder-0.14.1-0/blib/lib
    /home/sand/src/perl/repoperls/installed-perls/host/k93focal/v5.37.4/d119/lib/site_perl/5.37.4/x86_64-linux-thread-multi
    /home/sand/src/perl/repoperls/installed-perls/host/k93focal/v5.37.4/d119/lib/site_perl/5.37.4
    /home/sand/src/perl/repoperls/installed-perls/host/k93focal/v5.37.4/d119/lib/5.37.4/x86_64-linux-thread-multi
    /home/sand/src/perl/repoperls/installed-perls/host/k93focal/v5.37.4/d119/lib/5.37.4
    .

jkeenan commented 2 years ago

> Starting with v5.37.3-256-geb54d46f72 tests started to fail for MAUKE/Quote-Code-1.0102.tar.gz
>
> Sample fail report: http://www.cpantesters.org/cpan/report/2afa4b80-39d7-11ed-a607-5f869d66406b
>
> Git bisect for this failure led me to
> commit eb54d46f7264ff7af62c409d8a6ab984a5a34f57
> Author: Yves Orton <demerphq@gmail.com>
> Date:   Fri Aug 26 18:26:14 2022 +0200
>
>     Stop parsing on first syntax error.

@demerphq, can you take a look? Thanks.

andk commented 2 years ago

Also affected: SCHWIGON/BenchmarkAnything-Schema-0.004.tar.gz Report: http://www.cpantesters.org/cpan/report/aa1399dc-3a39-11ed-89ed-9873ac66406b

andk commented 2 years ago

Also affected: MCHE/Mojolicious-Plugin-RenderCGI-0.102.tar.gz Report: http://www.cpantesters.org/cpan/report/f7a4da9a-3aec-11ed-a602-dbf5be1cab27

andk commented 2 years ago

Also affected: DROLSKY/HTML-Mason-1.59.tar.gz Report: http://www.cpantesters.org/cpan/report/3bbeb3fe-3a11-11ed-bd5b-bfada566406b

jkeenan commented 2 years ago

> Also affected: DROLSKY/HTML-Mason-1.59.tar.gz Report: http://www.cpantesters.org/cpan/report/3bbeb3fe-3a11-11ed-bd5b-bfada566406b

@andk, is this the same problem as previously reported in https://github.com/Perl/perl5/issues/20291 ?

andk commented 2 years ago

@jkeenan : yes, looks like the same

andk commented 2 years ago

Also affected: MAUKE/Quote-Ref-0.03.tar.gz Report: http://www.cpantesters.org/cpan/report/19637c16-3d1a-11ed-aa73-d799bc281ded

andk commented 2 years ago

Also affected: MARCEL/Sub-Documentation-1.100880.tar.gz Report: http://www.cpantesters.org/cpan/report/825d1166-3ceb-11ed-9b23-a03301e40043

tonycoz commented 2 years ago

The Sub::Quote failure seems like a reasonable consequence of the change, before and after the change:

$ ~/perl/v5.36.0-clang14/bin/perl -e 'use warnings FATAL => "all"; use strict; eval "qc]1]"; print ">>$@<<\n"'
>>Unmatched right square bracket at (eval 1) line 1, at end of line
syntax error at (eval 1) line 1, near "qc]"
Number found where operator expected at (eval 1) line 1, near "]1"
        (Missing operator before 1?)
Unmatched right square bracket at (eval 1) line 1, at end of line
<<
tony@venus:.../git/perl5$ ./perl -Ilib -e 'use warnings FATAL => "all"; use strict; eval "qc]1]"; print ">>$@<<\n"'
>>Unmatched right square bracket at (eval 1) line 1, at end of line
syntax error at (eval 1) line 1, near "qc]"
<<

The difference here is that before the change the parser would see the syntax error (the first ]) and queue it up, then go on to generate the "Number found" warning, which is what the Sub::Quote test tests for.

After the change, the syntax error immediately causes the parser to abort, so the Number found warning is never generated.

Looking at the test report it looks like Quote::Ref is encountering the same problem.
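
A test that asserts on a follow-on diagnostic such as "Number found where operator expected" depends on how far the parser gets after the first error. As a minimal sketch (not taken from any of the affected distributions; the broken source string is made up for illustration), a check that asserts only on the first reported error does not depend on whether parsing continues afterwards:

```perl
use strict;
use warnings;

# Deliberately broken source (hypothetical); only the first error is relied upon,
# not any diagnostics the parser might or might not emit after it.
my $source = 'my $x = ; 1';

my $ok  = eval $source;
my $err = $@;

if (!$ok && $err =~ /syntax error/) {
    print "got a syntax error, as expected\n";
}
else {
    print "unexpected result: ", ($ok ? "compiled" : $err), "\n";
}
```

Whether the affected tests should be loosened this way or the old diagnostics restored is a separate question that this sketch does not answer.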

tonycoz commented 2 years ago

> Also affected: SCHWIGON/BenchmarkAnything-Schema-0.004.tar.gz Report: http://www.cpantesters.org/cpan/report/aa1399dc-3a39-11ed-89ed-9873ac66406b

This appears to be unrelated to the other changes here. From the test output it looked like the attempt to compile bin/benchmarkanything-schema was producing a segfault:

#   Failed test 'bin/benchmarkanything-schema compiled ok'
#   at t/00-compile.t line 65.
#          got: '139'
#     expected: '0'

Debugging it, the backtrace is in pp_padsv_store:

$ gdb --args ~/perl/blead/bin/perl -Mblib -c bin/benchmarkanything-schema
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/tony/perl/blead/bin/perl...
(gdb) r
Starting program: /home/tony/perl/blead/bin/perl -Mblib -c bin/benchmarkanything-schema
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
Perl_save_clearsv (my_perl=0x555555a8f2a0, svp=0x555555acbfe0) at scope.c:728
728         SvPADSTALE_off(*svp); /* mark lexical as active */
(gdb) bt
#0  Perl_save_clearsv (my_perl=0x555555a8f2a0, svp=0x555555acbfe0)
    at scope.c:728
#1  0x00005555556d72c9 in Perl_pp_padsv_store (my_perl=0x555555a8f2a0)
    at pp_hot.c:151
#2  0x0000555555695d4d in Perl_runops_debug (my_perl=0x555555a8f2a0)
    at dump.c:2730
#3  0x00005555555d2888 in S_run_body (my_perl=my_perl@entry=0x555555a8f2a0, 
    oldscope=oldscope@entry=1) at perl.c:2776
#4  0x00005555555d2dd2 in perl_run (my_perl=0x555555a8f2a0) at perl.c:2704
#5  0x0000555555599240 in main (argc=<optimized out>, argv=<optimized out>, 
    env=<optimized out>) at perlmain.c:107

though valgrind complains a few lines earlier:

$ valgrind -q ~/perl/blead/bin/perl -Mblib -c bin/benchmarkanything-schema
==2220920== Invalid read of size 8
==2220920==    at 0x28B21E: Perl_pp_padsv_store (pp_hot.c:143)
==2220920==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2220920==    by 0x186887: S_run_body (perl.c:2776)
==2220920==    by 0x186DD1: perl_run (perl.c:2704)
==2220920==    by 0x14D23F: main (perlmain.c:107)
==2220920==  Address 0x4c79c30 is 16 bytes before a block of size 32 free'd
==2220920==    at 0x483AD7B: realloc (vg_replace_malloc.c:834)
==2220920==    by 0x2550AE: Perl_safesysrealloc (util.c:290)
==2220920==    by 0x285297: Perl_av_extend_guts (av.c:165)
==2220920==    by 0x1E08A9: Perl_padnamelist_store (pad.c:2635)
==2220920==    by 0x1E21A1: Perl_pad_alloc (pad.c:763)
==2220920==    by 0x1540A2: Perl_op_relocate_sv (op.c:2767)
==2220920==    by 0x3497FB: S_finalize_op (peep.c:1277)
==2220920==    by 0x34C271: Perl_finalize_optree (peep.c:1177)
==2220920==    by 0x1519A9: S_process_optree (op.c:2747)
==2220920==    by 0x163DA0: Perl_newPROG (op.c:4630)
==2220920==    by 0x1D9A21: Perl_yyparse (perly.y:159)
==2220920==    by 0x185A4A: S_parse_body (perl.c:2596)
==2220920==  Block was alloc'd at
==2220920==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2220920==    by 0x255323: Perl_safesyscalloc (util.c:467)
==2220920==    by 0x1E0568: Perl_newPADNAMELIST (pad.c:2608)
==2220920==    by 0x1E0B2E: Perl_pad_new (pad.c:241)
==2220920==    by 0x18587B: S_parse_body (perl.c:2491)
==2220920==    by 0x1865CB: perl_parse (perl.c:1905)
==2220920==    by 0x14D21E: main (perlmain.c:106)
==2220920== 
==2220920== Invalid read of size 8
==2220920==    at 0x309D6E: Perl_save_clearsv (scope.c:728)
==2220920==    by 0x28B2C8: Perl_pp_padsv_store (pp_hot.c:151)
==2220920==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2220920==    by 0x186887: S_run_body (perl.c:2776)
==2220920==    by 0x186DD1: perl_run (perl.c:2704)
==2220920==    by 0x14D23F: main (perlmain.c:107)
==2220920==  Address 0x4c79c30 is 16 bytes before a block of size 32 free'd
==2220920==    at 0x483AD7B: realloc (vg_replace_malloc.c:834)
==2220920==    by 0x2550AE: Perl_safesysrealloc (util.c:290)
==2220920==    by 0x285297: Perl_av_extend_guts (av.c:165)
==2220920==    by 0x1E08A9: Perl_padnamelist_store (pad.c:2635)
==2220920==    by 0x1E21A1: Perl_pad_alloc (pad.c:763)
==2220920==    by 0x1540A2: Perl_op_relocate_sv (op.c:2767)
==2220920==    by 0x3497FB: S_finalize_op (peep.c:1277)
==2220920==    by 0x34C271: Perl_finalize_optree (peep.c:1177)
==2220920==    by 0x1519A9: S_process_optree (op.c:2747)
==2220920==    by 0x163DA0: Perl_newPROG (op.c:4630)
==2220920==    by 0x1D9A21: Perl_yyparse (perly.y:159)
==2220920==    by 0x185A4A: S_parse_body (perl.c:2596)
==2220920==  Block was alloc'd at
==2220920==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2220920==    by 0x255323: Perl_safesyscalloc (util.c:467)
==2220920==    by 0x1E0568: Perl_newPADNAMELIST (pad.c:2608)
==2220920==    by 0x1E0B2E: Perl_pad_new (pad.c:241)
==2220920==    by 0x18587B: S_parse_body (perl.c:2491)
==2220920==    by 0x1865CB: perl_parse (perl.c:1905)
==2220920==    by 0x14D21E: main (perlmain.c:106)
==2220920== 
==2220920== Invalid read of size 4
==2220920==    at 0x309D71: Perl_SvPADSTALE_off (sv_inline.h:732)
==2220920==    by 0x309D71: Perl_save_clearsv (scope.c:728)
==2220920==    by 0x28B2C8: Perl_pp_padsv_store (pp_hot.c:151)
==2220920==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2220920==    by 0x186887: S_run_body (perl.c:2776)
==2220920==    by 0x186DD1: perl_run (perl.c:2704)
==2220920==    by 0x14D23F: main (perlmain.c:107)
==2220920==  Address 0xc is not stack'd, malloc'd or (recently) free'd
==2220920== 
==2220920== 
==2220920== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==2220920==  Access not within mapped region at address 0xC
==2220920==    at 0x309D71: Perl_SvPADSTALE_off (sv_inline.h:732)
==2220920==    by 0x309D71: Perl_save_clearsv (scope.c:728)
==2220920==    by 0x28B2C8: Perl_pp_padsv_store (pp_hot.c:151)
==2220920==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2220920==    by 0x186887: S_run_body (perl.c:2776)
==2220920==    by 0x186DD1: perl_run (perl.c:2704)
==2220920==    by 0x14D23F: main (perlmain.c:107)
==2220920==  If you believe this happened as a result of a stack
==2220920==  overflow in your program's main thread (unlikely but
==2220920==  possible), you can try to increase the size of the
==2220920==  main thread stack using the --main-stacksize= flag.
==2220920==  The main thread stack size used in this run was 8388608.
Segmentation fault

Was this one bisected down to the syntax error reporting change?

tonycoz commented 2 years ago

> Also affected: MCHE/Mojolicious-Plugin-RenderCGI-0.102.tar.gz Report: http://www.cpantesters.org/cpan/report/f7a4da9a-3aec-11ed-a602-dbf5be1cab27

Without the debugger the template _compile() method appears to be returning an exception object, but the caller in Mojolicious::Plugin::RenderCGI is treating that as success. It then proceeds to call the template _run() method that calls the subref that just failed to compile, and is undef.

I suspect that, since $SIG{__DIE__} is now being called, the syntax error is being converted into an exception object, which is confusing the code.
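
To illustrate the failure mode described above, here is a minimal sketch in plain Perl. It is not the Mojolicious or Mojolicious::Plugin::RenderCGI code: `My::Exception` and `compile_template` are hypothetical stand-ins, and the compile failure is simulated with a plain `die`. The point is only that a compile step which returns either a code ref or an exception object needs its caller to check which one it got:

```perl
use strict;
use warnings;

# Hypothetical exception class, standing in for something like Mojo::Exception.
package My::Exception {
    sub new     { my ($class, $msg) = @_; return bless { message => $msg }, $class }
    sub message { return $_[0]{message} }
}

# Hypothetical compile step: returns a code ref on success, or an exception
# object on failure. It never dies itself.
sub compile_template {
    my ($source) = @_;

    # The __DIE__ hook turns any string error into an exception object.
    local $SIG{__DIE__} = sub { die My::Exception->new($_[0]) };

    my $code = eval {
        die "syntax error (simulated)\n" if $source =~ /BROKEN/;
        sub { "rendered: $source" };
    };
    return $code if $code;
    return ref $@ ? $@ : My::Exception->new($@);
}

for my $source ('hello', 'BROKEN') {
    my $result = compile_template($source);
    if (ref $result eq 'CODE') {
        print $result->(), "\n";
    }
    else {
        # An object is a true value, so "if ($result)" alone would not catch this.
        print "compile failed: ", $result->message;
    }
}
```

A caller that only tests the return value for truth would treat the exception object as success, which matches the symptom reported above.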

Debug output

```
err >>syntax error at (eval 104) line 7, at EOF
<<
SV = PVIV(0x15a12f88) at 0x170d24f0
  REFCNT = 1
  FLAGS = (ROK)
  IV = 0
  RV = 0x17276b50
  SV = PVHV(0x151efaa8) at 0x17276b50
    REFCNT = 2
    FLAGS = (OBJECT,SHAREKEYS)
    STASH = 0x1292c330 "Mojo::Exception"
    ARRAY = 0x172e8258 (0:5, 1:3)
    hash quality = 150.0%
    KEYS = 3
    FILL = 3
    MAX = 7
    Elt "frames" HASH = 0xc853a0d9
    SV = IV(0x172feed0) at 0x172feee0
      REFCNT = 1
      FLAGS = (ROK)
      RV = 0x1293eb60
      SV = PVAV(0x14129ee8) at 0x1293eb60
        REFCNT = 1
        ...
```

Running this under the debugger causes a lot of strangeness (note the panic):

tony@venus:.../build/Mojolicious-Plugin-RenderCGI-0.102-0$ ~/perl/blead/bin/perl -d -Mblib t/01-main.t

Loading DB routines from perl5db.pl version 1.76
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

Test2::API::CODE(0x5653a6a98e20)(/home/tony/perl/blead/lib/5.37.5/Test2/API.pm:72):
72:         INIT { eval 'END { test2_set_is_end() }; 1' or die $@ }
  DB<1> c
panic: freed op 0x5653a893a0d0 called
 at t/01-main.t line 0.
Debugged program terminated.  Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
S<h q>, S<h R> or S<h o> to get additional info.
  DB<1>

Under valgrind:

tony@venus:.../build/Mojolicious-Plugin-RenderCGI-0.102-0$ valgrind -q ~/perl/blead/bin/perl -d -Mblib t/01-main.t

Loading DB routines from perl5db.pl version 1.76
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

==2224148== Source and destination overlap in memcpy_chk(0x1ffeff3b60, 0x1ffeff3b65, 64)
==2224148==    at 0x48408F0: __memcpy_chk (vg_replace_strmem.c:1593)
==2224148==    by 0x4849A3B: memmove (string_fortified.h:40)
==2224148==    by 0x4849A3B: bsd_realpath (Cwd.xs:144)
==2224148==    by 0x484B22F: XS_Cwd_abs_path (Cwd.xs:614)
==2224148==    by 0x29B272: Perl_pp_entersub (pp_hot.c:5457)
==2224148==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2224148==    by 0x17CEDC: Perl_call_sv (perl.c:3117)
==2224148==    by 0x180FB7: Perl_call_list (perl.c:5188)
==2224148==    by 0x16F321: S_process_special_blocks (op.c:10844)
==2224148==    by 0x1724F8: Perl_newATTRSUB_x (op.c:10684)
==2224148==    by 0x174DCB: Perl_utilize (op.c:7617)
==2224148==    by 0x1DA0B9: Perl_yyparse (perly.y:395)
==2224148==    by 0x312907: S_doeval_compile (pp_ctl.c:3729)
==2224148== 
Test2::API::CODE(0x15bae150)(/home/tony/perl/blead/lib/5.37.5/Test2/API.pm:72):
72:         INIT { eval 'END { test2_set_is_end() }; 1' or die $@ }
  DB<1> c
==2224148== Invalid read of size 8
==2224148==    at 0x249D4A: Perl_runops_debug (dump.c:2730)
==2224148==    by 0x186887: S_run_body (perl.c:2776)
==2224148==    by 0x186DD1: perl_run (perl.c:2704)
==2224148==    by 0x14D23F: main (perlmain.c:107)
==2224148==  Address 0x170fff20 is 7,744 bytes inside a block of size 8,232 free'd
==2224148==    at 0x48399AB: free (vg_replace_malloc.c:538)
==2224148==    by 0x151F0D: Perl_opslab_free (op.c:558)
==2224148==    by 0x15214E: Perl_Slab_Free (op.c:510)
==2224148==    by 0x1530E5: Perl_op_free (op.c:970)
==2224148==    by 0x30B59B: Perl_leave_scope (scope.c:1350)
==2224148==    by 0x30FFDB: S_pop_eval_context_maybe_croak (pp_ctl.c:1699)
==2224148==    by 0x32B14B: Perl_pp_leaveeval (pp_ctl.c:4813)
==2224148==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2224148==    by 0x17CEDC: Perl_call_sv (perl.c:3117)
==2224148==    by 0x180FB7: Perl_call_list (perl.c:5188)
==2224148==    by 0x186A3B: S_run_body (perl.c:2760)
==2224148==    by 0x186DD1: perl_run (perl.c:2704)
==2224148==  Block was alloc'd at
==2224148==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2224148==    by 0x14E7D9: S_new_slab (op.c:254)
==2224148==    by 0x151D4B: Perl_Slab_Alloc (op.c:403)
==2224148==    by 0x15374E: Perl_alloc_LOGOP (op.c:1672)
==2224148==    by 0x161A24: S_new_logop (op.c:8520)
==2224148==    by 0x161D05: Perl_newLOGOP (op.c:8284)
==2224148==    by 0x1DB1D2: Perl_yyparse (perly.y:971)
==2224148==    by 0x312907: S_doeval_compile (pp_ctl.c:3729)
==2224148==    by 0x316ED3: S_require_file (pp_ctl.c:4562)
==2224148==    by 0x317EAE: Perl_pp_require (pp_ctl.c:4594)
==2224148==    by 0x249D4C: Perl_runops_debug (dump.c:2730)
==2224148==    by 0x17CEDC: Perl_call_sv (perl.c:3117)
==2224148== 
panic: freed op 0x170fff10 called
 at t/01-main.t line 0.
Debugged program terminated.  Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
S<h q>, S<h R> or S<h o> to get additional info.

tonycoz commented 2 years ago

> Also affected: MARCEL/Sub-Documentation-1.100880.tar.gz Report: http://www.cpantesters.org/cpan/report/825d1166-3ceb-11ed-9b23-a03301e40043

This one appears to be unrelated to this change. The code is failing with "Modification of a read-only value attempted", and curcop points at line 0 of the source file (BenchmarkAnything::Schema has a similar curcop).

In this case the backtrace in C points to pp_padsv_store:

(gdb) b Perl_croak_no_modify 
Breakpoint 1 at 0x14df2a: file util.c, line 2082.
(gdb) r
Starting program: /home/tony/perl/blead/bin/perl -MCarp=verbose -Mblib t/01_attributes.t
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
1..1

Breakpoint 1, Perl_croak_no_modify () at util.c:2082
2082    {
(gdb) bt
#0  Perl_croak_no_modify () at util.c:2082
#1  0x00005555557059f1 in Perl_sv_force_normal_flags (my_perl=0x555555a8f2a0, 
    sv=0x555556404f98, flags=4) at sv.c:5264
#2  0x00005555556fecce in Perl_sv_setsv_flags (my_perl=0x555555a8f2a0, 
    dsv=0x555556404f98, ssv=0x555555a8f3e0, flags=1538) at sv.c:4156
#3  0x00005555556d7282 in Perl_pp_padsv_store (my_perl=0x555555a8f2a0)
    at pp_hot.c:166
#4  0x0000555555695d4d in Perl_runops_debug (my_perl=0x555555a8f2a0)
    at dump.c:2730
#5  0x00005555555d2888 in S_run_body (my_perl=0x555555a8f2a0, oldscope=1)
    at perl.c:2776
#6  0x00005555555d2dd2 in perl_run (my_perl=0x555555a8f2a0) at perl.c:2704
#7  0x0000555555599240 in main (argc=<optimized out>, argv=<optimized out>, 
    env=<optimized out>) at perlmain.c:107
(gdb) p *(my_perl->Icurcop)
$1 = {op_next = 0x0, op_sibparent = 0x0, op_ppaddr = 0x0, op_targ = 0, 
  op_type = 0, op_opt = 0, op_slabbed = 0, op_savefree = 0, op_static = 0, 
  op_folded = 0, op_moresib = 0, op_spare = 0, op_flags = 0 '\000', 
  op_private = 0 '\000', cop_line = 0, cop_stashoff = 63, 
  cop_file = 0x555555aaac80 "t/01_attributes.t", cop_hints = 256, 
  cop_seq = 8963, cop_warnings = 0x0, cop_hints_hash = 0x0, cop_features = 0}
(gdb) p my_perl->Icompiling 
$2 = {op_next = 0x0, op_sibparent = 0x0, op_ppaddr = 0x0, op_targ = 0, 
  op_type = 0, op_opt = 0, op_slabbed = 0, op_savefree = 0, op_static = 0, 
  op_folded = 0, op_moresib = 0, op_spare = 0, op_flags = 0 '\000', 
  op_private = 0 '\000', cop_line = 0, cop_stashoff = 63, 
  cop_file = 0x555555aaac80 "t/01_attributes.t", cop_hints = 256, 
  cop_seq = 8963, cop_warnings = 0x0, cop_hints_hash = 0x0, cop_features = 0}

haarg commented 2 years ago

> The Sub::Quote failure seems like a reasonable consequence of the change, before and after the change:

Do you mean Quote::Code here? I don't see any reported failures with Sub::Quote, and it doesn't have any tests for a 'Number found' error.

demerphq commented 2 years ago

This thread just made me realize we have had a long standing bug in how we report the piled up errors.

Consider this code from Quote-Ref's tests:

perl -e'qwa]1]'
Number found where operator expected at -e line 1, near "]1"
    (Missing operator before 1?)
Unmatched right square bracket at -e line 1, at end of line
syntax error at -e line 1, near "qwa]"
Unmatched right square bracket at -e line 1, at end of line
Execution of -e aborted due to compilation errors.

Why is the "Number found where operator expected" error first? With stop on first syntax error (blead) we see this:

perl -e'qwa]1]'
Unmatched right square bracket at -e line 1, at end of line
syntax error at -e line 1, near "qwa]"
Execution of -e aborted due to compilation errors.

So why did the "Number found where operator expected" error end up on top?

I THOUGHT the answer is that the logic for piling up compile errors uses concatenation, adding each new error at the end. But when something triggers croak directly, and it flushes the error log, it puts the most recent error on top, not on the bottom. Making our confused stream of errors even harder to understand.

But I was wrong. It's because certain compile time warnings are generated in a way that they do not queue up.

jkeenan commented 2 years ago

> This thread just made me realize we have had a long standing bug in how we report the piled up errors. [snip]
>
> I THOUGHT the answer is that the logic for piling up compile errors uses concatenation, adding each new error at the end. But when something triggers croak directly, and it flushes the error log, it puts the most recent error on top, not on the bottom. Making our confused stream of errors even harder to understand.
>
> But I was wrong. Its because certain compile time warnings are generated in a way that they do not queue up.

In the sentence above, did you intend to say "compile time errors"? (I have assumed that everything we're talking about in this ticket is an error rather than a warning. Is that assumption correct?)

richardleach commented 2 years ago

I've noted the BenchmarkAnything::Schema and Sub::Documentation cases and will start looking into them.

tonycoz commented 2 years ago

> > The Sub::Quote failure seems like a reasonable consequence of the change, before and after the change:
>
> Do you mean Quote::Code here? I don't see any reported failures with Sub::Quote, and it doesn't have any tests for a 'Number found' error.

Sorry, yes.

tonycoz commented 2 years ago

> In the sentence above, did you intend to say "compile time errors"? (I have assumed that everything we're talking about in this ticket are errors rather than warnings. Is that assumption correct?)

The Number found where operator expected message is generated as a warning, but in the failing case fatal warnings are enabled, converting the warning to an error.
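
As a minimal sketch of that conversion (using a run-time "uninitialized" warning rather than the parser's "Number found" message, purely because it is easy to trigger reliably), the same statement only warns under ordinary warnings but dies once the category is made fatal:

```perl
use strict;
use warnings;

my $undef;
my @seen;

{
    # Ordinary warnings: the message goes to the __WARN__ hook and execution continues.
    local $SIG{__WARN__} = sub { push @seen, $_[0] };
    my $sum = $undef + 1;
}

# Fatal warnings: the same statement now throws, so the eval fails and $@ is set.
my $ok = eval {
    use warnings FATAL => 'uninitialized';
    my $sum = $undef + 1;
    1;
};
my $err = $@;

print "non-fatal run collected ", scalar(@seen), " warning(s)\n";
print "fatal run died: $err" unless $ok;
```

This is why a test that expects a specific warning text under `use warnings FATAL => "all"` sees it in `$@` rather than on STDERR.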

andk commented 2 years ago

Also affected: CXW/Text-PerlPP-0.600001.tar.gz Report: http://www.cpantesters.org/cpan/report/0bb21e02-422d-11ed-b587-f85c026a3300

tonycoz commented 2 years ago

CXW/Text-PerlPP looks like the others:

   Failed test at t/07-invalid.t line 54.
#                   'syntax error at t/multiline.txt line 12, near "12!"
#   (Might be a runaway multi-line '' string starting on line 10)
# '
#     doesn't match '(?^:Number found.*line 13)'

#   Failed test at t/07-invalid.t line 54.
#                   'syntax error at <script: rerun with -E to see text> line 48, near "12!"
# '
#     doesn't match '(?^:Number found.*line 49)'

richardleach commented 2 years ago

BenchmarkAnything::Schema and Sub::Documentation breakages seem to boil down to something to do with Attribute::Handlers.

This was also the case for Eixo::Base (https://github.com/Perl/perl5/issues/20377). In that issue, breakage was bisected to https://github.com/Perl/perl5/commit/c304acb49dada68ec331d50f8af45f0dda83ba6a and reverting that commit seemed to fix Eixo::Base and Sub::Documentation. (I haven't been able to successfully build BenchmarkAnything::Schema to check any effect there.)

Please could someone confirm, just to check that it wasn't random luck when I was testing?

jkeenan commented 2 years ago

> BenchmarkAnything::Schema and Sub::Documentation breakages seem to boil down to something to do with Attribute::Handlers.
>
> This was also the case for Eixo::Base (#20377). In that issue, breakage was bisected to c304acb and reverting that commit seemed to fix Eixo::Base and Sub::Documentation. (I haven't been able to successfully build BenchmarkAnything::Schema to check any effect there.)
>
> Please could someone confirm, just to check that it wasn't random luck when I was testing?

If what you need confirmation for is BenchmarkAnything::Schema ...

I built perls at eb54d46f7264ff7af62c409d8a6ab984a5a34f57 and at the immediately preceding commit. At the earlier commit, this module installs. At eb54d46, it does not.

richardleach commented 2 years ago

> If what you need confirmation for is BenchmarkAnything::Schema ...
>
> I built perls at eb54d46 and at the immediately preceding commit. At the earlier commit, this module installs. At eb54d46, it does not.

Thanks @jkeenan. I wasn't (and still am not) clear on how that commit would lead to the memory access errors that @tonycoz pasted, but clearly it must be doing so.

tonycoz commented 2 years ago

> breakage was bisected to c304acb and

I tested reverting this commit. At 7e52ee3837b8a55c0f16b1d54a6215442f450deb, t/003_sub_validation.t segfaulted:

Starting program: /home/tony/perl/blead/bin/perl5.37.5 -Mblib -I. t/003_sub_validation.t
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
ok 1 - use Eixo::Base::Util;

Program received signal SIGSEGV, Segmentation fault.
Perl_save_clearsv (svp=0x555555c14a00) at scope.c:728
728         SvPADSTALE_off(*svp); /* mark lexical as active */
(gdb) bt
#0  Perl_save_clearsv (svp=0x555555c14a00) at scope.c:728
#1  0x00005555556c7c23 in Perl_pp_padsv_store () at pp_hot.c:151
#2  0x000055555568a98a in Perl_runops_debug () at dump.c:2730
#3  0x00005555555d1dab in S_run_body (oldscope=oldscope@entry=1) at perl.c:2776
#4  0x00005555555d228a in perl_run (my_perl=<optimized out>) at perl.c:2704
#5  0x000055555559c0d3 in main (argc=<optimized out>, argv=<optimized out>, 
    env=<optimized out>) at perlmain.c:107
(gdb) p *svp
$1 = (SV *) 0x435f5849534f505f
(gdb) x/1sb svp
0x555555c14a00: "_POSIX_CHOWN_RESTRICTED"

I reverted c304acb49d (conflict on t/comp/retainedlines.t) and Eixo::Base passed all tests without a segfault.

demerphq commented 2 years ago

So I can give a summary of what is happening. Although I am definitely not at the bottom of the rabbit hole yet.

First, the implementation of Attribute::Handlers is clouding the issue. When it is modestly restructured, the error moves out of pp_padsv_store() into pp_gv() and we get an assert fail. I consider this an improvement, and it makes the -Dl output easier to understand, as it no longer evals a bunch of strict and warnings flag logic while doing the eval that is at the core of this bug.

diff --git a/dist/Attribute-Handlers/lib/Attribute/Handlers.pm b/dist/Attribute-Handlers/lib/Attribute/Handlers.pm
index 21f657dcb9..58fd9003a1 100644
--- a/dist/Attribute-Handlers/lib/Attribute/Handlers.pm
+++ b/dist/Attribute-Handlers/lib/Attribute/Handlers.pm
@@ -249,12 +249,18 @@ sub _apply_handler_AH_ {
        no warnings;
        if (!$raw && defined($data)) {
            if ($data ne '') {
-               my $evaled = eval("package $pkg; no warnings; no strict;
-                                  local \$SIG{__WARN__}=sub{die}; [$data]");
-               $data = $evaled unless $@;
+                local $SIG{__WARN__} = sub{ die @_ };
+                no warnings;
+                no strict;
+                my $code= "package $pkg; my \$ref= [$data]; \$data= \$ref; 1";
+                print STDERR "Evaling '$code'\n";
+                eval($code) or
+                    print STDERR "Code died: $@";
            }

I have some level of understanding of what is happening: Attribute::Handlers uses an eval to attempt to convert the data inside the parens into a Perl data structure. E.g., if you did

sub concat : Sig(s, ARRAY) { }

then it will attempt to eval (in simplified terms):

[s, ARRAY]

This throws an error (Perl_croak), as perl parses it as an unterminated s/// operation. With c304acb applied, control returns to doeval_compile() after this error, which continues to execute the rest of the AH code. Crucially, however, it seems that PL_restartop is set as a side effect of this behavior, and later on, when we try to execute run_body(), we treat it as a restart after an eval fail even though it is not.

Without c304acb applied (or with it disabled), the Perl_croak does a longjmp back to Perl_call_sv(). I.e., it never returns to doeval_compile(), nor to the existing run loop it was executed via; instead it jumps right out of doeval_compile(), skips all of that function's "failed to compile" handling, and lands right back in Perl_call_sv(). Things then work out OK, but I believe only accidentally.

I am pretty sure that there is something buggy with attributes and attribute handlers, and that c304acb just reveals the problem that was hidden. In other words, I think the bug that c304acb fixed somehow "cancelled out" the bug in the AH implementation, and that the AH code has always been buggy. It would seem we don't have any tests in the perl test suite for the case where the eval does die. E.g., [i,i] does not die: perl parses it as ["i","i"], whereas [s, ARRAY] dies because perl thinks it is an s,,, operator. (If we had tests for this we would have noticed this when working on c304acb itself.)
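
The difference between the two attribute payloads can be reproduced with an ordinary string eval. This is a simplified sketch, not the actual Attribute::Handlers code: `eval_attr_data` is a hypothetical helper that evaluates the data as an anonymous array with strict and warnings disabled, as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate attribute-style data as an anonymous array,
# in simplified form. Returns the array ref (or undef) and the eval error.
sub eval_attr_data {
    my ($data) = @_;
    my $ref = eval "no strict; no warnings; [$data]";
    return ($ref, $@);
}

for my $data ('i, ARRAY', 's, ARRAY') {
    my ($ref, $err) = eval_attr_data($data);
    if ($ref) {
        # Barewords are taken as plain strings when strict is off.
        print "[$data] => (", join(', ', map { "'$_'" } @$ref), ")\n";
    }
    else {
        # "s," starts a substitution whose delimiter is the comma.
        print "[$data] failed: $err";
    }
}
```

In a plain run-time eval like this the failure is simply reported in `$@`; the crashes discussed in this thread involve the same kind of eval failing while attribute handlers are being applied.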

I believe that this is related to the comment in pp_ctl.c line 3355 or so for the "docatch" function:

/*
=for apidoc docatch

Interpose, for the current op and RUNOPS loop,

    - a new JMPENV stack catch frame, and
    - an inner RUNOPS loop to run all the remaining ops following the
      current PL_op.

Then handle any exceptions raised while in that loop.
For a caught eval at this level, re-enter the loop with the specified
restart op (i.e. the op following the OP_LEAVETRY etc); otherwise re-throw
the exception.

I haven't included the full docs, but this comment explains the use of docatch() and how CATCH_GET/CATCH_SET is supposed to be used. I surmise that something in the attribute handlers code does not do this correctly, and it just happens to work out right without c304acb, but with it the issue is exposed.

I am still working through this. Prior to this I had very little knowledge of Attribute::Handlers, and I'm still not sure how all the pieces fit together. What I can see, however, is that PL_restartop stays set (or is incorrectly set) after the Attribute::Handlers inner eval fails, even though control was returned to the existing run loop: we did not jump out of it, since control returned to do_eval_compile(). The difference between try_yyparse() and yyparse() is that try_yyparse() sets up the longjmp frame data so that control returns to try_yyparse() rather than to some unknown place up the stack from where it was invoked.

My guess is that when we get to the bottom of this we are going to find that something has not correctly followed the expectations for CATCH_GET/CATCH_SET in Attribute::Handlers.

BTW, a reduced version of this problem is as follows:

Code that fails with c304acb (note the s, ARRAY) but works in older perls (perhaps accidentally):

$ cat ~/scratch/t_eixo_bad.pl 
use Eixo::Base::Util;
sub concat : Sig(s, ARRAY) { }
warn "Done\n";

Code that works with c304acb and in older perls (note the i, ARRAY):

$ cat ~/scratch/t_eixo_ok.pl 
use Eixo::Base::Util;
sub concat : Sig(i, ARRAY) { }
warn "Done\n";

Reduced module with just a little of debug output to see what is going on:

$ cat ~/scratch/lib/Eixo/Base/Util.pm 
package Eixo::Base::Util;
use strict;
use warnings;

use Attribute::Handlers;
use Carp qw(cluck);
use Data::Dumper;

sub UNIVERSAL::Sig :ATTR(CODE) { 
    #cluck("In 'Sig': ", Dumper(\@_));
}
1;

With this I can reproduce the issue using:

./perl -Ilib -I/home/yorton/scratch/lib ~/scratch/t_eixo_bad.pl

from inside of a perl build directory.

Anyway, I am still looking, and would welcome any help that folks might have. @iabyn? @bram-perl?

demerphq commented 2 years ago

BTW, this ticket is getting really confusing. It would seem that the "stop on first syntax error" patch causes at least two classes of test fail:

1. Tests that hard code perl error messages. Because this patch sequence changes how many messages might be displayed, those tests fail. The answer to this is for the test authors to fix their tests to account for the change, likely by making the test look for a specific message fragment instead of the full set of error messages displayed.
2. Code that uses Attribute::Handlers, and especially tests for that code which use attribute syntax that is not valid perl.
3. Anything else?

It would be helpful if we could separate out the cases and deal with them in separate tickets. Item 1 is not a bug in perl and not something we should care about beyond informing the module author that they are doing non-forward-compatible testing of the perl error messages, and that we make no commitment about what or how many error messages we might emit from a given compilation failure.
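For item 1, a test that matches only a fragment of the first error keeps passing regardless of how many further messages perl emits. The following is a minimal sketch using the core Test::More module; the broken source string is made up for illustration and is not taken from any of the affected distributions.

```perl
use strict;
use warnings;
use Test::More;

# Deliberately broken source with two separate syntax errors.
my $source = "my \$x = ;\nmy \$y = ;\n1;\n";

my $ok  = eval $source;
my $err = $@;

ok(!$ok, 'broken source does not compile');

# Match only a fragment of the first error, so the test does not care
# whether perl goes on to report further errors.
like($err, qr/^syntax error at /, 'first syntax error is reported');

done_testing;
```

Comparing $@ against the complete multi-line error text with is() is the pattern that breaks when the number of reported errors changes.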

Issue 2 however is a real and serious bug which I am working on.

@jkeenan normally you manage these things, it makes sense to me to leave this ticket for the item 1 bugs and create a new ticket for item 2 bugs. What do you think?

jkeenan commented 2 years ago

If you can open a new ticket where the description of the bug is clearly not expressed in "this commit broke CPAN" terms -- i.e., the description is focused on Attribute::Handlers, then I think that would be a good way to proceed.

demerphq commented 2 years ago

I have created https://github.com/Perl/perl5/issues/20396 for the bugs related to Attribute::Handlers. We can leave this one as the ticket for hard coded error messages. @jkeenan, is that good for you?

jkeenan commented 2 years ago

On 10/14/22 08:24, Yves Orton wrote:

I have created #20396 (https://github.com/Perl/perl5/issues/20396) for the bugs related to Attribute::Handlers. We can leave this one as the ticket for hard coded error messages. @jkeenan, is that good for you?

Yes. Thanks.

demerphq commented 2 years ago

The attribute handlers bugs should be fixed by https://github.com/Perl/perl5/pull/20398 which I am waiting to merge.

jkeenan commented 2 years ago

The attribute handlers bugs should be fixed by #20398 which I am waiting to merge.

Of the CPAN distributions mentioned in this BBC ticket, here is what I'm seeing as of Oct 25 / v5.37.5-39-gfd7d660c03 / FreeBSD-12 / threaded build:

PASS
Eixo-Base # separate ticket closed
Module-Extract-VERSION # separate ticket closed
BenchmarkAnything::Schema
Sub::Documentation
Sub::Quote

FAIL
HTML::Mason
Mojolicious::Plugin::RenderCGI
Quote::Code
Quote::Ref
Text::PerlPP

tonycoz commented 2 years ago

I believe all the failures here need patches pushed to CPAN.

Mojolicious::Plugin::RenderCGI is no longer producing errors in valgrind.

demerphq commented 2 years ago

HTML::Mason -> https://github.com/houseabsolute/HTML-Mason/pull/34/files
Mojolicious::Plugin::RenderCGI -> https://github.com/mche/Mojolicious-Plugin-RenderCGI/pull/1
Quote::Code -> https://github.com/mauke/Quote-Code/pull/1
Quote::Ref -> https://github.com/mauke/Quote-Ref/pull/1

That leaves Text::PerlPP; I'll take care of that today.

demerphq commented 2 years ago

Text::PerlPP -> https://github.com/interpreters/perlpp/pull/32

cxw42 commented 2 years ago

Text::PerlPP 0.600.2 released with the fix. Thanks @demerphq !

andk commented 1 year ago

Also affected: COUDOT/Lemonldap-NG-Handler-2.0.15.1.tar.gz Report: http://www.cpantesters.org/cpan/report/30aaabe8-8904-11ed-9c1f-e07132f055e5

rjbs commented 1 year ago

I believe this is resolved.

jkeenan commented 1 year ago

I believe this is resolved.

Have you examined CPANtesters results for Lemonldap-NG-Handler, the last distro reported on by @andk in this ticket? It's still getting failures. (Note that diagnosis is complicated by the fact that this distro has a prerequisite, Lemonldap-NG-Common, whose tests may fail because its prerequisite, XML::LibXML, is not mentioned in its Makefile.PL.)

jkeenan commented 1 year ago

I have filed a ticket for this problem in the upstream issue tracker at https://gitlab.ow2.org/lemonldap-ng/lemonldap-ng/-/issues/2913. If anyone understands the problem at hand, understands using gitlab and has an ow2 account, please feel free to provide a patch.

mauke commented 4 months ago

Lemonldap-NG-Handler looks fixed. Is there anything left to do in this ticket?

jkeenan commented 4 months ago

Lemonldap-NG-Handler looks fixed. Is there anything left to do in this ticket?

(Sigh); it's still showing a lot of red at http://fast-matrix.cpantesters.org/?dist=Lemonldap-NG-Handler%202.0.16. In fact, it's not showing any results past perl-5.38.0. I think that's because its prerequisite, Lemonldap-NG-Common, is also showing intermittent failures. In that distro, there are two tests that require Time::Fake, but Time::Fake is not specified as a prerequisite in the distro's Makefile.PL.

I have opened another ticket upstream with a diff that adds Time::Fake to Makefile.PL: https://gitlab.ow2.org/lemonldap-ng/lemonldap-ng/-/issues/3201
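For readers unfamiliar with how a test-only prerequisite is declared, here is a minimal sketch of a Makefile.PL using ExtUtils::MakeMaker. It is not the diff attached to the upstream ticket: the distribution name and version are hypothetical, and only the TEST_REQUIRES entry reflects the module named above.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;

# Hypothetical distribution metadata, for illustration only.
WriteMakefile(
    NAME          => 'My::Dist',
    VERSION       => '0.01',
    # Modules needed only by the test suite; 0 means any version.
    TEST_REQUIRES => {
        'Time::Fake' => 0,
    },
);
```

With the prerequisite declared, CPAN clients install Time::Fake before running the tests, so smokers no longer fail merely because the module is absent. Older ExtUtils::MakeMaker releases do not recognize TEST_REQUIRES, in which case listing the module under BUILD_REQUIRES or PREREQ_PM is the usual fallback.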

Since I don't think there's anything else P5P really must do at this moment, I'm closing this ticket. Someone can file a new BBC ticket once we get new data.

BBC: Blead breaks MAUKE/Quote-Code and others #20346