Closed andk closed 2 years ago
Many thanks for the analysis - will look at it asap.
I checked the upstream (Marpa::R2 10.0.0) for 5.36.0 GNU/Linux, and it looks clean.
I do not suspect libmarpa, but thanks for indicating that this layer of MarpaX::ESLIF looks clean ;)
For the record, all these fails happened with libc6 2.34 on debian sid. There's a chance that the whole problem stems from this version of libc.
@andk indeed I am on debian sid as well, with this version of libc6:
ii libc6:i386 2.34-8 i386 GNU C Library: Shared libraries
Note this is an x86 OS, not x86_64.
I have perlbrewed the latest perl with this configuration:
Summary of my perl5 (revision 5 version 36 subversion 0) configuration:
Platform:
osname=linux
osvers=5.19.0-1-686-pae
archname=i686-linux-thread-multi
uname='linux jddwww 5.19.0-1-686-pae #1 smp preempt_dynamic debian 5.19.6-1 (2022-09-01) i686 gnulinux '
config_args='-de -Dprefix=/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble -Dusethreads -Doptimize=-g -DDEBUGGING -Dusemultiplicity=define -Duselongdouble=undef -Aeval:scriptdir=/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/bin'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=undef
use64bitall=undef
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='cc'
ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-g'
cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='12.2.0'
gccosandvers=''
intsize=4
longsize=4
ptrsize=4
doublesize=8
byteorder=1234
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=12
longdblkind=3
ivtype='long'
ivsize=4
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=4
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib /lib64 /usr/lib64
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=/lib/i386-linux-gnu/libc.so.6
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.34'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -g -L/usr/local/lib -fstack-protector-strong'
Characteristics of this binary (from libperl):
Compile-time options:
DEBUGGING
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_TRACK_MEMPOOL
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
USE_THREAD_SAFE_LOCALE
Built under linux
Compiled at Sep 20 2022 07:26:12
%ENV:
PERLBREW_HOME="/home/jdurand/.perlbrew"
PERLBREW_MANPATH="/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/man"
PERLBREW_PATH="/home/jdurand/perl5/perlbrew/bin:/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/bin"
PERLBREW_PERL="perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble"
PERLBREW_ROOT="/home/jdurand/perl5/perlbrew"
PERLBREW_SHELLRC_VERSION="0.96"
PERLBREW_VERSION="0.96"
@INC:
/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/lib/site_perl/5.36.0/i686-linux-thread-multi
/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/lib/site_perl/5.36.0
/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/lib/5.36.0/i686-linux-thread-multi
/home/jdurand/perl5/perlbrew/perls/perl-5.36.0-thread-debug-DEBUGGING-usemultiplicity-nouselongdouble/lib/5.36.0
and... it installs ok.
Now I am not on x86_64. Would you mind doing the following? In a cpan shell:
look MarpaX::ESLIF
then open Makefile.PL and uncomment the line:
# goto no_tweak_on_optimization_flags;
This turns off the -O3 compilation flag for the c-marpaESLIF library, falling back to perl's default, which is -O2 -g. Then:
perl Makefile.PL && make test
If it does not crash anymore, this could indicate an -O3 optimization bug on this platform.
If it still crashes, I would be glad if you could run valgrind on any of the tests, either via Test::Valgrind or directly, though I prefer a direct valgrind run :) e.g.
valgrind perl -I blib/lib -I blib/arch t/import_export.t
Please note that lines like:
==26726== Conditional jump or move depends on uninitialised value(s)
==26726== at 0x8B6A5B2: ???
are unfortunately normal. These come from PCRE2's JIT, and I do not compile it with valgrind support.
Many thanks for your help.
It still crashed for me with -O2. I paste here the output of valgrind with all the "Conditional jump or move..." messages removed; let me know if you need anything else.
>sand@k93msid:/tmp/loop_over_bdir-bQg7EB/MarpaX-ESLIF-6.0.26-1% valgrind /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.37.3/29??/bin/perl -I blib/lib -I blib/arch t/import_export.t
==704342== Memcheck, a memory error detector
==704342== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==704342== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==704342== Command: /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.37.3/29fb/bin/perl -I blib/lib -I blib/arch t/import_export.t
==704342==
1..31
ok 1 - require MarpaX::ESLIF;
[...]
==704342== Process terminating with default action of signal 11 (SIGSEGV)
==704342== General Protection Fault
==704342== at 0x8F99175: marpaESLIFPerl_recognizerContextInitv (ESLIF.xs:1641)
==704342== by 0x8F99E2E: XS_MarpaX__ESLIF__Recognizer_allocate (ESLIF.xs:4856)
==704342== by 0x285063: Perl_pp_entersub (pp_hot.c:5402)
==704342== by 0x239879: Perl_runops_debug (dump.c:2677)
==704342== by 0x185A5E: S_run_body (perl.c:2775)
==704342== by 0x185A5E: perl_run (perl.c:2703)
==704342== by 0x14D481: main (perlmain.c:107)
==704342==
==704342== HEAP SUMMARY:
==704342== in use at exit: 138,942,402 bytes in 140,883 blocks
==704342== total heap usage: 1,620,470 allocs, 1,479,587 frees, 1,389,699,927 bytes allocated
==704342==
==704342== LEAK SUMMARY:
==704342== definitely lost: 0 bytes in 0 blocks
==704342== indirectly lost: 0 bytes in 0 blocks
==704342== possibly lost: 28,625,734 bytes in 41,177 blocks
==704342== still reachable: 110,316,668 bytes in 99,706 blocks
==704342== of which reachable via heuristic:
==704342== newarray : 136,688 bytes in 4,193 blocks
==704342== suppressed: 0 bytes in 0 blocks
==704342== Rerun with --leak-check=full to see details of leaked memory
==704342==
==704342== Use --track-origins=yes to see where uninitialised values come from
==704342== For lists of detected and suppressed errors, rerun with: -s
==704342== ERROR SUMMARY: 870 errors from 862 contexts (suppressed: 0 from 0)
zsh: segmentation fault valgrind -I blib/lib -I blib/arch t/import_export.t
@andk thanks for this additional information. I admit I am puzzled. The only thing that comes to my mind is to switch to my 64-bit box, virtualize a 64-bit debian sid, and redo the exercise. Will keep you informed.
Just for the record, the whole lib/ and ESLIF.xs are the same between versions 6.0.25 and 6.0.26, so the cause is elsewhere. The only thing that changed a lot is the Lua bindings, but at the stage of your crash, these should not have been involved IMHO. To be confirmed.
Quite apparently it started with libc 2.34. I just tried 6.0.25 and got a fail on the same configuration.
FYI I reproduced the crash at exactly the same place with the latest debian amd64 and a perlbrewed latest perl with -D DEBUGGING=both -D usemultiplicity=define -D uselongdouble=undef; this is libc-2.35-1.
A strange backtrace, nothing really helpful.
Will try to understand (I am thinking of a stack size issue, but ahem, let's say I hope it is that because I have no other idea at the moment :)).
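For anyone who wants to check the stack size hypothesis on their own box, here is a minimal sketch that reads the soft stack limit of the current process with getrlimit(). The function name demo_stack_soft_limit is made up for illustration and is not part of MarpaX::ESLIF or perl. Whether this limit is the one that matters for the crash is an assumption: it governs the main thread's stack, not the stacks of threads created later.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not from MarpaX::ESLIF): stores the soft
   RLIMIT_STACK value in *soft. Returns 0 on success, -1 on failure.
   A value of RLIM_INFINITY means "unlimited". */
static int demo_stack_soft_limit(rlim_t *soft)
{
    struct rlimit rl;

    assert(soft != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        return -1;
    }
    *soft = rl.rlim_cur;
    return 0;
}

/* Prints the limit in a human-readable form. */
static void demo_print_stack_limit(void)
{
    rlim_t soft;

    if (demo_stack_soft_limit(&soft) != 0) {
        perror("getrlimit");
    } else if (soft == RLIM_INFINITY) {
        printf("soft stack limit: unlimited\n");
    } else {
        printf("soft stack limit: %llu bytes\n", (unsigned long long)soft);
    }
}
```

Calling demo_print_stack_limit() from a small main() shows the same value that `ulimit -s` reports in the shell, except in bytes rather than kilobytes.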
And of course, adding -ggdb -fsanitize=address -fno-omit-frame-pointer to ESLIF.c compilation, then running perl with LD_PRELOAD=$(gcc -print-file-name=libasan.so), guess what, it does not crash anymore grrr.
I confirm this is a stack size issue. Will be fixed in the next release.
Was the Lua stack the problem?
I was referring to the C frame stack size. I sometimes declare things on the stack and pass their pointers around, in order to avoid a malloc call.
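To illustrate the trade-off described here, below is a rough sketch of the two patterns. It is not the actual ESLIF.xs code: demo_context_t, its 64 KiB buffer and all function names are hypothetical, chosen only to make the frame visibly large.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context type, NOT the real marpaESLIF structure.
   The large buffer stands in for whatever makes the object big. */
typedef struct demo_context {
    char   scratch[64 * 1024];
    size_t used;
} demo_context_t;

/* Initializes caller-provided storage, wherever it lives. */
static void demo_context_init(demo_context_t *ctx)
{
    assert(ctx != NULL);
    memset(ctx->scratch, 0, sizeof(ctx->scratch));
    ctx->used = 0;
}

/* Pattern 1: declare the object in the current frame and pass its
   pointer down. No malloc call, but the whole object counts against
   the stack of the calling thread. */
static size_t demo_use_stack(void)
{
    demo_context_t ctx; /* about 64 KiB of this frame */

    demo_context_init(&ctx);
    ctx.used = 42;
    return ctx.used;
}

/* Pattern 2: allocate on the heap. One malloc/free pair per call,
   but the frame itself stays small. Returns 0 if malloc fails. */
static size_t demo_use_heap(void)
{
    demo_context_t *ctx = malloc(sizeof(*ctx));
    size_t used;

    if (ctx == NULL) {
        return 0;
    }
    demo_context_init(ctx);
    ctx->used = 42;
    used = ctx->used;
    free(ctx);
    return used;
}
```

Both functions return the same result. The difference is only where the 64 KiB live, which starts to matter when such frames nest deeply or when the available stack is small.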
Sample fail report: http://www.cpantesters.org/cpan/report/7605e376-36be-11ed-993c-b921912ee776
These SEGVs are reproducible and happen with many perl versions but only with a few configurations. -DDEBUGGING, usemultiplicity=define and uselongdouble=undef seem to be the required configuration options.
Sample production of a core file:
The stacktrace: