Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

cygwin: as of the 5.39.10 version bump perl fails to fork with mro loaded #22104

Open tonycoz opened 1 month ago

tonycoz commented 1 month ago

Module:

Description

Since the 5.39.10 version bump CI has been failing on Cygwin with errors like:

      0 [main] perl 11614 child_info_fork::abort: address space needed by 'mro.dll' (0x190000) is already occupied
Can't fork, trying again in 5 seconds at t/lib/MakeMaker/Test/Utils.pm line 325.

I've managed to reproduce this locally and consistently with current cygwin, though with a different address:

tony@enceladus ~/dev/perl/git/perl
$ ./perl -Ilib -Mmro -efork
      0 [main] perl 34054 child_info_fork::abort: address space needed by 'mro.dll' (0x400000) is already occupied

This is likely caused by an address conflict between cygperl5_39_10.dll and mro.dll:

$ rebase -i `find . -name '*.dll'` ./perl.exe | grep -F '*'
/home/tony/dev/perl/git/perl/cygperl5_39_10.dll                         base 0x00041db50000 size 0x00b4c000 *
/home/tony/dev/perl/git/perl/lib/auto/mro/mro.dll                       base 0x00041e510000 size 0x0002c000 *

Building without -DDEBUGGING works, probably because the DLL uses less address space and hence there's no conflict:

# without -DDEBUGGING
/home/tony/dev/perl/git/perl/cygperl5_39_10.dll                         base 0x00041db50000 size 0x0081e000
/home/tony/dev/perl/git/perl/lib/auto/mro/mro.dll                       base 0x00041e510000 size 0x00029000
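
The conflict can be checked by arithmetic on the base and size columns of the two rebase listings above. The following is a minimal sketch; the helper name `overlap` is made up for illustration, and the numbers are copied from the listings:

```python
def overlap(base_a: int, size_a: int, base_b: int, size_b: int) -> int:
    """Return the number of bytes shared by two address ranges (0 if disjoint)."""
    start = max(base_a, base_b)
    end = min(base_a + size_a, base_b + size_b)
    return max(0, end - start)

# Base addresses are the same in both builds; only the sizes differ.
PERL_BASE = 0x00041db50000   # cygperl5_39_10.dll
MRO_BASE = 0x00041e510000    # mro.dll

# -DDEBUGGING build: cygperl ends at 0x41e69c000, past the start of mro.dll.
debugging = overlap(PERL_BASE, 0x00b4c000, MRO_BASE, 0x0002c000)

# Build without -DDEBUGGING: cygperl ends at 0x41e36e000, below mro.dll.
non_debugging = overlap(PERL_BASE, 0x0081e000, MRO_BASE, 0x00029000)

print(hex(debugging), hex(non_debugging))
```

With these numbers the whole of mro.dll (0x2c000 bytes) falls inside the range claimed by the -DDEBUGGING cygperl DLL, while the smaller DLL from the other build ends before mro.dll starts.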

I think this was caused by the name change for the cygperl DLL introduced by the version bump.

We use --enable-auto-image-base in the perl and module makefiles to generate the base addresses of DLLs in the perl build on cygwin. This derives each DLL's base address from a hash of the DLL name, resulting in the conflict here.
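
To illustrate why a rename alone can move a DLL, here is a toy model of a name-derived base address. It is not the linker's actual algorithm: `zlib.crc32` is a stand-in hash, and `AUTO_BASE`, `SLOT`, and the function name are assumptions chosen for the sketch. Only the 17-bit figure comes from later in this thread, and `cygperl5_39_9.dll` is just an example of a pre-bump name.

```python
import zlib

AUTO_BASE = 0x400000000   # assumed start of the auto-base region (illustrative)
SLOT = 0x10000            # assumed 64 KiB granularity (illustrative)
HASH_BITS = 17            # number of hash bits quoted later in this thread

def toy_image_base(dll_name: str) -> int:
    """Map a DLL name to a base address using a stand-in hash (not ld's)."""
    h = zlib.crc32(dll_name.encode("ascii")) & ((1 << HASH_BITS) - 1)
    return AUTO_BASE + h * SLOT

# The base depends only on the name, so a version bump moves cygperl
# while mro.dll keeps the address it always had.
for name in ("cygperl5_39_9.dll", "cygperl5_39_10.dll", "mro.dll"):
    print(f"{name:22} {toy_image_base(name):#014x}")
```

Under any scheme of this shape, a module whose name never changes keeps its base forever, while the versioned libperl name lands somewhere new on each bump and can end up next to it.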

Steps to Reproduce

  1. On cygwin, build perl with:

    ./Configure -des -Dusedevel -DDEBUGGING -Doptimize=-g
    make test-prep
  2. fork with mro loaded:

    ./perl -Ilib -Mmro -efork

Expected behavior

fork() is successful.

Perl configuration

# perl -V output goes here
Summary of my perl5 (revision 5 version 39 subversion 10) configuration:
  Commit id: e37d0248e80396cf27564f6205584d2024df704d
  Platform:
    osname=cygwin
    osvers=3.5.1-1.x86_64
    archname=cygwin-thread-multi
    uname='cygwin_nt-10.0-19045 enceladus 3.5.1-1.x86_64 2024-02-27 11:54 utc x86_64 cygwin '
    config_args='-des -Dusedevel -Doptimize=-g -DDEBUGGING'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='gcc'
    ccflags ='-U__STRICT_ANSI__ -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong'
    optimize='-g'
    cppflags='-U__STRICT_ANSI__ -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong'
    ccversion=''
    gccversion='11.4.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='g++'
    ldflags =' -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/lib /usr/lib/w32api /usr/local/lib /lib
    libs=-lpthread -ldl -lcrypt
    perllibs=-lpthread -ldl -lcrypt
    libc=/usr/lib/libcygwin.a
    so=dll
    useshrplib=true
    libperl=cygperl5_39_10.dll
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=dll
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' --shared  -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl):
  Compile-time options:
    DEBUGGING
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_TRACK_MEMPOOL
    PERL_USE_DEVEL
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
  Built under cygwin
  Compiled at Mar 27 2024 15:03:04
  %ENV:
    CYGWIN="detect_bloda"
  @INC:
    lib
    /usr/local/lib/perl5/site_perl/5.39.10/cygwin-thread-multi
    /usr/local/lib/perl5/site_perl/5.39.10
    /usr/local/lib/perl5/5.39.10/cygwin-thread-multi
    /usr/local/lib/perl5/5.39.10
haarg commented 1 month ago

Do we know what kind of hashing is used for the base address, and how likely conflicts are in general? It sounds like if we waited until we bumped the version again, the precise failure we're seeing here would disappear. But if the hashing is meant to prevent this issue, it seems odd that we're seeing it now. Did we just get incredibly unlucky, or has something else changed that would make this type of failure more likely?

tonycoz commented 1 month ago

The hash doesn't look like anything complex.

I think we just ran into a hash collision. With 64 bits it's unlikely to happen, but it has here. I expect it's even worse for 32-bit cygwin, but we don't test that, and cygwin no longer supports it.

The base calculation uses 17 bits of the hash (vs 10 for 32-bit), so I think there's about a 1 in 89 chance of a simple direct conflict.
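
For a sense of scale, a rough per-pair estimate can be sketched as follows. It assumes bases are spread uniformly over 2^17 slots and that slots are 64 KiB apart; both are assumptions, though the bases reported in this issue are 64 KiB aligned. It covers only the cygperl/mro pair from the -DDEBUGGING build, so it is not a reconstruction of the 1 in 89 figure, which may account for more than one module.

```python
SLOT = 0x10000        # assumed 64 KiB spacing between candidate bases
SLOTS = 1 << 17       # 17 hash bits, as stated above

def slots_needed(size: int) -> int:
    """Number of 64 KiB slots an image of the given size touches."""
    return -(-size // SLOT)   # ceiling division on integers

def pair_collision_probability(size_big: int, size_small: int) -> float:
    """Chance that the small image overlaps the big one under the uniform model.

    The small image overlaps if it starts inside the big image, or up to
    (its own slot count - 1) slots before the big image's base.
    """
    bad_starts = slots_needed(size_big) + slots_needed(size_small) - 1
    return bad_starts / SLOTS

# Sizes from the -DDEBUGGING rebase listing in the issue description.
p = pair_collision_probability(0x00b4c000, 0x0002c000)
print(f"about 1 in {1 / p:.0f} for this one pair")
```

Every additional extension DLL gets its own draw against the same large libperl image, so the chance that at least one of them collides is higher than this single-pair number.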

I'm working on a workaround fix for the CI failure (adding -Dstatic_ext) and a note for perldelta.