Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

5.40.0 onwards won't build Darwin fat binaries #22466

Open gsteemso opened 3 months ago

gsteemso commented 3 months ago

Description

All Perls which I have built thus far will correctly build as NeXT-style fat (multi-architecture) binaries when the directions in README.macosx are followed – except the newest ones, beginning with version 5.40.0, which fail messily at the make step (apparently, by applying the variable-size expectations appropriate to 64-bit sub-builds to the corresponding variables in 32-bit sub-builds). It generates thousands of lines of compiler warnings that a bit shift has exceeded the width of the data type, culminating in a fatal error involving a negative bit-field width.

Steps to Reproduce

Follow the instructions in README.macosx. Run ./Configure, then make. You can’t run make test and make install because it errors out during make.

The ./Configure line I used was:

Prefix=/Users/gsteemso/devel/perl/built/5.40.0u
Sdk=/Developer/SDKs/MacOSX10.5.sdk
./Configure -des -Dprefix=${Prefix} -Uvendorprefix= -Dprivlib=${Prefix}/lib -Darchlib=${Prefix}/lib -Dman1dir=${Prefix}/share/man/man1 -Dman3dir=${Prefix}/share/man/man3 -Dman3ext=3pl -Doptimize=-Os -Dsitearch=${Prefix}/lib/site_perl -Dsitelib=${Prefix}/lib/site_perl -Dperladmin=none -Dstartperl='#!/usr/local/opt/perl/bin/perl' -Duseshrplib -Duselargefiles -Dusenm -Dusethreads -Accflags="-DNO_MATHOMS -mcpu=970 -arch ppc -arch ppc64 -nostdinc -B${Sdk}/usr/include/gcc -B${Sdk}/usr/lib/gcc -isystem${Sdk}/usr/include -F${Sdk}/System/Library/Frameworks" -Aldflags="-arch ppc -arch ppc64 -Wl,-syslibroot,${Sdk}"

Expected behavior

Perl should be constructed and installed as per usual, with all compiled code built as fat binaries in the manner normal for Macs.

Perl configuration

The configuration cannot be extracted because Perl never finishes building, but the corresponding one for a pure ppc64 build looks like this:

Summary of my perl5 (revision 5 version 40 subversion 0) configuration:

 Platform:
   osname=darwin
   osvers=9.8.0
   archname=darwin-thread-multi-2level
   uname='darwin nosferalto.local 9.8.0 darwin kernel version 9.8.0:
wed jul 15 16:57:01 pdt 2009; root:xnu-1228.15.4~1release_ppc power
macintosh '
   config_args='-des -Dprefix=/Users/gsteemso/devel/perl/built/5.40.0
-Uvendorprefix= -Dprivlib=/Users/gsteemso/devel/perl/built/5.40.0/lib
-Darchlib=/Users/gsteemso/devel/perl/built/5.40.0/lib
-Dman1dir=/Users/gsteemso/devel/perl/built/5.40.0/share/man/man1
-Dman3dir=/Users/gsteemso/devel/perl/built/5.40.0/share/man/man3
-Dman3ext=3pl -Doptimize=-Os
-Dsitearch=/Users/gsteemso/devel/perl/built/5.40.0/lib/site_perl
-Dsitelib=/Users/gsteemso/devel/perl/built/5.40.0/lib/site_perl
-Dperladmin=none -Dstartperl=#!/usr/local/opt/perl/bin/perl
-Duseshrplib -Duselargefiles -Dusenm -Dusethreads -Duse64bitall
-Accflags=-DNO_MATHOMS -mcpu=970 -arch ppc64 -nostdinc
-B/Developer/SDKs/MacOSX10.5.sdk/usr/include/gcc
-B/Developer/SDKs/MacOSX10.5.sdk/usr/lib/gcc
-isystem/Developer/SDKs/MacOSX10.5.sdk/usr/include
-F/Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks
-Aldflags=-arch ppc64 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk
-Alddlflags=-mmacosx-version-min=10.5 -arch ppc64 -bundle -undefined
dynamic_lookup -L/usr/local/lib -fstack-protector'
   hint=recommended
   useposix=true
   d_sigaction=define
   useithreads=define
   usemultiplicity=define
   use64bitint=define
   use64bitall=define
   uselongdouble=undef
   usemymalloc=n
   default_inc_excludes_dot=define
 Compiler:
   cc='cc'
   ccflags ='-std=gnu99 -fno-common -DPERL_DARWIN
-mmacosx-version-min=10.5 -DNO_THREAD_SAFE_QUERYLOCALE
-DNO_POSIX_2008_LOCALE -arch ppc64 -DNO_MATHOMS -mcpu=970 -arch ppc64
-nostdinc -B/Developer/SDKs/MacOSX10.5.sdk/usr/include/gcc
-B/Developer/SDKs/MacOSX10.5.sdk/usr/lib/gcc
-isystem/Developer/SDKs/MacOSX10.5.sdk/usr/include
-F/Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_FORTIFY_SOURCE=2'
   optimize='-Os'
   cppflags='-arch ppc64 -std=gnu99 -fno-common -DPERL_DARWIN
-mmacosx-version-min=10.5 -DNO_THREAD_SAFE_QUERYLOCALE
-DNO_POSIX_2008_LOCALE -arch ppc64 -DNO_MATHOMS -mcpu=970 -arch ppc64
-nostdinc -B/Developer/SDKs/MacOSX10.5.sdk/usr/include/gcc
-B/Developer/SDKs/MacOSX10.5.sdk/usr/lib/gcc
-isystem/Developer/SDKs/MacOSX10.5.sdk/usr/include
-F/Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
   ccversion=''
   gccversion='4.2.1 (Apple Inc. build 5666) (dot 3)'
   gccosandvers=''
   intsize=4
   longsize=8
   ptrsize=8
   doublesize=8
   byteorder=87654321
   doublekind=4
   d_longlong=define
   longlongsize=8
   d_longdbl=define
   longdblsize=16
   longdblkind=6
   ivtype='long'
   ivsize=8
   nvtype='double'
   nvsize=8
   Off_t='off_t'
   lseeksize=8
   alignbytes=8
   prototype=define
 Linker and Libraries:
   ld='cc -arch ppc64'
   ldflags =' -mmacosx-version-min=10.5 -arch ppc64 -arch ppc64
-Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -fstack-protector
-L/usr/local/lib'
   libpth=/usr/local/lib /usr/lib
   libs=-lpthread -ldbm -ldl -lm -lutil -lc
   perllibs=-lpthread -ldl -lm -lutil -lc
   libc=/usr/lib/libc.dylib
   so=dylib
   useshrplib=true
   libperl=libperl.dylib
   gnulibc_version=''
 Dynamic Linking:
   dlsrc=dl_dlopen.xs
   dlext=bundle
   d_dlsymun=undef
   ccdlflags=' '
   cccdlflags=' '
   lddlflags=' -mmacosx-version-min=10.5 -bundle -undefined
dynamic_lookup -mmacosx-version-min=10.5 -arch ppc64 -bundle
-undefined dynamic_lookup -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
 Compile-time options:
   HAS_LONG_DOUBLE
   HAS_STRTOLD
   HAS_TIMES
   MULTIPLICITY
   NO_MATHOMS
   PERLIO_LAYERS
   PERL_COPY_ON_WRITE
   PERL_DONT_CREATE_GVSV
   PERL_HASH_FUNC_SIPHASH13
   PERL_HASH_USE_SBOX32
   PERL_MALLOC_WRAP
   PERL_OP_PARENT
   PERL_PRESERVE_IVUV
   PERL_USE_SAFE_PUTENV
   USE_64_BIT_ALL
   USE_64_BIT_INT
   USE_ITHREADS
   USE_LARGE_FILES
   USE_LOCALE
   USE_LOCALE_COLLATE
   USE_LOCALE_CTYPE
   USE_LOCALE_NUMERIC
   USE_LOCALE_TIME
   USE_PERLIO
   USE_PERL_ATOF
   USE_REENTRANT_API
 Built under darwin
 Compiled at Aug  4 2024 15:00:48
 @INC:
   /Users/gsteemso/devel/perl/built/5.40.0/lib/site_perl
   /Users/gsteemso/devel/perl/built/5.40.0/lib

(It should be noted that, in a pure ppc64 build done without the aid of a package manager, -Alddlflags=xxxxx must also be set by hand on the ./Configure command line, because extensions' shared-library makefiles fail to propagate the compiler flags that tell it which variant of the CPU architecture to target. Without that change, the individual .o files are still built correctly, but their coalescence into library .bundles is botched.)

tonycoz commented 3 months ago

Could you please attach a build log and the generated config.sh?

gsteemso commented 3 months ago

Please see these attachments. There should be 5. I included config.sh (I had to rename it for Github) and the stdout and stderr captures for each of Configure and make. config.sh.txt Configure.stdout.log Configure.stderr.log make.stdout.log make.stderr.log

gsteemso commented 3 months ago

I should add that something went a bit odd a couple of days ago, such that a lot of GCC's usual stderr output has stopped appearing. I'm still trying to find anything that changed.

jkeenan commented 3 months ago

> Please see these attachments. There should be 5. I included config.sh (I had to rename it for Github) and the stdout and stderr captures for each of Configure and make. config.sh.txt Configure.stdout.log Configure.stderr.log make.stdout.log make.stderr.log

I have no particular expertise in this area. However, it occurs to me that since you are getting a segfault as early in the process as ./Configure, you could begin by getting a tarball of perl-5.38, configuring with the same arguments as previously, and seeing whether ./Configure completes successfully and segfault-free. That would open up the possibility of bisection.

gsteemso commented 3 months ago

I believe the segfault during ./Configure is probably an expected failure resulting from an unsuccessful test, because it does not seem to bother it any. The build process continues unimpeded until the big halt in what ought to be the middle of that 'make' run.

I can already tell you that 5.38.x builds successfully with the same parameters, as do all earlier versions that I tried. That 5.40.0 et seq do not build successfully when all others did before is the entire problem here.

jkeenan commented 3 months ago

> I can already tell you that 5.38.x builds successfully with the same parameters, as do all earlier versions that I tried. That 5.40.0 et seq do not build successfully when all others did before is the entire problem here.

So in principle this is bisectable, with (roughly) these steps:

tonycoz commented 3 months ago

Please try adding -Duse64bitint to the Configure command-line to ensure both 32-bit and 64-bit builds are using the same sized UV and IV types, I suspect they're different here causing the static assertion to fail.

I'm able to compile a -arch x86_64 -arch arm64 build, but those are both 64-bit builds, so there's no type size mismatches.
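
For readers unfamiliar with how a static assertion can end in a "negative bit-field width" error, here is a minimal sketch of the common pre-C11 idiom. The macro name SKETCH_STATIC_ASSERT and the helper function are hypothetical and are not Perl's own definitions; the assumption is that a compiler without _Static_assert (such as GCC 4.2.1 in gnu99 mode) ends up with a fallback of roughly this shape.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro, not Perl's own. When COND is false the bit-field
 * width becomes -1, which the compiler must reject; GCC words this as a
 * negative width in a bit-field. When COND is true the typedef is a
 * harmless, unused one-bit struct. */
#define SKETCH_STATIC_ASSERT(COND, NAME) \
    typedef struct { unsigned int NAME : (COND) ? 1 : -1; } sketch_assert_##NAME

/* Stand-in for a width that a configuration step would normally record.
 * Here it is derived from the current target, so the check below passes. */
#define SKETCH_CONFIGURED_LONG_BITS ((int)(sizeof(long) * CHAR_BIT))

SKETCH_STATIC_ASSERT(SKETCH_CONFIGURED_LONG_BITS == (int)(sizeof(long) * CHAR_BIT),
                     long_width_matches);

/* If uncommented, this would compile on an LP64 target but be rejected
 * on a 32-bit one, once per failing -arch pass:
 * SKETCH_STATIC_ASSERT(sizeof(long) == 8, long_is_64_bit);
 */

/* Width of long, in bits, for the target actually being compiled. */
int sketch_long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}
```

Because the condition is evaluated separately in each per-architecture compilation, a check built from shared, recorded sizes can pass for one slice and fail for another.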

gsteemso commented 3 months ago

Well, I have a few things to report.

• Adding -Duse64bitint did not help. I can’t imagine why not – as was pointed out, it ought to make things the same size internally. (Of course, even if it had worked, the resulting executable would not be transportable to lesser Macs – defeating a large part of the purpose of building a fat binary in the first place. It’s still bizarre that it didn’t work, of course.)

• The thousands of lines of warnings about a bit shift exceeding the width of the type actually, it turns out, also occur on a successful build; so I believe the hypothesis about it being due to the size difference between compiler runs is likely correct. I have set up and am currently running the suggested bisection to figure out what change made it start having actual build problems with the perceived mismatch.

This is a fast machine for its age but that age is 20 years. I will almost certainly not get answers from the bisection before tomorrow (Friday), and quite possibly not until Saturday.

gsteemso commented 3 months ago

Apparently I spoke too soon. That bisection program is genius! In 2602 seconds, it determined that the first commit to cause a failure was 1e3b3238f23137440041d8883e041e4da74876f5, dated March 13th 2024.

The command line I fed to bisect was:

../othergit/Porting/bisect.pl --test-build --target=miniperl --start=v5.38.0 --end=v5.40.0 -Dprefix=../built -Uvendorprefix= -Dperladmin=none -Duseshrplib -Duselargefiles -Dusenm --Dusethreads -Accflags='-DNO_MATHOMS -arch ppc -arch ppc64 -nostdinc -B/Developer/SDKs/MacOSX10.5.sdk/usr/include/gcc -B/Developer/SDKs/MacOSX10.5.sdk/usr/lib/gcc -isystem/Developer/SDKs/MacOSX10.5.sdk/usr/include -F/Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks' -Aldflags='-arch ppc -arch ppc64 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk'

I hope this all means something useful to someone.

mauke commented 3 months ago

I don't see how -Duse64bitint could help. If you run Configure on a 64-bit platform, it will just see that sizeof (long) == 8 and use that, hardcoding #define IVTYPE long in config.h. Plus we have INTSIZE, LONGSIZE, SHORTSIZE all hardcoded/configured in config.h.

As far as I can tell, building for different architectures requires different configs.
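
The concern about hardcoded sizes can be pictured with a small sketch. SKETCH_LONGSIZE is a hypothetical stand-in for a value such as LONGSIZE recorded in config.h; the assumption is that it was written once, by a Configure run that saw a 64-bit long, and is then read unchanged by every per-architecture compilation.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical recorded value, as a single configuration run on a
 * 64-bit (LP64) slice might have written it. */
#define SKETCH_LONGSIZE 8

/* Bits that code trusting the recorded value would assume long has. */
int sketch_assumed_long_bits(void)
{
    return SKETCH_LONGSIZE * 8;
}

/* Bits that long really has on the target being compiled right now. */
int sketch_actual_long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* 1 when the recorded size is right for this target, 0 otherwise.
 * On a 32-bit slice this would return 0. */
int sketch_longsize_matches(void)
{
    return sketch_assumed_long_bits() == sketch_actual_long_bits();
}
```

On a slice where the two disagree, shift counts computed from the recorded size can exceed the real width of the type, which would be consistent with the shift-width warnings reported earlier in the thread.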

jkeenan commented 3 months ago

> Apparently I spoke too soon. That bisection program is genius! In 2602 seconds, it determined that the first commit to cause a failure was 1e3b323, dated March 13th 2024.
>
> The command line I fed to bisect was:
>
> ../othergit/Porting/bisect.pl --test-build --target=miniperl --start=v5.38.0 --end=v5.40.0 -Dprefix=../built -Uvendorprefix= -Dperladmin=none -Duseshrplib -Duselargefiles -Dusenm --Dusethreads -Accflags='-DNO_MATHOMS -arch ppc -arch ppc64 -nostdinc -B/Developer/SDKs/MacOSX10.5.sdk/usr/include/gcc -B/Developer/SDKs/MacOSX10.5.sdk/usr/lib/gcc -isystem/Developer/SDKs/MacOSX10.5.sdk/usr/include -F/Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks' -Aldflags='-arch ppc -arch ppc64 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk'
>
> I hope this all means something useful to someone.

A number of points ...

gsteemso commented 2 months ago

mauke, the Mac-specific aspects of this are a bit odd by everyone else's standards, because they build for all listed architectures simultaneously -- there is only one Configure run for ALL of them combined, not one per architecture as I understand might be done, for example, under Linux. That's why we thought -Duse64bitint might have made it build successfully.

jkeenan, in order:

• You're correct, I ran that with one hyphen and accidentally typo'd when copying the line into my email.

• --test-build and --target=xxx are required to both be used in this case. According to the documentation, without the first a separate test case must be specified (it gets run after the build succeeds, which is expected to happen every time, and would not here), and without the second, successfully completing the build is assumed (possibly the source of your misapprehension). When I tried running it using only --target=xxxx, it refused to run at all and merely gave me back the usage instructions (I hadn't specified a test case).

• You were absolutely correct that I specified more options than were required. I am rerunning the bisection with no -D options at all except -Dprefix=xxxx (to prevent it overwriting my system Perl), and only those -A options listed as being necessary for a Mac-style multi-arch build (both of them, unfortunately, but if you look closely you'll see that nearly all the given components do nothing except tell the compiler where the system libraries are).

gsteemso commented 2 months ago

I should add that I had to restrict the test builds to only try as far as building miniperl, because various of the library modules require an assortment of trivial patches to build correctly under specific versions of Perl. Luckily the failure occurs during that early phase, so I did not need to muck about trying to tell it to apply a patch only during builds of certain versions.

mauke commented 2 months ago

> they build for all listed architectures simultaneously -- there is only one Configure run for ALL of them combined, not one per architecture

The only way that could work is if IVTYPE = int64_t and UVTYPE = uint64_t, but there is no way to force Configure to choose those as far as I can see.
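
A minimal sketch of why fixed-width types would sidestep the problem, assuming a C99 <stdint.h> is available. The typedef names sketch_iv and sketch_uv are hypothetical and only stand in for the IVTYPE and UVTYPE choices mentioned above; this does not show anything Configure can currently be told to do.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for IVTYPE and UVTYPE chosen as exact-width types. */
typedef int64_t  sketch_iv;
typedef uint64_t sketch_uv;

/* int64_t has exactly 64 bits and no padding, so this is 64 on every
 * target that provides the type, whether long is 32 or 64 bits wide. */
int sketch_iv_bits(void)
{
    return (int)(sizeof(sketch_iv) * CHAR_BIT);
}

/* Same for the unsigned counterpart. */
int sketch_uv_bits(void)
{
    return (int)(sizeof(sketch_uv) * CHAR_BIT);
}
```

With types like these, a single recorded size would be correct for both the 32-bit and the 64-bit compilation pass, unlike a size derived from long.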

gsteemso commented 2 months ago

I won't pretend I understand how it works. The Mac compiler that was current at that time, and which I am using now, was a modified version of GCC 4.2.1. It could be given any number of “-arch xxxx” parameters and would then repeat each compilation with all of the other parameters the same, but a distinct target platform; then stitch the results into a universal binary using a tool with the amusing name “lipo” (because it was often used to slim a fat binary down to a single-architecture slice). The five platforms then current were “ppc” (32-bit PowerPC), “i386” (32-bit x86 – they were all lumped together as “i386” even though the cross-compiler, for example, had a prefix containing “686”), “arm” (32-bit ARM as used in early iPhones), “ppc64” (64-bit PowerPC, which consisted solely of the IBM PowerPC 970), and “x86_64” (exactly what it says). Of those, only two were even able to handle 64-bit data in a single action; yet Perl has historically been able to compile with 32- and 64-bit values (IVs, UVs, etc.) simultaneously. At first glance I'd have assumed it just compiled everything with 32-bit NVs, but Configure does in fact appear to take 64-bit platforms as 64-bit. I have no idea how it works but it did up until, as “bisect” has once again informed me, the same commit I named earlier.

gsteemso commented 2 months ago

Reverting that one commit against blead had no effect.

tonycoz commented 2 months ago

> I don't see how -Duse64bitint could help. If you run Configure on a 64-bit platform, it will just see that sizeof (long) == 8 and use that, hardcoding #define IVTYPE long in config.h. Plus we have INTSIZE, LONGSIZE, SHORTSIZE all hardcoded/configured in config.h.
>
> As far as I can tell, building for different architectures requires different configs.

I remember building multiarch with i386 and x86_64 in one binary.

There were definitely some config issues; Configure has a darwin-specific check to ensure alignbytes is at least 8.

I never looked too hard at it.

I suspect this case isn't so much a new bug, but the static assert detecting an old bug.

gsteemso commented 2 months ago

I'm not actually certain there is a true bug, here. Is it plausible that the assert is framed in such a way that it gets a false positive from the disparity in word sizes between built-for architectures?

tonycoz commented 2 months ago

> I'm not actually certain there is a true bug, here.

If that particular assertion fails, the code following won't be valid: it may lose precision when converting from an NV to an IV but report an exact conversion.
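
The kind of precision loss at stake can be illustrated generically. This is only a sketch under the assumption that NV is an IEEE 754 double with a 53-bit significand and that IV is 64 bits wide; the function names are hypothetical and this is not the code guarded by the assertion in question.

```c
#include <float.h>
#include <stdint.h>

/* Significand digits of a double in base FLT_RADIX; 53 bits on
 * IEEE 754 platforms, which is the assumption made here. */
int sketch_nv_significand_bits(void)
{
    return DBL_MANT_DIG;
}

/* Returns 1 if v survives a trip through double unchanged, else 0.
 * The volatile store forces rounding to double precision even where
 * the compiler might otherwise keep extra bits in a register.
 * Only meaningful for magnitudes comfortably below 2^63, since the
 * conversion back is undefined for out-of-range values. */
int sketch_roundtrips(int64_t v)
{
    volatile double d = (double)v;
    return (int64_t)d == v;
}
```

Under those assumptions, 2^53 round-trips exactly while 2^53 + 1 does not, so code that believes every integer of a given width is exactly representable needs that width to be right for the slice being compiled.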

gsteemso commented 2 months ago

Let me rephrase my supposition. I'm aware that's the purpose of the assert. The reason the assert is failing is that, on a 32-bit build, the total size of the NV is smaller than the (for a 64-bit build) reported size that is transferrable with full accuracy. (I think I got that straight, there are something like six different figures involved for three different quantities, or thereabouts.) The same figures being used for both 32- and 64-bit builds – which happen simultaneously in a universal binary – are, I believe, causing a "false positive" (false negative?) assertion failure.

gsteemso commented 2 months ago

The assert should be passing during the 64-bit build pass and incorrectly failing during the 32-bit build pass.