Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

t/porting/libperl.t failure with GCC 12 and -flto #20518

Closed ntyni closed 1 year ago

ntyni commented 1 year ago

This is a bug report for perl from ntyni@debian.org, generated with the help of perlbug 1.43 running under perl 5.37.6.


Description

As reported by Matthias Klose in https://bugs.debian.org/985884 t/porting/libperl.t fails when perl is built with link time optimization (LTO).

Steps to Reproduce

On current Debian unstable, with gcc (Debian 12.2.0-9) 12.2.0 and perl v5.37.5-162-g52917b368f

$ ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto && make -j4 && make test
[...]
t/porting/libperl ................................................ # Failed test 3 - has data const symbols at porting/libperl.t line 322
# Failed test 4 - has PL_no_mem at porting/libperl.t line 323
FAILED at test 3
[...]
Failed 1 test out of 2502, 99.96% okay.
    porting/libperl.t

$ ./perl -Ilib t/porting/libperl.t 
# $^O = linux
# $Config{archname} = x86_64-linux
# $Config{cc} = cc
# libperl = ../libperl.a
# nm = /usr/bin/nm
# nm_style = gnu
# nm_opt = 
# command: "/usr/bin/nm  ../libperl.a 2>libperl564590 |"
ok 1 - has object util.o
ok 2 - has text Perl_croak in util.o
not ok 3 - has data const symbols
# Failed test 3 - has data const symbols at t/porting/libperl.t line 322
not ok 4 - has PL_no_mem
# Failed test 4 - has PL_no_mem at t/porting/libperl.t line 323
# nocommon = 0
ok 5 - has PL_hash_seed_w
ok 6 - has PL_ppaddr
ok 7 - has undefined symbols
ok 8 - uses chmod (doio.o)
ok 9 - uses dlopen (DynaLoader.o)
ok 10 - uses exp (pp.o)
ok 11 - uses getenv (DynaLoader.o locale.o perl.o perlio.o regcomp.o toke.o util.o)
ok 12 - uses sigaction (mg.o util.o)
ok 13 - uses socket (doio.o)
ok 14 - uses time (perl.o pp_sys.o util.o)
ok 15 - uses no atoi ()
ok 16 - uses no atol ()
ok 17 - uses no atoll ()
ok 18 - uses no fgets ()
ok 19 - uses no gets ()
ok 20 - uses no sprintf ()
ok 21 - uses no strcat ()
ok 22 - uses no strcpy ()
ok 23 - uses no strncat ()
ok 24 - uses no strncpy ()
ok 25 - uses no strtol ()
ok 26 - uses no strtoq ()
ok 27 - uses no strtoul ()
ok 28 - uses no system ()
ok 29 - uses no tmpfile ()
ok 30 - uses no vsprintf ()
ok 31 - no S_ exports
1..31

Ubuntu is currently disabling the failing tests FWIW.


Flags

Configured by ntyni at Wed Nov 16 19:23:28 UTC 2022.

Summary of my perl5 (revision 5 version 37 subversion 6) configuration:
  Commit id: 52917b368fe204d0670f020d5f0f3ad9ec236e01
  Platform:
    osname=linux
    osvers=5.19.0-2-amd64
    archname=x86_64-linux
    uname='linux carme 5.19.0-2-amd64 #1 smp preempt_dynamic debian 5.19.11-1 (2022-09-24) x86_64 gnulinux '
    config_args='-des -Dusedevel -Dccflags=-flto -Dldflags=-flto'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-flto -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2'
    cppflags='-flto -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='12.2.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags ='-flto -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.36'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

@INC for perl 5.37.6:
    lib
    /usr/local/lib/perl5/site_perl/5.37.6/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.37.6
    /usr/local/lib/perl5/5.37.6/x86_64-linux
    /usr/local/lib/perl5/5.37.6

Environment for perl 5.37.6:
    HOME=/home/ntyni
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/usr/lib/libeatmydata
    LOGDIR (unset)
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

ntyni commented 1 year ago

I haven't tested if this is GCC 12 specific. It was reported quite a while ago so it might well happen with earlier GCC versions too.

jkeenan commented 1 year ago

I was able to reproduce these failures on Linux (Debian 11) using gcc-10, the default C-compiler:

$ git describe
v5.37.5-171-gd47ed502d5
$ sh ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto && make test_porting
...
$ cd t;./perl harness -v porting/libperl.t; cd -

ok 1 - has object util.o
ok 2 - has text Perl_croak in util.o
not ok 3 - has data const symbols
# Failed test 3 - has data const symbols at porting/libperl.t line 322
not ok 4 - has PL_no_mem
# Failed test 4 - has PL_no_mem at porting/libperl.t line 323
ok 5 - has PL_hash_seed_w
...
ok 31 - no S_ exports
Failed 2/31 subtests 

Test Summary Report
-------------------
porting/libperl.t (Wstat: 0 Tests: 31 Failed: 2)
  Failed tests:  3-4
Files=1, Tests=31,  0 wallclock secs ( 0.02 usr  0.00 sys +  0.05 cusr  0.05 csys =  0.12 CPU)
Result: FAIL

And, for what it's worth, I tried this with gcc10 as the C-compiler on FreeBSD-12. make failed on both blead and at v5.36.0. Example:

[perlmonger: perl] $ git describe
v5.37.5-171-gd47ed502d5
[perlmonger: perl] $ sh ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto -Dcc=gcc10 && make test_porting
...
Updating 'mktables.lst'
./miniperl -Ilib -MExtUtils::Miniperl -e 'writemain(\"perlmain.c", @ARGV)' DynaLoader 
gcc10 -c -DPERL_CORE -flto -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_FORTIFY_SOURCE=2 -std=c99 -O2 -Wall -Werror=pointer-arith -Werror=vla -Wextra -Wno-long-long -Wno-declaration-after-statement -Wc++-compat -Wwrite-strings perlmain.c
gcc10 -o perl -flto -fstack-protector-strong -L/usr/local/lib  perlmain.o   libperl.a `cat ext.libs` -lpthread -ldl -lm -lcrypt -lutil -lc
/usr/local/bin/ld: /tmp//perl.Qqd7KG.ltrans0.ltrans.o: in function `xs_init':
<artificial>:(.text+0x6): undefined reference to `boot_DynaLoader'
/usr/local/bin/ld: <artificial>:(.text+0x10): undefined reference to `Perl_newXS'
/usr/local/bin/ld: /tmp//perl.Qqd7KG.ltrans0.ltrans.o: in function `main':
<artificial>:(.text.startup+0x23): undefined reference to `Perl_sys_init3'
/usr/local/bin/ld: <artificial>:(.text.startup+0x29): undefined reference to `PL_do_undump'
/usr/local/bin/ld: <artificial>:(.text.startup+0x31): undefined reference to `perl_alloc'
/usr/local/bin/ld: <artificial>:(.text.startup+0x49): undefined reference to `perl_construct'
/usr/local/bin/ld: <artificial>:(.text.startup+0x4f): undefined reference to `PL_perl_destruct_level'
/usr/local/bin/ld: <artificial>:(.text.startup+0x6c): undefined reference to `PL_exit_flags'
/usr/local/bin/ld: <artificial>:(.text.startup+0x72): undefined reference to `perl_parse'
/usr/local/bin/ld: <artificial>:(.text.startup+0x7d): undefined reference to `PL_sig_name'
/usr/local/bin/ld: <artificial>:(.text.startup+0x9a): undefined reference to `PL_sig_name'
/usr/local/bin/ld: <artificial>:(.text.startup+0xa5): undefined reference to `PL_sig_num'
/usr/local/bin/ld: <artificial>:(.text.startup+0xad): undefined reference to `Perl_rsignal_state'
/usr/local/bin/ld: <artificial>:(.text.startup+0xb4): undefined reference to `PL_csighandlerp'
/usr/local/bin/ld: <artificial>:(.text.startup+0xc0): undefined reference to `Perl_rsignal'
/usr/local/bin/ld: <artificial>:(.text.startup+0xd8): undefined reference to `perl_destruct'
/usr/local/bin/ld: <artificial>:(.text.startup+0xe7): undefined reference to `perl_free'
/usr/local/bin/ld: <artificial>:(.text.startup+0xec): undefined reference to `Perl_sys_term'
/usr/local/bin/ld: <artificial>:(.text.startup+0x100): undefined reference to `perl_run'
collect2: error: ld returned 1 exit status
*** Error code 1

Stop.
make: stopped in /usr/home/jkeenan/gitwork/perl

(I've never tried those 2 switches myself, so I don't know what to expect.)

jkeenan commented 1 year ago

This is a bug report for perl from ntyni@debian.org, generated with the help of perlbug 1.43 running under perl 5.37.6.

Description

As reported by Matthias Klose in https://bugs.debian.org/985884 t/porting/libperl.t fails when perl is built with link time optimization (LTO).

Steps to Reproduce

On current Debian unstable, with gcc (Debian 12.2.0-9) 12.2.0 and perl v5.37.5-162-g52917b368f


$ ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto && make -j4 && make test

Can you describe, in lay person's terms, what -flto is intended to accomplish? I can't recall seeing that used in any of our smoke-testing, nor can I recall previous bug reports with that.

jkeenan commented 1 year ago

On Linux (Debian 11), using gcc-10 as the C-compiler, I checked out various tags and built with this configuration:

sh ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto

As was the case with HEAD and v5.36.0, I got two failures in t/porting/libperl.t at each of v5.32.1 and v5.28.3. At v5.24.1 the build failed to complete.

So that leads me to ask: What evidence do you have that these configuration switches ever built and tested successfully?

gregoa commented 1 year ago

On Wed, 16 Nov 2022 15:26:22 -0800, James E Keenan wrote:

Can you describe, in lay person's terms, what -flto is intended to accomplish? I can't recall seeing that used in any of our smoke-testing, nor can I recall previous bug reports with that.

I'm sure that @ntyni can explain this in a better way, but for starters, the announcement at https://lists.debian.org/debian-devel/2022/06/msg00092.html and the wiki link there might be helpful.

Cheers, gregor


iabyn commented 1 year ago

On Wed, Nov 16, 2022 at 12:17:18PM -0800, Niko Tyni wrote:

$ ./Configure -des -Dusedevel -Dccflags=-flto -Dldflags=-flto && make -j4 && make test
[...]
t/porting/libperl ................................................ # Failed test 3 - has data const symbols at porting/libperl.t line 322
# Failed test 4 - has PL_no_mem at porting/libperl.t line 323
FAILED at test 3
[...]
Failed 1 test out of 2502, 99.96% okay.
    porting/libperl.t

not ok 3 - has data const symbols
# Failed test 3 - has data const symbols at t/porting/libperl.t line 322
not ok 4 - has PL_no_mem
# Failed test 4 - has PL_no_mem at t/porting/libperl.t line 323
# nocommon = 0

The proximate cause of these two failures is a lack of readonly symbols in libperl.a.

In a normal build:

$ /usr/bin/nm  libperl.a | grep ' [Rr] '
0000000000000e00 r array_passed_to_stat
00000000000003e0 r bodies_by_type
00000000000087c0 r custom_op_register_vtbl
...

$ /usr/bin/nm  libperl.a  | grep PL_no_mem
                 U PL_no_mem
0000000000000850 R PL_no_mem
$

In a -flto build:

$ /usr/bin/nm  libperl.a | grep ' [Rr] '
$

$  /usr/bin/nm  libperl.a  | grep PL_no_mem
         U PL_no_mem
00000000 D PL_no_mem
$

(Note that PL_no_mem has changed from R to D.)
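The comparison above can also be made mechanically. What follows is only a minimal sketch, assuming GNU-style nm output (the "nm_style = gnu" case from the test log). The helper names count_readonly and symbol_types are hypothetical and are not part of libperl.t; the sample lines are copied from the two nm runs shown above.

```sh
#!/bin/sh
# Hypothetical helpers for reading GNU-style `nm` output on stdin.
# Defined symbols look like "ADDRESS TYPE NAME"; undefined ones like "TYPE NAME".

# Count symbols whose type letter is R or r (read-only data).
count_readonly() {
    awk 'NF >= 2 && ($(NF-1) == "R" || $(NF-1) == "r") { n++ } END { print n + 0 }'
}

# Print, in order of appearance, the type letters recorded for one symbol name.
symbol_types() {
    awk -v sym="$1" 'NF >= 2 && $NF == sym { printf "%s", $(NF-1) } END { print "" }'
}

# Sample lines taken from the normal build shown in this comment.
normal_nm='0000000000000e00 r array_passed_to_stat
00000000000003e0 r bodies_by_type
00000000000087c0 r custom_op_register_vtbl
                 U PL_no_mem
0000000000000850 R PL_no_mem'

# Sample lines taken from the -flto build shown in this comment.
lto_nm='         U PL_no_mem
00000000 D PL_no_mem'

printf '%s\n' "$normal_nm" | count_readonly           # prints 4
printf '%s\n' "$lto_nm"    | count_readonly           # prints 0
printf '%s\n' "$normal_nm" | symbol_types PL_no_mem   # prints UR
printf '%s\n' "$lto_nm"    | symbol_types PL_no_mem   # prints UD
```

Against a real build, the same functions could be fed from `nm libperl.a` instead of the sample strings.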

I don't know enough about LTO to know whether not having such symbols is to be expected and thus whether libperl.t should just skip those two tests when perl is configured with -flto.


jkeenan commented 1 year ago

On Wed, 16 Nov 2022 15:26:22 -0800, James E Keenan wrote:

Can you describe, in lay person's terms, what -flto is intended to accomplish? I can't recall seeing that used in any of our smoke-testing, nor can I recall previous bug reports with that.

I'm sure that @ntyni can explain this in a better way, but for starters, the announcement at https://lists.debian.org/debian-devel/2022/06/msg00092.html and the wiki link there might be helpful.

Cheers, gregor

@gregoa, thank you very much for those URLs. Here is my summary/paraphrase of what they say:

Link time optimizations are an optimization that helps with a single digit percent number optimizing both for smaller size, and better speed. ... The proposal is to turn on LTO by default on most 64bit release architectures. In test rebuilds, there were 373 packages (dd-list in the wiki page) found not to build with link time optimizations for various reasons. [Perl is one of those with test failures.] The idea is to fix as many of these as possible, and then change the packaging for the others to just turn off LTO in the package build.

So this ticket is only incidentally a report of test failures. It's actually a request for a new functionality, i.e., build perl in a way not previously requested so that Debian (and presumably other distributions) can move in a new direction. (I note this because the original posting led me to believe this was a regression in perl; it's not.)

Until reading this ticket I had never heard of link-time optimizations, so I don't know how we should proceed. @ntyni, @gregoa, @jmdh, could one of you write up a post for the perl5-porters mailing list about Debian's move to LTO? That list has much wider visibility than these GH issues and it would be good for our readership to learn more about this new initiative.

Thank you very much. Jim Keenan

ntyni commented 1 year ago

So this ticket is only incidentally a report of test failures. It's actually a request for a new functionality, i.e., build perl in a way not previously requested so that Debian (and presumably other distributions) can move in a new direction. (I note this because the original posting led me to believe this was a regression in perl; it's not.)

Indeed. Apologies for the misunderstanding.

Until reading this ticket I had never heard of link-time optimizations, so I don't know how we should proceed. @ntyni, @gregoa, @jmdh, could one of you write up a post for the perl5-porters mailing list about Debian's move to LTO? That list has much wider visibility than these GH issues and it would be good for our readership to learn more about this new initiative.

Okay; mail sent.

Niko

Leont commented 1 year ago

I don't know enough about LTO to know whether not having such symbols is to be expected and thus whether libperl.t should just skip those two tests when perl is configured with -flto.

I think we should change the test. We should not make assumptions about in which section the compiler puts that variable.

jkeenan commented 1 year ago

I don't know enough about LTO to know whether not having such symbols is to be expected and thus whether libperl.t should just skip those two tests when perl is configured with -flto.

I think we should change the test. We should not make assumptions about in which section the compiler puts that variable.

Could you explain what these "sections" are? Also, would it be possible to get a p.r. for skipping the tests?
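One way such a skip could be decided is by looking for -flto in the configured compiler and linker flags. The following is only a sketch in shell, not a patch to libperl.t (which is written in Perl). The helper name built_with_lto is hypothetical, and the flag strings are shortened from the configuration summary in the original report.

```sh
#!/bin/sh
# Hypothetical helper: succeed if any flag word is -flto or -flto=<something>.
# Limitation of this sketch: it does not notice a later -fno-lto override.
built_with_lto() {
    # $* is deliberately unquoted so each flag string splits into words.
    for word in $*; do
        case "$word" in
            -flto|-flto=*) return 0 ;;
        esac
    done
    return 1
}

# Flag strings shortened from the report's "Compiler" and "Linker" sections.
ccflags='-flto -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong'
ldflags='-flto -fstack-protector-strong -L/usr/local/lib'

if built_with_lto "$ccflags" "$ldflags"; then
    echo "skip: read-only symbol checks (LTO build)"
else
    echo "run: read-only symbol checks"
fi
```

Matching on the flags is a blunt instrument: it skips the checks for every LTO build, whether or not the symbols are actually misreported.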

Leont commented 1 year ago

Could you explain what these "sections" are?

They're the different parts of the ELF binary format for executables and libraries. There is one section that is readonly and pre-initialized at compile time, and the test expects the variable to be put in there.

Apparently when using link-time optimization it shows up in the "initialized data section" (used for static writable variables) for the library, even if it ends up read-only in the resulting executable.
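For readers unfamiliar with the one-letter codes in the nm output earlier in this thread, the sketch below maps the letters seen here to the kind of section they denote. It follows GNU nm's documented conventions for these particular letters (uppercase for global symbols, lowercase for local ones). The helper name describe_nm_type is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: describe a GNU nm symbol type letter.
# Only the letters relevant to this thread are covered; others fall through.
describe_nm_type() {
    case "$1" in
        T|t) echo "text (code)" ;;
        R|r) echo "read-only data" ;;
        D|d) echo "initialized data (writable)" ;;
        B|b) echo "uninitialized data (BSS)" ;;
        C)   echo "common (uninitialized)" ;;
        U)   echo "undefined (expected from another object)" ;;
        *)   echo "other" ;;
    esac
}

describe_nm_type R   # prints: read-only data
describe_nm_type D   # prints: initialized data (writable)
```

In these terms, the failing tests expect PL_no_mem to be reported as R, while the -flto archive reports it as D.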

I don't know if this is a bug in nm or a side-effect of this decision being made at link time. In either case, we should either:

tonycoz commented 1 year ago

We should not make assumptions about in which section the compiler puts that variable.

I can see it being useful as a "does the toolchain work as we expect" test, but maybe not a test that we run in the released perl.

jkeenan commented 1 year ago

We should not make assumptions about in which section the compiler puts that variable.

I can see it being useful as a "does the toolchain work as we expect" test, but maybe not a test that we run in the released perl.

Do we currently have any porting tests in the core distribution which are for the purpose of verifying "that the toolchain works as we expect"? If not, how (and where) would we implement such a test?

tonycoz commented 1 year ago

libperl.t is the only one that I can see that specifically tests the build toolchain.

Several tests end up testing the underlying operating system, and have caused problems when the OS had bugs (the recent Solaris PERLIO=stdio bug, and DragonflyBSD error reporting for tty functions for example).

Leont commented 1 year ago

I can see it being useful as a "does the toolchain work as we expect" test, but maybe not a test that we run in the released perl.

I suspect the failing test is the sort of test that is much more likely to fail due to this sort of fragility than because of an actual toolchain failure. A lot of these things are much harder to check for reliably than one may think.