llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

BOLT crashes with `--update-debug-sections` on DWARF v5 when optimizing libpython.so #67966

Open indygreg opened 1 year ago

indygreg commented 1 year ago

I can reliably elicit a crash out of BOLT when optimizing an x86-64 ELF binary with DWARF v5. The crash appears identical to #56277:

cpython-3.12> BOLT-INFO: FRAME ANALYSIS: 36855 function(s) were not optimized.
cpython-3.12> BOLT-INFO: FRAME ANALYSIS: 651 function(s) (51.5% dyn cov) could not have its frame indices restored.
cpython-3.12> BOLT-INFO: Shrink wrapping moved 45 spills inserting load/stores and 5 spills inserting push/pops
cpython-3.12> BOLT-INFO: Shrink wrapping reduced 5692310 store executions (0.1% total instructions executed, 1.3% store instructions)
cpython-3.12> BOLT-INFO: Shrink wrapping failed at reducing 0 store executions (0.0% total instructions executed, 0.0% store instructions)
cpython-3.12> BOLT-INFO: Allocation combiner: 38 empty spaces coalesced (dyn count: 3197588).
cpython-3.12> #0 0x0000556536b4d188 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/tools/llvm/bin/llvm-bolt+0x16b7188)
cpython-3.12>  #1 0x0000556536b4b23c llvm::sys::RunSignalHandlers() (/tools/llvm/bin/llvm-bolt+0x16b523c)
cpython-3.12>  #2 0x0000556536b4d91d SignalHandler(int) Signals.cpp:0:0
cpython-3.12>  #3 0x00007f0a984dc890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xf890)
cpython-3.12>  #4 0x000055653712a0fb llvm::bolt::BinaryContext::addDebugFilenameToUnit(unsigned int, unsigned int, unsigned int) (/tools/llvm/bin/llvm-bolt+0x1c940fb)
cpython-3.12>  #5 0x0000556537139d3e (anonymous namespace)::BinaryEmitter::emitFunctionBody(llvm::bolt::BinaryFunction&, llvm::bolt::FunctionFragment&, bool) BinaryEmitter.cpp:0:0
cpython-3.12>  #6 0x000055653713ac9b (anonymous namespace)::BinaryEmitter::emitFunction(llvm::bolt::BinaryFunction&, llvm::bolt::FunctionFragment&) BinaryEmitter.cpp:0:0
cpython-3.12>  #7 0x000055653713a677 (anonymous namespace)::BinaryEmitter::emitFunctions()::$_0::operator()(std::vector<llvm::bolt::BinaryFunction*, std::allocator<llvm::bolt::BinaryFunction*>> const&) const BinaryEmitter.cpp:0:0
cpython-3.12>  #8 0x0000556537138d41 llvm::bolt::emitBinaryContext(llvm::MCStreamer&, llvm::bolt::BinaryContext&, llvm::StringRef) (/tools/llvm/bin/llvm-bolt+0x1ca2d41)
cpython-3.12>  #9 0x0000556536b9ce82 llvm::bolt::RewriteInstance::emitAndLink() (/tools/llvm/bin/llvm-bolt+0x1706e82)
cpython-3.12> #10 0x0000556536b94af3 llvm::bolt::RewriteInstance::run() (/tools/llvm/bin/llvm-bolt+0x16feaf3)
cpython-3.12> #11 0x0000556535797032 main (/tools/llvm/bin/llvm-bolt+0x301032)
cpython-3.12> #12 0x00007f0a971a7b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45)
cpython-3.12> #13 0x00005565357952db _start (/tools/llvm/bin/llvm-bolt+0x2ff2db)
cpython-3.12> PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
cpython-3.12> Stack dump:
cpython-3.12> 0.    Program arguments: /tools/llvm/bin/llvm-bolt libpython3.12.so.1.0.prebolt -o libpython3.12.so.1.0.bolt -data=libpython3.12.so.1.0.fdata -update-debug-sections -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot
cpython-3.12> Segmentation fault (core dumped)
cpython-3.12> make[1]: *** [profile-bolt-stamp] Error 139
cpython-3.12> Makefile:789: recipe for target 'profile-bolt-stamp' failed
cpython-3.12> make[1]: Leaving directory '/build/Python-3.12.0rc3'
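
A backtrace like the one above can be reduced to the frame that actually faulted. The sketch below is only an illustration: it assumes the log format shown here (an optional `cpython-3.12>` prefix, then `#N 0xADDR symbol location`) and that the frame right after the `__restore_rt` signal trampoline is the one that crashed. The helper names `parse_frames` and `faulting_frame` are hypothetical and are not part of BOLT.

```python
import re

# Matches frames such as "cpython-3.12>  #4 0x000055653712a0fb <symbol and location>".
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-fA-F]+\s+(.*)$")
# Trailing location: "(/path/to/binary+0x1234)" or "File.cpp:0:0".
LOCATION_RE = re.compile(r"\s+(\(/[^()]*\+0x[0-9a-fA-F]+\)|\S+:\d+:\d+)$")


def parse_frames(log_text):
    """Return (frame_number, symbol) pairs found in an LLVM crash backtrace."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Drop the trailing binary offset or source location, keep the symbol.
            symbol = LOCATION_RE.sub("", match.group(2).strip())
            frames.append((int(match.group(1)), symbol))
    return frames


def faulting_frame(frames, trampoline="__restore_rt"):
    """Return the first frame after the signal trampoline, or None."""
    for index, (_, symbol) in enumerate(frames):
        if symbol == trampoline and index + 1 < len(frames):
            return frames[index + 1]
    return None
```

Applied to the log in this report, `faulting_frame(parse_frames(log))` should point at frame #4, `llvm::bolt::BinaryContext::addDebugFilenameToUnit`.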

I can reproduce the crash with at least llvm-bolt 16.0.3 and 17.0.1.

The crash goes away if I compile all inputs with -fdebug-default-version=4. So it appears it has something to do with updating DWARF v5 debug symbols.
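
To confirm which DWARF versions actually end up in the binary, one option is to tally the version field of every unit header in `.debug_info`. The following is a minimal sketch, not something BOLT provides: it assumes little-endian data (as on x86-64) and an uncompressed section that has already been dumped to raw bytes, and the function name `dwarf_unit_versions` is hypothetical. It relies only on the unit header layout: a 4-byte length (or `0xffffffff` followed by an 8-byte length for 64-bit DWARF), then a 2-byte version.

```python
import struct
from collections import Counter


def dwarf_unit_versions(debug_info):
    """Count the DWARF versions of the units in a raw .debug_info section.

    Assumes little-endian data (as on x86-64) and an uncompressed section.
    """
    versions = Counter()
    offset = 0
    while offset + 6 <= len(debug_info):
        (length,) = struct.unpack_from("<I", debug_info, offset)
        header_size = 4
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: the real length follows as an 8-byte value.
            if offset + 14 > len(debug_info):
                break
            (length,) = struct.unpack_from("<Q", debug_info, offset + 4)
            header_size = 12
        if length < 2:
            break  # too short to hold a version field; stop at malformed data
        (version,) = struct.unpack_from("<H", debug_info, offset + header_size)
        versions[version] += 1
        # The length field counts the bytes that follow it, so skip to the next unit.
        offset += header_size + length
    return versions
```

The raw section could be obtained with something like `objcopy --dump-section .debug_info=debug_info.bin libpython3.12.so.1.0.prebolt` (the output file name is an assumption) and then passed in as `dwarf_unit_versions(open("debug_info.bin", "rb").read())`. A result containing version 5 alongside the workaround flag would suggest that some input was not rebuilt.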

Steps to reproduce (sorry for not being minimal):

  1. git clone https://github.com/indygreg/python-build-standalone
  2. git checkout fa5d1fcebdde07aa04b6c661a55c77c85f414508 (https://github.com/indygreg/python-build-standalone/commit/fa5d1fcebdde07aa04b6c661a55c77c85f414508 from the bolt-crash branch)
  3. ./build-linux.py --optimizations pgo --python cpython-3.12 --break-on-failure

This builds CPython and its dependencies from source inside highly deterministic Docker containers. After a few minutes it should get to the BOLT optimizations and crash. The --break-on-failure flag keeps the container from exiting on failure, giving you the opportunity to docker exec into it to debug.

CI logs showing the crash should appear at https://github.com/indygreg/python-build-standalone/actions/runs/6379047670 within a few hours.

If you tell me how to figure out which object file / symbol it is crashing on, I can dump DWARF of the offending file / symbol if you aren't able to reproduce.

aaupov commented 1 year ago

CC @ayermolo

ayermolo commented 1 year ago

To clarify: it doesn't crash on trunk, only on LLVM 16 and 17?

indygreg commented 1 year ago

I didn’t try trunk.

Kepontry commented 1 year ago

Hi, I ran into an error when building CPython with the command ./build-linux.py --optimizations pgo --python cpython-3.12 --break-on-failure --serial.

/home/zhoujiapeng/python-build-standalone/build/downloads/pip-23.2.1-py3-none-any.whl exists and passes integrity checks
cpython-3.12> disabling extension module _gdbm because disabled for this target triple
cpython-3.12> ignoring extension module _peg_parser because Python version incompatible
cpython-3.12> disabling extension module _scproxy because disabled for this target triple
cpython-3.12> ignoring extension module _sha256 because Python version incompatible
cpython-3.12> ignoring extension module _sha512 because Python version incompatible
cpython-3.12> disabling extension module _testcapi because disabled for this target triple
cpython-3.12> disabling extension module nis because disabled for this target triple
cpython-3.12> ignoring extension module parser because Python version incompatible
cpython-3.12> disabling extension module xx because disabled for this target triple
cpython-3.12> disabling extension module xxlimited because disabled for this target triple
cpython-3.12> disabling extension module xxlimited_35 because disabled for this target triple
Traceback (most recent call last):
  File "/home/zhoujiapeng/python-build-standalone/cpython-unix/build.py", line 1189, in <module>
    sys.exit(main())
  File "/home/zhoujiapeng/python-build-standalone/cpython-unix/build.py", line 1172, in main
    build_cpython(
  File "/home/zhoujiapeng/python-build-standalone/cpython-unix/build.py", line 699, in build_cpython
    setup = derive_setup_local(
  File "/home/zhoujiapeng/python-build-standalone/pythonbuild/cpython.py", line 288, in derive_setup_local
    with tarfile.open(str(cpython_source_archive)) as tf:
  File "/usr/local/lib/python3.8/tarfile.py", line 1788, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
make: *** [Makefile:311: /home/zhoujiapeng/python-build-standalone/build/cpython-3.12.0rc3-x86_64-unknown-linux-gnu-pgo.tar] Error 1

I'm not sure whether the missing archive is the build result of cpython-host ("cpython-3.12-3.12.0rc3-linux64.tar"), so I renamed that file to "cpython-3.12.0rc3-x86_64-unknown-linux-gnu-pgo.tar" and got some make errors.

[notice] A new release of pip is available: 23.0.1 -> 23.2.1
[notice] To update, run: /home/zhoujiapeng/python-build-standalone/build/venv.linux/bin/python3 -m pip install --upgrade pip
make: Nothing to be done for 'default'.
compressing Python archive to /home/zhoujiapeng/python-build-standalone/dist/cpython-3.12.0rc3-x86_64-unknown-linux-gnu-pgo-20231002T1838.tar.zst

By the way, according to the failed CI log, the object file should be "libpython3.12.so.1.0".

Also, both the PGO and BOLT instrumentation runs failed some tests, which should be fixed.

PGO

cpython-3.12> 0:00:24 load avg: 1.58 [23/44] test_int
cpython-3.12> test test_int failed
cpython-3.12> 0:00:25 load avg: 1.58 [24/44] test_itertools -- test_int failed (2 errors)

BOLT

cpython-3.12> 0:00:41 load avg: 1.44 [16/44] test_embed
cpython-3.12> Fatal Python error: Segmentation fault

ayermolo commented 1 year ago

> I didn’t try trunk.

Can you try it?

ayermolo commented 1 year ago

> Hi, I ran into an error when building CPython with the command ./build-linux.py --optimizations pgo --python cpython-3.12 --break-on-failure --serial.

This looks like a different issue. Please file a new issue.

indygreg commented 1 year ago

I'm not sure why you are seeing that tarfile.ReadError: file could not be opened successfully error. That appears to be a bug in my python-build-standalone project. Please consider filing it against indygreg/python-build-standalone. I suspect it may magically go away if you run with Python 3.10 or 3.11.
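
One way to narrow down a `tarfile.ReadError` like this is to check, with the same interpreter that runs the build, whether the archive can be opened at all and which compression modules are importable. This is only a diagnostic sketch. The thread does not establish that a missing compression module or a damaged download is the cause, and the helper names `tar_support` and `can_open_archive` are hypothetical.

```python
import importlib
import tarfile


def tar_support():
    """Report which compression modules this interpreter can import."""
    support = {}
    for name in ("zlib", "bz2", "lzma"):
        try:
            importlib.import_module(name)
            support[name] = True
        except ImportError:
            support[name] = False
    return support


def can_open_archive(path):
    """Return True if tarfile can read the file at `path` with this interpreter.

    Raises FileNotFoundError if the path does not exist.
    """
    return tarfile.is_tarfile(path)
```

If `can_open_archive` returns False for the CPython source archive under build/downloads/ while `tar_support()` reports a missing module, a different Python version may behave differently. If every module is present, the archive itself is the more likely suspect.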

The manual copying of the .tar file likely contributed to the additional errors you saw.

I recommend deleting all files with python in their names from build/ and build/downloads/ and then running the original build-linux.py command again.
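
The cleanup can be sketched as a small script. It assumes that "files with python in them" means regular files whose names contain "python" (matched case-insensitively) directly inside build/ and build/downloads/. The function names are hypothetical, and the default is a dry run so nothing is deleted by accident.

```python
from pathlib import Path


def stale_python_files(build_dir):
    """List files directly in build_dir and build_dir/downloads named *python*."""
    build_dir = Path(build_dir)
    found = []
    for directory in (build_dir, build_dir / "downloads"):
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            # Only regular files are considered; subdirectories are left alone.
            if entry.is_file() and "python" in entry.name.lower():
                found.append(entry)
    return found


def remove_stale_python_files(build_dir, dry_run=True):
    """Delete the files found above; with dry_run=True only report them."""
    files = stale_python_files(build_dir)
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Running `remove_stale_python_files("build")` first shows what would be removed, and `remove_stale_python_files("build", dry_run=False)` performs the deletion.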