python / cpython

The Python programming language
https://www.python.org

Link-time-optimization with clang is broken #96761

Closed matthiasgoergens closed 2 years ago

matthiasgoergens commented 2 years ago

I tried to enable link time optimizations for clang.

export CC="clang"
export LD="clang"

nice "${here}/configure" \
    --enable-optimizations \
    --with-lto=thin

nice make -j8

Eventually building reaches this step:

clang -fprofile-instr-generate -o _bootstrap_python Modules/getbuildinfo.o Parser/token.o Parser/pegen.o Parser/pegen_errors.o Parser/action_helpers.o Parser/parser.o Parser/string_parser.o Parser/peg_api.o Parser/myreadline.o Parser/tokenizer.o Objects/abstract.o Objects/boolobject.o Objects/bytes_methods.o Objects/bytearrayobject.o Objects/bytesobject.o Objects/call.o Objects/capsule.o Objects/cellobject.o Objects/classobject.o Objects/codeobject.o Objects/complexobject.o Objects/descrobject.o Objects/enumobject.o Objects/exceptions.o Objects/genericaliasobject.o Objects/genobject.o Objects/fileobject.o Objects/floatobject.o Objects/frameobject.o Objects/funcobject.o Objects/interpreteridobject.o Objects/iterobject.o Objects/listobject.o Objects/longobject.o Objects/dictobject.o Objects/odictobject.o Objects/memoryobject.o Objects/methodobject.o Objects/moduleobject.o Objects/namespaceobject.o Objects/object.o Objects/obmalloc.o Objects/picklebufobject.o Objects/rangeobject.o Objects/setobject.o Objects/sliceobject.o Objects/structseq.o Objects/tupleobject.o Objects/typeobject.o Objects/unicodeobject.o Objects/unicodectype.o Objects/unionobject.o Objects/weakrefobject.o Objects/perf_trampoline.o Objects/asm_trampoline.o Python/_warnings.o Python/Python-ast.o Python/Python-tokenize.o Python/asdl.o Python/ast.o Python/ast_opt.o Python/ast_unparse.o Python/bltinmodule.o Python/ceval.o Python/codecs.o Python/compile.o Python/context.o Python/dynamic_annotations.o Python/errors.o Python/frame.o Python/frozenmain.o Python/future.o Python/getargs.o Python/getcompiler.o Python/getcopyright.o Python/getplatform.o Python/getversion.o Python/ceval_gil.o Python/hamt.o Python/hashtable.o Python/import.o Python/importdl.o Python/initconfig.o Python/marshal.o Python/modsupport.o Python/mysnprintf.o Python/mystrtoul.o Python/pathconfig.o Python/preconfig.o Python/pyarena.o Python/pyctype.o Python/pyfpe.o Python/pyhash.o Python/pylifecycle.o Python/pymath.o Python/pystate.o 
Python/pythonrun.o Python/pytime.o Python/bootstrap_hash.o Python/specialize.o Python/structmember.o Python/symtable.o Python/sysmodule.o Python/thread.o Python/traceback.o Python/getopt.o Python/pystrcmp.o Python/pystrtod.o Python/pystrhex.o Python/dtoa.o Python/formatter_unicode.o Python/fileutils.o Python/suggestions.o Python/dynload_shlib.o Modules/config.o Modules/main.o Modules/gcmodule.o Modules/atexitmodule.o Modules/faulthandler.o Modules/posixmodule.o Modules/signalmodule.o Modules/_tracemalloc.o Modules/_codecsmodule.o Modules/_collectionsmodule.o Modules/errnomodule.o Modules/_io/_iomodule.o Modules/_io/iobase.o Modules/_io/fileio.o Modules/_io/bytesio.o Modules/_io/bufferedio.o Modules/_io/textio.o Modules/_io/stringio.o Modules/itertoolsmodule.o Modules/_sre/sre.o Modules/_threadmodule.o Modules/timemodule.o Modules/_weakref.o Modules/_abc.o Modules/_functoolsmodule.o Modules/_localemodule.o Modules/_operator.o Modules/_stat.o Modules/symtablemodule.o Modules/pwdmodule.o Programs/_bootstrap_python.o Modules/getpath.o -ldl -lm

And fails:

Modules/getbuildinfo.o: file not recognized: file format not recognized
clang-14: error: linker command failed with exit code 1 (use -v to see invocation)

Your environment

I'm on Archlinux x86-64.

$ clang --version
clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

I tested against main.

corona10 commented 2 years ago

The -no-lto flag was first introduced in https://github.com/python/cpython/pull/29859 by @tiran. Would you like to take a look at this issue, please?

matthiasgoergens commented 2 years ago

@corona10 Thanks! That seems to be mostly about speeding up the GCC build. LLVM is already pretty fast with thin-lto.

corona10 commented 2 years ago

I will try to reproduce the issue in my Linux environment with clang 14 before reviewing the PR. cc @tiran

corona10 commented 2 years ago

Okay, I am able to reproduce the issue, even with lld.

matthiasgoergens commented 2 years ago

Thanks for confirming that it's not just some idiosyncratic weirdness of my environment.

corona10 commented 2 years ago

For @matthiasgoergens, @tiran @vstinner cc @pablogsal as release manager.

My conclusion

Reference

file format

corona10@python-dev:~/cpython$ file  Modules/getbuildinfo.o
Modules/getbuildinfo.o: LLVM IR bitcode
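The `file` output above can be approximated without `file` by looking at the first four bytes of each object: raw LLVM bitcode starts with `BC` followed by 0xC0 0xDE, while a native ELF object starts with 0x7F `ELF`. The following is a minimal sketch; the helper name `classify_object` is hypothetical and not part of the CPython build system.

```shell
# classify_object FILE (hypothetical helper): print "bitcode", "elf", or
# "unknown" depending on the first four bytes of FILE.
classify_object() {
    # Hex-dump the first four bytes, with offsets and whitespace removed.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        4243c0de) echo "bitcode" ;;  # 'B' 'C' 0xC0 0xDE: raw LLVM bitcode
        7f454c46) echo "elf" ;;      # 0x7F 'E' 'L' 'F': native ELF object
        *)        echo "unknown" ;;
    esac
}

# Example use in a build tree (not executed here):
#   for o in Modules/*.o; do
#       printf '%s: %s\n' "$o" "$(classify_object "$o")"
#   done
```

Any object reported as "bitcode" can only be consumed by a link step that is LTO-aware, which is what the failing `_bootstrap_python` link lacks.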

clang full LTO process

image

clang ThinLTO process

image

pablogsal commented 2 years ago

Why is this only affecting 3.11 and not previous versions?

corona10 commented 2 years ago

Why is this only affecting 3.11 and not previous versions?

because https://github.com/python/cpython/pull/29859 was first introduced in Python 3.11 for _bootstrap_python.

corona10 commented 2 years ago

When building the final CPython binary, we pass the -flto flags to the link step, so the process transforms the LLVM bitcode into native object code. This issue therefore only affects the build of _bootstrap_python. I didn't check whether the issue affects old versions (< 3.10).

corona10 commented 2 years ago

Manual transformation example:

corona10@python-dev:~/cpython$ llc-14 -filetype=obj Modules/getbuildinfo.o -o Modules/getbuildinfo.o
corona10@python-dev:~/cpython$ file Modules/getbuildinfo.o
Modules/getbuildinfo.o: ELF 64-bit LSB relocatable, x86-64, version 1 (GNU/Linux), with debug_info, not stripped

pablogsal commented 2 years ago

Thanks for the quick answer and great analysis, @corona10. I think the fix for this should go into 3.11.1, as it likely involves configure changes, unless someone fundamentally disagrees.

matthiasgoergens commented 2 years ago

As far as I can tell, building _bootstrap_python with different flags than the full Python binary was only a hack to speed up the build, wasn't it?

Thin LTO is pretty fast, and compatible with the other LTO settings (apart perhaps from no-LTO), so we can build the bootstrap with that one, if any LTO is requested for the main build?

corona10 commented 2 years ago

Thin LTO is pretty fast, and compatible with the other LTO settings (apart perhaps from no-LTO), so we can build the bootstrap with that one, if any LTO is requested for the main build?

Please follow my last review: https://github.com/python/cpython/pull/96762#discussion_r973207014 Since we should support all C11 compilers, including old versions of clang that don't support ThinLTO, the configuration should use ThinLTO only if the compiler supports it. We should not break the build for old compilers either. See: https://clang.llvm.org/c_status.html and https://peps.python.org/pep-0007/#c-dialect
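The "use ThinLTO only if the compiler supports it" rule can be sketched as a probe that tries to build a trivial program with the flag and falls back to plain `-flto` when that fails. This is an illustration only, not the change made in the linked PR; the helper names `cc_supports_flag` and `pick_lto_flag` are hypothetical, and a real configure check would typically go through autoconf macros instead.

```shell
# cc_supports_flag CC FLAG (hypothetical helper): succeed if CC can build a
# trivial C program when FLAG is passed on the command line.
cc_supports_flag() {
    cc=$1
    flag=$2
    dir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$dir/conftest.c"
    "$cc" "$flag" -o "$dir/conftest" "$dir/conftest.c" >/dev/null 2>&1
    status=$?
    rm -rf "$dir"
    return "$status"
}

# pick_lto_flag CC (hypothetical helper): prefer ThinLTO, fall back to full
# LTO for compilers that reject -flto=thin.
pick_lto_flag() {
    if cc_supports_flag "$1" -flto=thin; then
        echo "-flto=thin"
    else
        echo "-flto"
    fi
}

# Example use (not executed here):
#   LTO_FLAG=$(pick_lto_flag "${CC:-clang}")
```

The probe relies on the compiler exiting with a non-zero status for an unsupported flag; a compiler that only warns would need an additional option such as `-Werror` for the check to be reliable.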

matthiasgoergens commented 2 years ago

@corona10 Yes, definitely. I need to add checks.

corona10 commented 2 years ago

All patches are merged. Feel free to reopen if we need other solutions. cc @tiran @pablogsal