Closed debohman closed 1 year ago
% git describe
v3.12.0b1-1940-g77e9aae3837
Building Python 3.12.0 on the same system with the same toolchain results in the following:
0:00:24 load avg: 3.35 [16/44] test_embed
0:00:32 load avg: 3.17 [17/44] test_float -- test_embed failed (env changed)
I assume that this is not a fatal error?
We are running into this bug on the CPython benchmarking infrastructure, too. git bisect says the first bad commit is 6ab6040054e5ca2d3eb7833dc8bf4eb0bbaa0aac.
Does test_decimal fail if you run it alone? How did you build Python?
Here is the result of running test_embed by hand:
And test_decimal:
% git describe
v3.12.0b1-2021-gea7b53ff677
To be clear: this is top of tree main.
To answer the question about how I am building Python:
CC=clang CXX=clang++ CPPFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib ./configure --enable-optimizations --with-lto=full
It is not clear why _PyExc_ValueError is not found when loading _opcode.cpython-313-darwin.so. The symbol is present in python.exe:
% nm -g build/lib.macosx-10.12-x86_64-3.13/_opcode.cpython-313-darwin.so | grep _PyExc_ValueError
U _PyExc_ValueError
nm -g python.exe | grep _PyExc_ValueError
0000000100581080 D _PyExc_ValueError
Something must have changed in the runtime module loading in main.
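The nm comparison above can be scripted when several modules need checking. This is only a minimal sketch, assuming the plain `nm -g` text layout quoted above ("U name" for undefined symbols, "address TYPE name" for defined ones); `parse_nm` and `unresolved` are hypothetical helper names, not part of CPython.

```python
def parse_nm(output):
    """Split `nm -g` text into (defined, undefined) sets of symbol names."""
    defined, undefined = set(), set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "U":
            # Undefined entries have no address column: "U _name"
            undefined.add(fields[1])
        elif len(fields) == 3:
            # Defined entries look like "address TYPE _name"
            defined.add(fields[2])
    return defined, undefined


def unresolved(module_nm, host_nm):
    """Symbols the module needs that the host binary does not define."""
    _, needed = parse_nm(module_nm)
    provided, _ = parse_nm(host_nm)
    return needed - provided


# Sample lines copied from the nm output quoted above.
module_nm = "U _PyExc_ValueError\n"
host_nm = "0000000100581080 D _PyExc_ValueError\n"
print(unresolved(module_nm, host_nm))
```

An empty result for python.exe only shows that this particular binary exports the symbol. The same check would have to be run against whichever executable actually performs the dlopen.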
SyntaxError: (unicode error) \N escapes not supported (can't load unicodedata module)
It seems that Python cannot locate the unicodedata shared library. Can you import it if you run Python manually? Was it built?
Can the issue be reproduced on other platforms than macOS?
SyntaxError: (unicode error) \N escapes not supported (can't load unicodedata module)
It seems that Python cannot locate the unicodedata shared library. Can you import it if you run Python manually? Was it built?
% python.exe
Python 3.13.0a0 (main, Oct 9 2023, 18:27:16) [Clang 17.0.1] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/local/lib/python313.zip', '/tera/tera/debo/Projects/Python/main/Lib', '/tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13']
>>> import unicodedata
>>> unicodedata.
unicodedata.UCD() unicodedata.digit( unicodedata.normalize(
unicodedata.bidirectional( unicodedata.east_asian_width( unicodedata.numeric(
unicodedata.category( unicodedata.is_normalized( unicodedata.ucd_3_2_0
unicodedata.combining( unicodedata.lookup( unicodedata.unidata_version
unicodedata.decimal( unicodedata.mirrored(
unicodedata.decomposition( unicodedata.name(
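For a non-interactive version of the same check, the import system can be asked where it would load the module from. A minimal sketch using only the standard library; `locate_module` is a hypothetical helper name.

```python
import importlib.util


def locate_module(name):
    """Return where the import system would load `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is a file path for extension modules, or "built-in"/"frozen"
    return spec.origin


print(locate_module("unicodedata"))
```

A path under the build/lib.* directory would suggest the extension was built and is on sys.path. It does not prove that loading it succeeds, since dlopen errors only show up on the actual import.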
Can the issue be reproduced on other platforms than macOS?
Perhaps @mdboom can comment about what platform he is using in his comment above.
I see now that test_embed uses Programs/_testembed to execute. I don't know how embedding works, but clearly something has broken in 3.13. It works properly in 3.12.0 built with the same toolchain.
Under 3.13:
% Programs/_testembed test_init_main
Run Python code before _Py_InitializeMain
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: dlopen(/tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_testinternalcapi.cpython-313-darwin.so, 2): Symbol not found: _PyBaseObject_Type
Referenced from: /tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_testinternalcapi.cpython-313-darwin.so
Expected in: flat namespace
in /tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_testinternalcapi.cpython-313-darwin.so
Under 3.12.0:
% Programs/_testembed test_init_main
Run Python code before _Py_InitializeMain
{"global_config": {"Py_FileSystemDefaultEncoding": "utf-8", "Py_HasFileSystemDefaultEncoding": 0, "Py_FileSystemDefaultEncodeErrors": "surrogateescape", "_Py_HasFileSystemDefaultEncodeErrors": 0, "Py_UTF8Mode": 0, "Py_DebugFlag": 0, "Py_VerboseFlag": 0, "Py_QuietFlag": 0, "Py_InteractiveFlag": 0, "Py_InspectFlag": 0, "Py_OptimizeFlag": 0, "Py_NoSiteFlag": 0, "Py_BytesWarningFlag": 0, "Py_FrozenFlag": 0, "Py_IgnoreEnvironmentFlag": 0, "Py_DontWriteBytecodeFlag": 0, "Py_NoUserSiteDirectory": 0, "Py_UnbufferedStdioFlag": 0, "Py_HashRandomizationFlag": 1, "Py_IsolatedFlag": 0}, "pre_config": {"_config_init": 2, "parse_argv": 1, "isolated": 0, "use_environment": 1, "configure_locale": 1, "coerce_c_locale": 0, "coerce_c_locale_warn": 0, "utf8_mode": 0, "dev_mode": 0, "allocator": 0}, "config": {"_config_init": 2, "isolated": 0, "use_environment": 1, "dev_mode": 0, "install_signal_handlers": 1, "use_hash_seed": 0, "hash_seed": 0, "faulthandler": 0, "tracemalloc": 0, "perf_profiling": 0, "import_time": 0, "code_debug_ranges": 1, "show_ref_count": 0, "dump_refs": 0, "malloc_stats": 0, "filesystem_encoding": "utf-8", "filesystem_errors": "surrogateescape", "pycache_prefix": null, "program_name": "./python3", "parse_argv": 2, "argv": ["-c", "arg2"], "xoptions": [], "warnoptions": [], "pythonpath_env": null, "home": null, "module_search_paths_set": 1, "module_search_paths": ["/usr/local/lib/python312.zip", "/tera/tera/debo/Projects/Python/v3.12.0/Lib", "/tera/tera/debo/Projects/Python/v3.12.0/build/lib.macosx-10.12-x86_64-3.12"], "stdlib_dir": "/tera/tera/debo/Projects/Python/v3.12.0/Lib", "executable": "/tera/tera/debo/Projects/Python/v3.12.0/python3", "base_executable": "/tera/tera/debo/Projects/Python/v3.12.0/python3", "prefix": "/usr/local", "base_prefix": "/usr/local", "exec_prefix": "/usr/local", "base_exec_prefix": "/usr/local", "platlibdir": "lib", "site_import": 1, "bytes_warning": 0, "warn_default_encoding": 0, "inspect": 0, "interactive": 0, "optimization_level": 0, 
"parser_debug": 0, "write_bytecode": 1, "verbose": 0, "quiet": 0, "user_site_directory": 1, "configure_c_stdio": 1, "buffered_stdio": 1, "stdio_encoding": "utf-8", "stdio_errors": "strict", "skip_source_first_line": 0, "run_command": "import _testinternalcapi, json; print(json.dumps(_testinternalcapi.get_configs()))\n", "run_module": null, "run_filename": null, "_install_importlib": 1, "check_hash_pycs_mode": "default", "pathconfig_warnings": 1, "_init_main": 0, "orig_argv": ["python3", "-c", "import _testinternalcapi, json; print(json.dumps(_testinternalcapi.get_configs()))", "arg2"], "use_frozen_modules": 1, "safe_path": 0, "_is_python_build": 1, "int_max_str_digits": 4300}}
I don't understand this test_embed error either:
ImportError: dlopen(/tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_testinternalcapi.cpython-313-darwin.so, 2):
Symbol not found: _PyBaseObject_Type
In Include/object.h, the PyBaseObject_Type variable is exported by:
PyAPI_DATA(PyTypeObject) PyBaseObject_Type; /* built-in 'object' */
Log:
% python.exe Lib/test/test_embed.py
.....................--- ['/tera/tera/debo/Projects/Python/main/Programs/_testembed', 'test_repeated_init_exec', 'import dis\nimport importlib._bootstrap\nimport opcode\nimport test.test_dis\n\ndef is_specialized(f):\n for instruction in dis.get_instructions(f, adaptive=True):\n opname = instruction.opname\n if (\n opname in opcode._specialized_opmap\n # Exclude superinstructions:\n and "__" not in opname\n ):\n return True\n return False\n\nfunc = importlib._bootstrap._handle_fromlist\n\n# "copy" the code to un-specialize it:\nfunc.__code__ = func.__code__.replace()\n\nassert not is_specialized(func), "specialized instructions found"\n\nfor i in range(test.test_dis.ADAPTIVE_WARMUP_DELAY):\n func(importlib._bootstrap, ["x"], lambda *args: None)\n\nassert is_specialized(func), "no specialized instructions found"\n\nprint("Tests passed")\n'] failed ---
stdout:
stderr:
--- Loop #1 ---
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tera/tera/debo/Projects/Python/main/Lib/dis.py", line 8, in <module>
from opcode import *
File "/tera/tera/debo/Projects/Python/main/Lib/opcode.py", line 12, in <module>
import _opcode
ImportError: dlopen(/tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_opcode.cpython-313-darwin.so, 2): Symbol not found: _PyExc_ValueError
Referenced from: /tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_opcode.cpython-313-darwin.so
Expected in: flat namespace
in /tera/tera/debo/Projects/Python/main/build/lib.macosx-10.12-x86_64-3.13/_opcode.cpython-313-darwin.so
Python/dynload_shlib.c adds a lead underscore:
#if (defined(__OpenBSD__) || defined(__NetBSD__)) && !defined(__ELF__)
#define LEAD_UNDERSCORE "_"
#else
#define LEAD_UNDERSCORE ""
#endif
...
PyOS_snprintf(funcname, sizeof(funcname),
LEAD_UNDERSCORE "%.20s_%.200s", prefix, shortname);
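To make the effect of that format string concrete, here is a rough Python model of how the init function name is put together. It is a sketch only: the default prefix "PyInit" is an assumption that does not appear in the excerpt above, and `init_symbol_name` is a hypothetical name.

```python
def init_symbol_name(shortname, prefix="PyInit", lead_underscore=""):
    """Mimic the "%.20s_%.200s" format used for the init function name."""
    # %.20s and %.200s truncate their arguments, as the C format does.
    return lead_underscore + "%.20s_%.200s" % (prefix, shortname)


# macOS does not match the OpenBSD/NetBSD condition, so no lead underscore.
print(init_symbol_name("_opcode"))
# Non-ELF OpenBSD/NetBSD case from the #if above.
print(init_symbol_name("_opcode", lead_underscore="_"))
```

Note that LEAD_UNDERSCORE only affects the name passed when looking up the module's init function, and the #if leaves it empty on macOS. It does not touch data symbols such as PyExc_ValueError.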
cc @sobolevn @corona10: Can you reproduce these issues?
I think it is a problem with embedding on my platform in 3.13, which I assume enables executing Python code from inside a C program. As I have demonstrated, it works fine in 3.12.0 with the exact same toolchain.
I suspect that the problem occurs when Python code being executed in an embedded environment tries to import a runtime loaded module. That is why much of test_embed fails.
It is not clear why _PyExc_ValueError is not found when loading _opcode.cpython-313-darwin.so. The symbol is present in python.exe
In the Python API, the variable is called PyExc_ValueError without the leading underscore.
CC=clang CXX=clang++ CPPFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib ./configure --enable-optimizations --with-lto=full
Do you reproduce the issue with a more classic command?
CC=clang CXX=clang++ ./configure
then:
make && ./python.exe -m test -v test_embed
It is not clear why _PyExc_ValueError is not found when loading _opcode.cpython-313-darwin.so. The symbol is present in python.exe
In the Python API, the variable is called PyExc_ValueError without the leading underscore.
That is because macOS uses the traditional C symbol naming, which places an underscore before C symbols.
Aha, I can reproduce some issues on Linux: 8 tests failed with "env changed".
$ CC=clang ./configure --enable-optimizations
$ make
(...)
touch profile-gen-stamp
make[1]: Leaving directory '/home/vstinner/python/main'
# Next, run the profile task to generate the profile information.
LLVM_PROFILE_FILE="code-%p.profclangr" ./python -m test --pgo --timeout=
Using random seed 297553162
0:00:00 load avg: 6.89 Run 44 tests sequentially
0:00:00 load avg: 6.89 [ 1/44] test_array
0:00:00 load avg: 6.89 [ 2/44] test_base64
0:00:00 load avg: 6.89 [ 3/44] test_binascii -- test_base64 failed (env changed)
0:00:00 load avg: 6.89 [ 4/44] test_binop
0:00:00 load avg: 6.89 [ 5/44] test_bisect
0:00:00 load avg: 6.89 [ 6/44] test_bytes
0:00:03 load avg: 6.89 [ 7/44] test_bz2 -- test_bytes failed (env changed)
0:00:04 load avg: 6.89 [ 8/44] test_cmath
0:00:04 load avg: 6.89 [ 9/44] test_codecs
0:00:05 load avg: 6.42 [10/44] test_collections
0:00:05 load avg: 6.42 [11/44] test_complex
0:00:05 load avg: 6.42 [12/44] test_dataclasses
0:00:06 load avg: 6.42 [13/44] test_datetime
0:00:09 load avg: 5.91 [14/44] test_decimal
0:00:12 load avg: 5.91 [15/44] test_difflib
0:00:13 load avg: 5.91 [16/44] test_embed
0:00:17 load avg: 5.51 [17/44] test_float -- test_embed failed (env changed)
0:00:17 load avg: 5.51 [18/44] test_fstring
0:00:18 load avg: 5.51 [19/44] test_functools
0:00:18 load avg: 5.51 [20/44] test_generators
0:00:18 load avg: 5.51 [21/44] test_hashlib
0:00:19 load avg: 5.51 [22/44] test_heapq
0:00:19 load avg: 5.15 [23/44] test_int
0:00:20 load avg: 5.15 [24/44] test_itertools
0:00:22 load avg: 5.15 [25/44] test_json
0:00:23 load avg: 5.15 [26/44] test_long -- test_json failed (env changed)
0:00:25 load avg: 4.82 [27/44] test_lzma
0:00:25 load avg: 4.82 [28/44] test_math
0:00:26 load avg: 4.82 [29/44] test_memoryview
0:00:27 load avg: 4.82 [30/44] test_operator
0:00:27 load avg: 4.82 [31/44] test_ordered_dict
0:00:28 load avg: 4.82 [32/44] test_patma
0:00:28 load avg: 4.82 [33/44] test_pickle
0:00:31 load avg: 4.51 [34/44] test_pprint
0:00:31 load avg: 4.51 [35/44] test_re
0:00:32 load avg: 4.51 [36/44] test_set
0:00:35 load avg: 4.23 [37/44] test_sqlite3
0:00:35 load avg: 4.23 [38/44] test_statistics -- test_sqlite3 failed (env changed)
0:00:37 load avg: 4.23 [39/44] test_str
0:00:38 load avg: 4.23 [40/44] test_struct -- test_str failed (env changed)
0:00:39 load avg: 4.23 [41/44] test_tabnanny -- test_struct failed (env changed)
0:00:39 load avg: 3.97 [42/44] test_time -- test_tabnanny failed (env changed)
0:00:41 load avg: 3.97 [43/44] test_xml_etree
0:00:42 load avg: 3.97 [44/44] test_xml_etree_c
Total duration: 43.1 sec
Total tests: run=8,990 skipped=187
Total test files: run=44/44 env_changed=8
Result: SUCCESS
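Since the run still ends with "Result: SUCCESS", the env-changed entries are easy to miss in a long log. A small sketch for pulling them out, assuming the progress-line format shown above; `env_changed_tests` is a hypothetical helper and the sample lines are copied from the log.

```python
import re

# Matches the trailing note regrtest appends to the *next* progress line.
ENV_CHANGED = re.compile(r"-- (\w+) failed \(env changed\)")


def env_changed_tests(log):
    """Return test names reported as 'failed (env changed)', in log order."""
    return ENV_CHANGED.findall(log)


sample = """\
0:00:00 load avg: 6.89 [ 3/44] test_binascii -- test_base64 failed (env changed)
0:00:00 load avg: 6.89 [ 4/44] test_binop
0:00:03 load avg: 6.89 [ 7/44] test_bz2 -- test_bytes failed (env changed)
"""
print(env_changed_tests(sample))
```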
Example:
vstinner@mona$ LLVM_PROFILE_FILE=code-%p.profclangr ./python -m test --fail-env-changed test_bytes -m test_check_encoding_errors -v
== CPython 3.13.0a0 (heads/main:732532b0af, Oct 11 2023, 00:54:44) [Clang 16.0.6 (Fedora 16.0.6-3.fc38)]
== Linux-6.5.5-200.fc38.x86_64-x86_64-with-glibc2.37 little-endian
== Python build: release
== cwd: /home/vstinner/python/main/build/test_python_worker_31170æ
== CPU count: 12
== encodings: locale=UTF-8 FS=utf-8
== resources: all test resources are disabled, use -u option to unskip tests
Using random seed 4098483406
0:00:00 load avg: 1.57 Run 1 test sequentially
0:00:00 load avg: 1.57 [1/1] test_bytes
test_check_encoding_errors (test.test_bytes.ByteArrayTest.test_check_encoding_errors) ... ok
test_check_encoding_errors (test.test_bytes.BytesTest.test_check_encoding_errors) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.053s
OK
Warning -- files was modified by test_bytes
Warning -- Before: []
Warning -- After: ['code-31172.profclangr', 'code-31173.profclangr', 'code-31174.profclangr']
test_bytes failed (env changed)
== Tests result: ENV CHANGED ==
1 test altered the execution environment (env changed):
test_bytes
Total duration: 97 ms
Total tests: run=2 (filtered)
Total test files: run=1/1 (filtered) env_changed=1
Result: ENV CHANGED
Each subprocess spawned by tests creates a code-<pid>.profclangr file.
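The "Before: [] / After: [...]" warning amounts to comparing a directory listing taken before the test with one taken after it. The following is a simplified illustration of that idea, not regrtest's actual implementation; `new_files` is a hypothetical helper, and the file written here merely stands in for the profile data a test subprocess would leave behind.

```python
import os
import tempfile


def new_files(before, after):
    """Files present in `after` but not in `before`, sorted by name."""
    return sorted(set(after) - set(before))


with tempfile.TemporaryDirectory() as workdir:
    before = os.listdir(workdir)
    # Stand-in for a subprocess writing its profile data; the
    # "code-<pid>.profclangr" name follows the LLVM_PROFILE_FILE pattern above.
    leftover = os.path.join(workdir, "code-%d.profclangr" % os.getpid())
    open(leftover, "w").close()
    after = os.listdir(workdir)
    created = new_files(before, after)

print(created)
```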
Okay, here is what I have determined so far: Programs/_testembed has no symbols in 3.13! That is why the runtime loading is failing. I am trying to figure out what is different in the build environment for Programs/_testembed between 3.12 and 3.13.
Tests spawning subprocesses create many code-<pid>.profclangr files in the current directory. It's unclear to me whether they should be used to train the PGO build.
PR #110654 avoids the warning, but the .profclangr files created by subprocesses are still removed, since regrtest starts by creating a temporary directory and changing the current working directory to it.
If you want to bisect the test_embed regression, you can use the ./configure PROFILE_TASK="-m test --pgo test_embed" command to only run test_embed. Oh. But before, failures of the profile task were ignored by || true:
$(LLVM_PROF_FILE) $(RUNSHARED) ./$(BUILDPYTHON) $(PROFILE_TASK) || true
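The effect of the trailing || true can be seen in isolation. A minimal sketch, assuming a POSIX sh is available; the "(exit 3)" subshell is only a stand-in for a failing profile task.

```python
import subprocess

# A stand-in for a failing profile task: the subshell exits with status 3.
failing = "(exit 3)"

plain = subprocess.run(["sh", "-c", failing])
masked = subprocess.run(["sh", "-c", failing + " || true"])

# The first status is the real failure; the second is what make would see.
print(plain.returncode, masked.returncode)
```

With the failure masked like this, make has no way to stop on it, so the build carries on as if the profile task had succeeded.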
I found the problem. The offending commit is 3e3a7da590e1c3e5f03802e538f26c5204889c82. After reverting that and rebuilding from scratch:
Can you reproduce the issue without PGO, just with LTO? In short, does the following command reproduce your issue?
CC=clang ./configure --with-lto=full && make && ./python -m test test_embed -v
On Fedora 38 with clang 16 (clang version 16.0.6 (Fedora 16.0.6-3.fc38)), I get:
$ CC=clang LD=clang ./configure --with-lto=full
$ grep NOLTO= Makefile
CONFIGURE_LDFLAGS_NOLTO=-flto=thin
PY_LDFLAGS_NOLTO=$(PY_LDFLAGS) $(CONFIGURE_LDFLAGS_NOLTO) $(LDFLAGS_NODIST)
$ make
$ ./Programs/_testembed test_audit; echo $?
0
$ ./python -m test test_embed
(...)
Total duration: 2.9 sec
Total tests: run=69 skipped=4
Total test files: run=1/1
Result: SUCCESS
The NOLTO value comes from configure.ac:
if test "$Py_LTO" = 'true' ; then
case $CC in
*clang*)
LDFLAGS_NOLTO="-fno-lto"
dnl Clang linker requires -flto in order to link objects with LTO information.
dnl Thin LTO is faster and works for object files with full LTO information, too.
AX_CHECK_COMPILE_FLAG([-flto=thin],[LDFLAGS_NOLTO="-flto=thin"],[LDFLAGS_NOLTO="-flto"])
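Read as plain logic, the clang branch of that excerpt picks the flag as follows. This sketch models only the lines quoted above; `ldflags_nolto` and its `supports_thin_lto` parameter are hypothetical names standing in for the AX_CHECK_COMPILE_FLAG probe, and branches for other compilers are not shown in the excerpt.

```python
def ldflags_nolto(cc, supports_thin_lto):
    """Model the clang branch of the configure.ac excerpt above."""
    if "clang" not in cc:
        # Other compilers are handled by branches not shown in the excerpt.
        return None
    # The initial "-fno-lto" is overwritten either way: -flto=thin when the
    # compiler accepts that flag, plain -flto otherwise.
    return "-flto=thin" if supports_thin_lto else "-flto"


print(ldflags_nolto("clang", True))
```

This matches the Makefile output quoted above, where a --with-lto=full build still ends up with CONFIGURE_LDFLAGS_NOLTO=-flto=thin.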
@mdboom:
We are running into this bug on the CPython benchmarking infrastructure, too. git bisect says the first bad commit is https://github.com/python/cpython/commit/6ab6040054e5ca2d3eb7833dc8bf4eb0bbaa0aac.
What is your macOS version? What is your LLVM clang version?
I found the problem. The offending commit is https://github.com/python/cpython/commit/3e3a7da590e1c3e5f03802e538f26c5204889c82.
Great bisection, that's very helpful, thanks a lot!
Can you reproduce the issue without PGO, just with LTO? In short, does the following command reproduce your issue?
I was testing with LTO and without PGO, now I am rebuilding with LTO and PGO.
I was testing with LTO and without PGO
Ok, so the issue only comes from LTO=full.
It seems to be the embedding that is somehow tripped up by this.
I don't think we have established that LLVM 17 is the problem. To do that we would need to try building with LLVM 16.
Building with PGO and LTO is fine now.
Test on Fedora Rawhide with LLVM clang 17:
$ CC=clang LD=clang ./configure --with-lto=full
$ make -j10
$ ./Programs/_testembed test_audit; echo $?
0
$ ./python -m test test_embed
(...)
Total duration: 4.7 sec
Total tests: run=69 skipped=4
Total test files: run=1/1
Result: SUCCESS
It works as expected. By the way, PGO+LTO build is tested on Fedora Rawhide on multiple architectures:
Building with PGO and LTO is fine now.
Would you mind elaborating on "now"? Before you wrote that it didn't work. What changed? Are you still testing the latest main branch?
You're doing tests on macOS 10.12.6 with Clang 17.0.1, right?
I don't think we have established that LLVM 17 is the problem.
I added "on macOS" to the issue title :-)
Building with PGO and LTO is fine now.
Would you mind elaborating on "now"? Before you wrote that it didn't work. What changed? Are you still testing the latest main branch?
You're doing tests on macOS 10.12.6 with Clang 17.0.1, right?
Sorry, with https://github.com/python/cpython/commit/3e3a7da590e1c3e5f03802e538f26c5204889c82 reverted.
To do that we would need to try building with LLVM 16.
If you can run your test with LLVM 16 on macOS, that would be nice yes, since I fail to reproduce the issue on Linux with LLVM 17.
To do that we would need to try building with LLVM 16.
If you can run your test with LLVM 16 on macOS, that would be nice yes, since I fail to reproduce the issue on Linux with LLVM 17.
Yes, I will try that, but it will have to be later. It is past dinnertime now.
The issue exists when building with LLVM 16, and backing out commit 3e3a7da resolves it.
@corona10: Should we just give up on the "NOLTO" idea of building some object files with LTO (full) and then attempting to use a different LTO mode (thin) to link them? That would mean not only reverting https://github.com/python/cpython/commit/3e3a7da590e1c3e5f03802e538f26c5204889c82 but also removing the NOLTO variables.
This is on macOS 10.12.6, building with LLVM / Clang 17.0.1. The system otherwise has Python 3.11.6 installed.
@debohman Are you using a custom compiler toolchain rather than using the Apple basic toolchain?
Not only reverting https://github.com/python/cpython/commit/3e3a7da590e1c3e5f03802e538f26c5204889c82 but also removing the NOLTO variables.
I am okay with reverting the PR; IIUC it was just intended to speed up the build, but I need to know what effect dropping the "NOLTO" idea will have.
If we drop the "NOLTO" idea, https://github.com/python/cpython/issues/96761 will occur again. Please let me know if I misunderstand.
Or do you intend to revert https://github.com/python/cpython/pull/29859 too?
This is on macOS 10.12.6, building with LLVM / Clang 17.0.1. The system otherwise has Python 3.11.6 installed.
@debohman Are you using a custom compiler toolchain rather than using the Apple basic toolchain?
The stock linker is being used, the rest of the toolchain is the newer llvm / clang.
What is your macOS version? What is your LLVM clang version?
It's macOS 13.6
% clang --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
% uname -a
Darwin CPythons-Mac-mini.local 22.6.0 Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103 arm64
I'm building CPython as follows:
./configure --with-openssl=$(brew --prefix openssl) --enable-optimizations --with-lto
make -j
Bug report
Bug description:
This is on macOS 10.12.6, building with LLVM / Clang 17.0.1. The system otherwise has Python 3.11.6 installed.
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS