python / cpython

The Python programming language
https://www.python.org

test_embed fails if Python is built with LTO and LLVM clang on macOS #110313

Closed debohman closed 1 year ago

debohman commented 1 year ago

Bug report

Bug description:

# Next, run the profile task to generate the profile information.
LLVM_PROFILE_FILE="code-%p.profclangr"  ./python.exe -m test --pgo --timeout=
Raised RLIMIT_NOFILE: 256 -> 1024
0:00:00 load avg: 3.49 Run 44 tests sequentially
0:00:00 load avg: 3.49 [ 1/44] test_array
0:00:01 load avg: 3.49 [ 2/44] test_base64
0:00:01 load avg: 3.49 [ 3/44] test_binascii -- test_base64 failed (env changed)
0:00:01 load avg: 3.49 [ 4/44] test_binop
0:00:01 load avg: 3.49 [ 5/44] test_bisect
0:00:01 load avg: 3.49 [ 6/44] test_bytes
0:00:05 load avg: 3.37 [ 7/44] test_bz2 -- test_bytes failed (env changed)
0:00:07 load avg: 3.37 [ 8/44] test_cmath
0:00:07 load avg: 3.34 [ 9/44] test_codecs
0:00:09 load avg: 3.34 [10/44] test_collections
0:00:10 load avg: 3.34 [11/44] test_complex
0:00:10 load avg: 3.34 [12/44] test_dataclasses
0:00:11 load avg: 3.34 [13/44] test_datetime
0:00:16 load avg: 3.39 [14/44] test_decimal
-------------------------------------------------------- NOTICE --------------------------------------------------------
test_decimal may generate "malloc can't allocate region"
warnings on macOS systems. This behavior is known. Do not
report a bug unless tests are also failing.
See https://github.com/python/cpython/issues/85100
------------------------------------------------------------------------------------------------------------------------
./Modules/_decimal/libmpdec/context.c:57: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time

python.exe(67804,0x7fffa28f43c0) malloc: *** mach_vm_map(size=842105263157895168) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
python.exe(67804,0x7fffa28f43c0) malloc: *** mach_vm_map(size=842105263157895168) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
python.exe(67804,0x7fffa28f43c0) malloc: *** mach_vm_map(size=421052631578947584) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
python.exe(67804,0x7fffa28f43c0) malloc: *** mach_vm_map(size=421052631578947584) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
0:00:22 load avg: 3.28 [15/44] test_difflib
0:00:23 load avg: 3.10 [16/44] test_embed
test test_embed failed
0:00:28 load avg: 3.17 [17/44] test_float -- test_embed failed (41 failures)
0:00:28 load avg: 3.17 [18/44] test_fstring
0:00:31 load avg: 3.17 [19/44] test_functools
./Modules/_decimal/libmpdec/context.c:57: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time

0:00:31 load avg: 3.17 [20/44] test_generators
0:00:32 load avg: 3.17 [21/44] test_hashlib
0:00:33 load avg: 3.08 [22/44] test_heapq
0:00:34 load avg: 3.08 [23/44] test_int
0:00:34 load avg: 3.08 [24/44] test_itertools
0:00:38 load avg: 2.99 [25/44] test_json
0:00:41 load avg: 2.99 [26/44] test_long -- test_json failed (env changed)
0:00:44 load avg: 2.83 [27/44] test_lzma
0:00:45 load avg: 2.83 [28/44] test_math
0:00:48 load avg: 2.84 [29/44] test_memoryview
0:00:49 load avg: 2.84 [30/44] test_operator
0:00:49 load avg: 2.84 [31/44] test_ordered_dict
0:00:51 load avg: 2.84 [32/44] test_patma
0:00:51 load avg: 2.84 [33/44] test_pickle
0:00:58 load avg: 2.63 [34/44] test_pprint
0:00:58 load avg: 2.63 [35/44] test_re
0:01:00 load avg: 2.63 [36/44] test_set -- test_re failed (env changed)
0:01:05 load avg: 2.50 [37/44] test_sqlite3
0:01:06 load avg: 2.50 [38/44] test_statistics -- test_sqlite3 failed (env changed)
0:01:08 load avg: 2.38 [39/44] test_str
0:01:11 load avg: 2.38 [40/44] test_struct -- test_str failed (env changed)
0:01:12 load avg: 2.38 [41/44] test_tabnanny -- test_struct failed (env changed)
0:01:13 load avg: 2.51 [42/44] test_time -- test_tabnanny failed (env changed)
0:01:15 load avg: 2.51 [43/44] test_xml_etree
0:01:16 load avg: 2.51 [44/44] test_xml_etree_c

Total duration: 1 min 18 sec
Total tests: run=8,985 failures=41 skipped=192
Total test files: run=44/44 failed=1 env_changed=8
Result: FAILURE
make: *** [Makefile:800: profile-run-stamp] Error 2

This is on macOS 10.12.6, building with LLVM / Clang 17.0.1. The system otherwise has Python 3.11.6 installed.
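The progress log above reports two different outcomes: "env changed" (the test passed but altered its environment) and real failures such as test_embed. As a rough sketch for separating them, the snippet below scans the progress lines for the trailing "-- <test> failed (<reason>)" note. The function name `classify_results` is made up for illustration, and the pattern is inferred only from the log shown here, not from the regrtest source.

```python
import re

# The runner appends "-- <test> failed (<reason>)" to the progress line of
# the *next* test, so the note always refers to the previous test.
# Pattern inferred from the log in this report (an assumption).
_RESULT_NOTE = re.compile(r"-- (\w+) failed \(([^)]*)\)")


def classify_results(log: str) -> dict[str, list[str]]:
    """Split reported tests into 'env_changed' and real 'failed' entries."""
    results: dict[str, list[str]] = {"env_changed": [], "failed": []}
    for test_name, reason in _RESULT_NOTE.findall(log):
        key = "env_changed" if reason == "env changed" else "failed"
        results[key].append(test_name)
    return results
```

Applied to the log above, this should put only test_embed under "failed" and the eight "env changed" tests in the other list, which agrees with the summary line "failed=1 env_changed=8". The last test in a run has no following progress line, so its result would have to be read from the summary instead.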

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS


mdboom commented 1 year ago

I can confirm that backing out https://github.com/python/cpython/commit/3e3a7da590e1c3e5f03802e538f26c5204889c82 also resolves things for me.

vstinner commented 1 year ago

I wrote PR #110720 to revert my "NoLTO" change which introduced the bug.

I don't understand exactly why or how Programs/_testembed is miscompiled on macOS with LLVM 16 or 17, but it's bad: we should not miscompile Python. My change was only meant to make the "build Python" command (running make) faster; it should not miscompile Python.

It would be nice if someone could dig into the issue, but I don't have the bandwidth for that. I cannot reproduce the issue on Linux, and I don't have access to any macOS machine.

debohman commented 1 year ago

If I may hazard a guess as to why this is failing, we seem to be using objects built with LTO and linking them to create Programs/_testembed without LTO. Perhaps if you wish to build _testembed without LTO, you need a set of object files built without any LTO. Note that everything works fine when building Python without any LTO.

In the case which fails, the generated _testembed contains no symbols, so when it tries to runtime load a module, it fails because it cannot resolve the undefined symbols in the module.
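One way to check the "no symbols" observation is to look at what `nm -g` prints for the binary. The sketch below only parses text in the usual BSD-style `nm` layout (address, type letter, name); the helper names are hypothetical, and the exact output of `nm` on a given toolchain is an assumption to verify locally.

```python
def defined_symbols(nm_output: str) -> set[str]:
    """Return names of symbols that `nm -g` lists as defined.

    Assumes BSD-style lines: "<address> <type> <name>" for defined
    symbols and "<type> <name>" (no address) for undefined ones.
    """
    names: set[str] = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined symbols have no address column; "U" marks them too.
        if len(parts) == 3 and parts[1] != "U":
            names.add(parts[2])
    return names


def exports(nm_output: str, symbol: str) -> bool:
    """True if `symbol` is defined, with or without the macOS '_' prefix."""
    names = defined_symbols(nm_output)
    return symbol in names or f"_{symbol}" in names
```

To use it, capture the output of `nm -g Programs/_testembed` (for example with `subprocess.run(..., capture_output=True, text=True)`) and test for a symbol that extension modules need, such as `PyModule_Create2`. If the guess above is right, the failing build would report it as missing.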

vstinner commented 1 year ago

I suppose that the best way to build with LTO only once instead of 3 times (freeze, testembed, libpython) is to use ./configure --enable-shared. So most object files are only linked once into libpython, and then libpython is reused in freeze and testembed.

If I recall correctly, GCC's implementation of LTO is very different, so the NOLTO hack to reduce build time when --enable-shared is not used still works there. Maybe we could just use -flto=full in NOLTO on macOS (when clang is used).
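To see which of these options an existing interpreter was built with, one can inspect the values recorded by `sysconfig`. This is a minimal sketch: `build_summary` is a hypothetical helper, and it assumes a POSIX build where `CONFIG_ARGS` and `Py_ENABLE_SHARED` are populated from the Makefile (on other platforms they may be missing).

```python
import sysconfig


def build_summary(config=None) -> dict[str, bool]:
    """Report whether the build used --enable-shared, LTO and PGO.

    `config` defaults to the running interpreter's build variables;
    passing a dict makes the function easy to try on sample data.
    """
    if config is None:
        config = sysconfig.get_config_vars()
    # CONFIG_ARGS holds the arguments given to ./configure, if recorded.
    args = config.get("CONFIG_ARGS") or ""
    return {
        "shared": bool(config.get("Py_ENABLE_SHARED")),
        "lto": "--with-lto" in args,
        "pgo": "--enable-optimizations" in args,
    }
```

Running `build_summary()` under `./python.exe` from the build tree should show whether libpython is shared, which is the configuration where objects are linked with LTO only once.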

debohman commented 1 year ago

Shall we close this issue? I have successfully built 3.13.0a1 and current main with both PGO and LTO.

debohman commented 1 year ago

I am also using the latest Clang 17.0.4.

vstinner commented 1 year ago

OK, I'm closing the issue.

Abhaygarg656 commented 1 month ago

Issues Identified

Environment Changes: Several tests failed with the message (env changed). This often means that the test environment was altered in a way that affected the tests, possibly due to external factors or previous tests impacting the state.

Memory Allocation Errors: The output shows multiple instances of malloc: *** mach_vm_map(size=...) failed (error code=3), meaning that an allocation request could not be satisfied. The requested sizes are enormous, and the NOTICE in the log states that test_decimal is known to produce these warnings on macOS, so on their own they do not indicate a memory leak or a bug.

Warnings: libmpdec warns that mpd_setminalloc is ignoring a request to set MPD_MINALLOC a second time. This is probably not critical but should be noted.

Failures in Tests: All 41 failures are in a single test file, test_embed. Other tests, such as test_base64 and test_bytes, are reported as "env changed" rather than as ordinary failures.

Troubleshooting Steps

Increase File Descriptor Limit: The initial message Raised RLIMIT_NOFILE: 256 -> 1024 shows that the test runner already raised the limit on open files from 256 to 1024. If you want a higher limit, you can set it with the following command in your terminal before running your Python command:

    ulimit -n 2048

Check System Resources: Ensure that your system has enough free memory and CPU resources. Close unnecessary applications to free up resources.
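The "Raised RLIMIT_NOFILE: 256 -> 1024" line in the log suggests the test runner adjusts the soft limit itself. A rough Python equivalent, using the standard `resource` module (Unix only), might look like the sketch below. The function name and the default target of 1024 are assumptions taken from the log line, not from the regrtest source.

```python
import resource


def raise_nofile_limit(target: int = 1024) -> tuple[int, int]:
    """Raise the soft open-file limit toward `target`, capped by the hard limit.

    Returns (old_soft, new_soft). Never lowers an existing limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited hard limit places no cap on the soft limit.
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        return soft, new_soft
    return soft, soft
```

Only the soft limit is changed, so no elevated privileges should be needed, and the change applies to the current process and its children rather than to the whole shell session.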

Run Tests Individually: Instead of running all tests at once, consider running them individually to identify which specific tests are failing. This can help isolate problematic tests:

    ./python.exe -m test test_base64

Update or Rebuild Python: If you are using a custom build of Python, ensure it’s up-to-date. You may want to rebuild it using the latest version of LLVM and Clang, or consider using a pre-built version of Python.

Consult Documentation: Refer to the Python testing documentation or the issue tracker (e.g., GitHub) for known issues regarding the specific version you are using.

Run Without PGO: If you're using Profile-Guided Optimization (PGO), consider running the tests without it to see if the issues persist:

    ./python.exe -m test --timeout=

Check for Updates: Ensure you have the latest macOS updates and check for updates to LLVM/Clang.

Conclusion

These steps should help you diagnose and potentially resolve the issues you are experiencing. If the problems persist, you may want to file a bug report with more detailed logs and information about your environment. Let me know if you need further assistance!