Closed gmarkall closed 11 months ago
I hit an assertion in the Numba test suite on an M2 system:
Assertion failed: (false && "All memory must be pre-allocated"), function allocateSection, file memorymanager.cpp, line 107.
Fatal Python error: Aborted
Looking into which test caused this now.
To reproduce:
python runtests.py numba.tests.test_array_reductions.TestArrayReductions.test_nanquantile_basic
Looks like we somehow don't quite reserve enough space for code mem. With the -debug-only=llvmlite-memory-manager flag set, I see:
Reserving 0xC000 bytes
Code mem starts at 0x0000000129BD0000, size 0x4000
Rwdata mem starts at 0x0x0000000129BD4000, size 0x4000
Allocating 0x4008 bytes for CodeMem at Assertion failed: (false && "All memory must be pre-allocated"), function allocateSection, file memorymanager.cpp, line 107.
Fatal Python error: Aborted
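One possible reading of the log above is that the block handed out for a section is larger than the size requested, because alignment padding is added at allocation time. The sketch below assumes a padding rule (round the size up to the alignment, then add one more alignment unit) that is consistent with the "Requested size / alignment" and "Allocating" pairs in the debug output later in this thread. The function names and the 0x4000 / 8 request are hypothetical, chosen only so that the result matches the 0x4008 seen in the log.

```python
def padded_size(size: int, alignment: int) -> int:
    """Round size up to a multiple of alignment, then add one alignment unit."""
    return alignment * ((size + alignment - 1) // alignment + 1)


def fits(reserved: int, size: int, alignment: int) -> bool:
    """Return True if the padded request fits in a block of `reserved` bytes."""
    return padded_size(size, alignment) <= reserved


# Hypothetical request: exactly 0x4000 bytes of code, 8-byte aligned,
# against the 0x4000-byte code block reported in the log.
print(hex(padded_size(0x4000, 8)))
print(fits(0x4000, 0x4000, 8))
```

Under these assumptions, a reservation sized from the unpadded section size can fall short by exactly the padding, which would trip the "All memory must be pre-allocated" assertion.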
Status update - with the commit 61ae2b0 I can get through the whole test suite (with the usually-skipped tests not skipped):
diff --git a/numba/tests/test_array_constants.py b/numba/tests/test_array_constants.py
index a33dacd49..386c1856b 100644
--- a/numba/tests/test_array_constants.py
+++ b/numba/tests/test_array_constants.py
@@ -141,7 +141,6 @@ class TestConstantArray(unittest.TestCase):
         out = cres.entry_point()
         self.assertEqual(out, 86)
 
-    @skip_m1_llvm_rtdyld_failure
     def test_too_big_to_freeze(self):
         """
         Test issue https://github.com/numba/numba/issues/2188 where freezing
diff --git a/numba/tests/test_stencils.py b/numba/tests/test_stencils.py
index 2a65c0370..1e2f8dc77 100644
--- a/numba/tests/test_stencils.py
+++ b/numba/tests/test_stencils.py
@@ -80,7 +80,6 @@ if not _32bit: # prevent compilation on unsupported 32bit targets
         return a + 1
 
 
-@skip_m1_llvm_rtdyld_failure # skip all stencil tests on m1
 class TestStencilBase(unittest.TestCase):
 
     _numba_parallel_test_ = False
resulting in:
======================================================================
FAIL: test_no_accidental_warnings (numba.tests.test_import.TestNumbaImport)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/gmarkall/work/numbadev/numba/numba/tests/test_import.py", line 103, in test_no_accidental_warnings
run_in_subprocess(code, flags)
File "/Users/gmarkall/work/numbadev/numba/numba/tests/support.py", line 1121, in run_in_subprocess
raise AssertionError(msg % (popen.returncode, err.decode()))
AssertionError: process failed with code 1: stderr follows
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/gmarkall/work/numbadev/numba/numba/__init__.py", line 230, in <module>
_ensure_llvm()
File "/Users/gmarkall/work/numbadev/numba/numba/__init__.py", line 169, in _ensure_llvm
warnings.warn("llvmlite version format not recognized!")
UserWarning: llvmlite version format not recognized!
======================================================================
FAIL: test_unsafe_import_in_registry (numba.tests.test_np_functions.TestRegistryImports)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/gmarkall/work/numbadev/numba/numba/tests/test_np_functions.py", line 6172, in test_unsafe_import_in_registry
self.assertEquals(b"", error.strip())
AssertionError: b'' != b'/Users/gmarkall/work/numbadev/numba/numba[126 chars]d!")'
======================================================================
FAIL: test_repr_long_list_ipython (numba.tests.test_typedlist.TestTypedList)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/gmarkall/work/numbadev/numba/numba/tests/test_typedlist.py", line 563, in test_repr_long_list_ipython
self.assertEqual(expected, err)
AssertionError: 'ListType[int64]([0, 1, 2, 3, 4, 5, 6, 7, [4867 chars]..])' != '/Users/gmarkall/work/numbadev/numba/numba[5040 chars]..])'
Diff is 10176 characters long. Set self.maxDiff to None to see it.
======================================================================
FAIL: test_repr_long_list_ipython (numba.tests.test_typedlist.TestTypedList)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/gmarkall/work/numbadev/numba/numba/tests/support.py", line 909, in tearDown
self.memory_leak_teardown()
File "/Users/gmarkall/work/numbadev/numba/numba/tests/support.py", line 884, in memory_leak_teardown
self.assert_no_memory_leak()
File "/Users/gmarkall/work/numbadev/numba/numba/tests/support.py", line 893, in assert_no_memory_leak
self.assertEqual(total_alloc, total_free)
AssertionError: 2 != 1
----------------------------------------------------------------------
Ran 10387 tests in 1000.715s
FAILED (failures=4, skipped=639, expected failures=13)
I believe the failures are innocuous:
Still an issue on Linux AArch64, although this is maybe a latent bug in cleanup in Numba:
$ gdb --args python runtests.py numba.tests.test_ctypes.TestCTypesUseCases.test_python_call_back -v -m
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
(No debugging symbols found in python)
(gdb) run
Starting program: /home/gmarkall/mambaforge/envs/numbadev/bin/python runtests.py numba.tests.test_ctypes.TestCTypesUseCases.test_python_call_back -v -m
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[Detaching after vfork from child process 40625]
[New Thread 0xffffee7901e0 (LWP 40626)]
[New Thread 0xffffedf8f1e0 (LWP 40627)]
[New Thread 0xffffeb78e1e0 (LWP 40628)]
[New Thread 0xffffe8f8d1e0 (LWP 40629)]
[New Thread 0xffffe478c1e0 (LWP 40630)]
[New Thread 0xffffe3f8b1e0 (LWP 40631)]
[New Thread 0xffffdf78a1e0 (LWP 40632)]
[New Thread 0xffffdef891e0 (LWP 40633)]
[New Thread 0xffffda7881e0 (LWP 40634)]
[New Thread 0xffffd7f871e0 (LWP 40635)]
[New Thread 0xffffd57861e0 (LWP 40636)]
[Detaching after vfork from child process 40637]
[Detaching after vfork from child process 40638]
[Detaching after vfork from child process 40639]
[Detaching after vfork from child process 40640]
[Detaching after vfork from child process 40641]
/home/gmarkall/numbadev/numba/numba/__init__.py:169: UserWarning: llvmlite version format not recognized!
warnings.warn("llvmlite version format not recognized!")
Parallel: 1. Serial: 0
[Thread 0xffffdef891e0 (LWP 40633) exited]
[Thread 0xffffda7881e0 (LWP 40634) exited]
[Thread 0xffffe3f8b1e0 (LWP 40631) exited]
[Thread 0xffffedf8f1e0 (LWP 40627) exited]
[Thread 0xffffee7901e0 (LWP 40626) exited]
[Thread 0xffffdf78a1e0 (LWP 40632) exited]
[Thread 0xffffd57861e0 (LWP 40636) exited]
[Thread 0xffffd7f871e0 (LWP 40635) exited]
[Thread 0xffffe478c1e0 (LWP 40630) exited]
[Thread 0xffffe8f8d1e0 (LWP 40629) exited]
[Thread 0xffffeb78e1e0 (LWP 40628) exited]
[Detaching after fork from child process 40642]
[Detaching after fork from child process 40643]
[Detaching after fork from child process 40644]
[Detaching after fork from child process 40645]
[Detaching after fork from child process 40646]
[Detaching after fork from child process 40647]
[Detaching after fork from child process 40648]
[Detaching after fork from child process 40649]
[Detaching after fork from child process 40650]
[Detaching after fork from child process 40651]
[Detaching after fork from child process 40652]
[Detaching after fork from child process 40653]
[New Thread 0xffffd57861e0 (LWP 40654)]
[New Thread 0xffffd7f871e0 (LWP 40655)]
[New Thread 0xffffda7881e0 (LWP 40656)]
Code size / align: 0x4 / 4
ROData size / align: 0x0 / 1
RWData size / align: 0x0 / 1
Reserving 0x3000 bytes
Code mem starts at 0x0000FFFFF7286000, size 0x1000
Code size / align: 0x128 / 4
ROData size / align: 0x130 / 16
RWData size / align: 0x0 / 1
Reserving 0x3000 bytes
Code mem starts at 0x0000FFFFF7283000, size 0x1000
Rodata mem starts at 0x0x0000FFFFF7284000, size 0x1000
Requested size / alignment: 0x128 / 4
Allocating 0x12C bytes for CodeMem at 0x0000FFFFF7283000
Requested size / alignment: 0xE0 / 16
Allocating 0xF0 bytes for RODataMem at 0x0000FFFFF7284000
Requested size / alignment: 0x48 / 8
Allocating 0x50 bytes for RODataMem at 0x0000FFFFF72840E0
Code size / align: 0xD6C / 4
ROData size / align: 0x5B0 / 16
RWData size / align: 0xB0 / 8
Reserving 0x3000 bytes
Code mem starts at 0x0000FFFFEDC8E000, size 0x1000
Rodata mem starts at 0x0x0000FFFFEDC8F000, size 0x1000
Rwdata mem starts at 0x0x0000FFFFEDC90000, size 0x1000
Requested size / alignment: 0xD6C / 4
Allocating 0xD70 bytes for CodeMem at 0x0000FFFFEDC8E000
Requested size / alignment: 0x48D / 16
Allocating 0x4A0 bytes for RODataMem at 0x0000FFFFEDC8F000
Requested size / alignment: 0x8 / 8
Allocating 0x10 bytes for RWDataMem at 0x0000FFFFEDC90000
Requested size / alignment: 0x114 / 8
Allocating 0x120 bytes for RODataMem at 0x0000FFFFEDC8F490
Requested size / alignment: 0x8 / 8
Allocating 0x10 bytes for RWDataMem at 0x0000FFFFEDC90008
Requested size / alignment: 0x30 / 8
Allocating 0x38 bytes for RWDataMem at 0x0000FFFFEDC90010
test_python_call_back (numba.tests.test_ctypes.TestCTypesUseCases) ... ok
[Thread 0xffffd57861e0 (LWP 40654) exited]
[Thread 0xffffda7881e0 (LWP 40656) exited]
[Thread 0xffffd7f871e0 (LWP 40655) exited]
----------------------------------------------------------------------
Ran 1 test in 0.645s
OK
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x0000fffff73efb48 in dlfree () from /home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/lib-dynload/../../libffi.so.8
(gdb) bt
#0 0x0000fffff73efb48 in dlfree () from /home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/lib-dynload/../../libffi.so.8
#1 0x0000fffff7417768 in CThunkObject_dealloc ()
from /home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so
#2 0x0000aaaaaab3aa88 in free_keys_object ()
#3 0x0000aaaaaab3b398 in dict_dealloc ()
#4 0x0000fffff741060c in PyCFuncPtr_clear ()
from /home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so
#5 0x0000fffff74106c4 in PyCFuncPtr_dealloc ()
from /home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so
#6 0x0000aaaaaab63548 in subtype_dealloc ()
#7 0x0000aaaaaab3aa88 in free_keys_object ()
#8 0x0000aaaaaab3f0b4 in dict_tp_clear ()
#9 0x0000aaaaaac19af8 in gc_collect_main ()
#10 0x0000aaaaaac1a954 in _PyGC_CollectNoFail ()
#11 0x0000aaaaaabef3a0 in finalize_modules ()
#12 0x0000aaaaaabf25c4 in Py_FinalizeEx ()
#13 0x0000aaaaaabf35ec in Py_Exit ()
#14 0x0000aaaaaabf9058 in _PyErr_PrintEx ()
#15 0x0000aaaaaabf98e4 in _PyRun_SimpleFileObject ()
#16 0x0000aaaaaabf9bf0 in _PyRun_AnyFileObject ()
#17 0x0000aaaaaab0f888 in Py_RunMain ()
#18 0x0000aaaaaab0fec4 in Py_BytesMain ()
#19 0x0000fffff7d52e10 in __libc_start_main (main=0xaaaaaab04f90 <main>, argc=5, argv=0xfffffffff208, init=<optimised out>, fini=<optimised out>,
rtld_fini=<optimised out>, stack_end=<optimised out>) at ../csu/libc-start.c:308
#20 0x0000aaaaaab0e6b8 in _start ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
Still an issue on Linux AArch64, although this is maybe a latent bug in cleanup in Numba:
It turns out that this issue is unrelated to this PR - I need to raise a Numba issue shortly.
With the offending ctypes tests skipped, on Linux AArch64 the test results are quite similar to those on macOS:
======================================================================
FAIL: test_no_accidental_warnings (numba.tests.test_import.TestNumbaImport)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/gmarkall/numbadev/numba/numba/tests/test_import.py", line 103, in test_no_accidental_warnings
run_in_subprocess(code, flags)
File "/home/gmarkall/numbadev/numba/numba/tests/support.py", line 1121, in run_in_subprocess
raise AssertionError(msg % (popen.returncode, err.decode()))
AssertionError: process failed with code 1: stderr follows
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/gmarkall/numbadev/numba/numba/__init__.py", line 230, in <module>
_ensure_llvm()
File "/home/gmarkall/numbadev/numba/numba/__init__.py", line 169, in _ensure_llvm
warnings.warn("llvmlite version format not recognized!")
UserWarning: llvmlite version format not recognized!
======================================================================
FAIL: test_unsafe_import_in_registry (numba.tests.test_np_functions.TestRegistryImports)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/gmarkall/numbadev/numba/numba/tests/test_np_functions.py", line 6172, in test_unsafe_import_in_registry
self.assertEquals(b"", error.strip())
AssertionError: b'' != b'/home/gmarkall/numbadev/numba/numba/__ini[120 chars]d!")'
======================================================================
FAIL: test_repr_long_list_ipython (numba.tests.test_typedlist.TestTypedList)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/gmarkall/numbadev/numba/numba/tests/test_typedlist.py", line 563, in test_repr_long_list_ipython
self.assertEqual(expected, err)
AssertionError: 'ListType[int64]([0, 1, 2, 3, 4, 5, 6, 7, [4867 chars]..])' != '/home/gmarkall/numbadev/numba/numba/__ini[5034 chars]..])'
Diff is 10164 characters long. Set self.maxDiff to None to see it.
======================================================================
FAIL: test_repr_long_list_ipython (numba.tests.test_typedlist.TestTypedList)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/gmarkall/numbadev/numba/numba/tests/support.py", line 909, in tearDown
self.memory_leak_teardown()
File "/home/gmarkall/numbadev/numba/numba/tests/support.py", line 884, in memory_leak_teardown
self.assert_no_memory_leak()
File "/home/gmarkall/numbadev/numba/numba/tests/support.py", line 893, in assert_no_memory_leak
self.assertEqual(total_alloc, total_free)
AssertionError: 2 != 1
----------------------------------------------------------------------
Ran 11867 tests in 4344.252s
FAILED (failures=4, skipped=592, expected failures=24)
So as far as I can tell, there are no outstanding issues with the implementation in this PR in its present form.
As a follow-up on the cause of those failures: they all stem from the "llvmlite version format not recognized!" warning being emitted, so they do not indicate an actual issue.
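To illustrate why a stray warning is enough to fail tests of this kind, here is a minimal sketch, not Numba's actual test code: a test that runs code in a subprocess and expects empty stderr will fail as soon as any warning is printed. The helper name `stderr_of` is hypothetical.

```python
import subprocess
import sys


def stderr_of(code: str) -> bytes:
    """Run `code` in a fresh interpreter and return its stripped stderr."""
    result = subprocess.run(
        [sys.executable, "-W", "default", "-c", code],
        capture_output=True,
    )
    return result.stderr.strip()


# Emitting the same warning text as in the tracebacks above.
noisy = stderr_of(
    "import warnings; "
    "warnings.warn('llvmlite version format not recognized!')"
)

# A check like assertEqual(b"", stderr) would fail here, because the
# warning text ends up on stderr.
print(noisy != b"")
print(b"UserWarning" in noisy)
```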
This implements a memory manager based on the MCJIT SectionMemoryManager, with a preallocation strategy that ensures all segments of an object are placed within a single block of mapped memory. This is intended to resolve the relocation overflow issues on AArch64 (numba/numba#8567, numba/numba#9001), which occur when the GOT segment is far from the code segment.

The changes are based on those by @MikaelSmith in https://github.com/llvm/llvm-project/pull/71968 and his code in https://github.com/MikaelSmith/impala/blob/ac8561b6b69530f9fa2ff2ae65ec7415aa4395c6/be/src/codegen/mcjit-mem-mgr.cc - there is additional discussion / background in the LLVM Discourse thread and on the aforementioned Numba issues.
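The single-block idea can be sketched roughly as follows. This is an illustration only, not the C++ implementation in this PR: the names `Layout` and `plan_layout` are hypothetical, and the rule that each segment gets at least one page is an assumption chosen to match the "Reserving 0x3000 bytes" lines in the debug output on a 4 KiB-page system. The range check relies on the AArch64 ADRP instruction reaching targets within roughly ±4 GiB.

```python
from dataclasses import dataclass

ADRP_RANGE = 1 << 32  # ADRP addresses pages within about +/- 4 GiB


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


@dataclass
class Layout:
    total: int          # bytes to reserve in one contiguous mapping
    code_offset: int    # start of code within the mapping
    rodata_offset: int  # start of read-only data
    rwdata_offset: int  # start of read-write data


def plan_layout(code: int, rodata: int, rwdata: int, page: int) -> Layout:
    """Place three page-rounded segments back to back in one reservation."""
    def seg(size: int) -> int:
        # Assumption: every segment occupies at least one page.
        return max(align_up(size, page), page)

    code_sz, ro_sz, rw_sz = seg(code), seg(rodata), seg(rwdata)
    return Layout(code_sz + ro_sz + rw_sz, 0, code_sz, code_sz + ro_sz)


# Section sizes taken from the Linux AArch64 debug output in this thread.
layout = plan_layout(0xD6C, 0x5B0, 0xB0, page=0x1000)
print(hex(layout.total), hex(layout.rodata_offset), hex(layout.rwdata_offset))
print(layout.total < ADRP_RANGE)
```

Because every segment lives inside the same reservation, the distance between code and data is bounded by the size of that reservation rather than by wherever the operating system happens to place separate mappings.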
I believe this is now ready for some review - notes to reviewers:
SectionMemoryManager
"as-standard" into llvmlite.
cc @sjoerdmeijer for review.