colesbury / nogil

Multithreaded Python without the GIL

Track a more recent upstream version of CPython that supports the Apple M1 platform? #36

Closed ogrisel closed 2 years ago

ogrisel commented 2 years ago

I would like to test this experimental branch regularly on my Apple M1 dev laptop while working on Scientific Python / PyData projects, so that I can report problems or performance improvements observed in practical use cases.

However, the build setup in this branch is a bit too old, and the configure script refuses to generate the Makefile for this platform:

./configure --with-openssl=$(brew --prefix openssl)
checking for git... found
checking build system type... Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not recognized
configure: error: /bin/sh ./config.sub arm64-apple-darwin20.0.0 failed

Would it be possible to sync this branch with a more recent version of CPython (e.g. 3.10)?

EDIT: the error above was probably observed with a version of clang installed from conda-forge. I get a different error at build time when using the system compiler instead, see the discussion below: https://github.com/colesbury/nogil/issues/36#issuecomment-1046302795

colesbury commented 2 years ago

I'm not planning on rebasing onto 3.10 soon, since rebasing is a lot of work, but I'd like to support Apple's M1 CPUs.

Python 3.9.1+ has native support for Apple silicon -- I'm not sure why it doesn't work. I'll look into it more in the next few weeks when I get access to an M1 laptop.

ogrisel commented 2 years ago

BTW, thanks for leading the nogil Python effort. As a maintainer of loky / joblib and scikit-learn, I really think the nogil project is very important for the future of the Python / data / ML ecosystem. loky is the default multi-processing backend for joblib, and therefore for scikit-learn. We spent countless hours trying to make it work efficiently and robustly, and its maintenance is far from free. All of this complexity would go away if we could efficiently use threads instead of sub-processes, without having to deal with all kinds of workarounds to limit memory usage and the overhead and brittleness of inter-process communication via pickling.
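
To make the pickling point concrete, here is a minimal sketch using only the standard library. It is not loky or joblib code: the `payload` list and the helper names are hypothetical, and the pickle round trip only imitates what a process-based backend has to do with each argument. A thread worker receives a reference to the same object, while a process-style transfer produces a serialized copy.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def identity(obj):
    """Return the argument unchanged so the caller can inspect what the worker saw."""
    return obj


def roundtrip_through_pickle(obj):
    """Imitate a process-based transfer: serialize the argument, then rebuild it."""
    return pickle.loads(pickle.dumps(obj))


# Hypothetical stand-in for a large dataset passed to a worker.
payload = list(range(100_000))

# Thread worker: the task receives a reference to the very same object.
with ThreadPoolExecutor(max_workers=1) as pool:
    seen_by_thread = pool.submit(identity, payload).result()

# Process-style transfer: an equal but distinct copy is created.
seen_by_process = roundtrip_through_pickle(payload)

shared_with_thread = seen_by_thread is payload
copied_for_process = seen_by_process is not payload
pickled_size = len(pickle.dumps(payload))  # bytes that would cross the process boundary
```

The sketch shows only the copying cost. Whether threads actually run Python code in parallel depends on the interpreter, which is the point of the nogil work.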

thedrow commented 2 years ago

@ogrisel If you maintain loky, we should talk as we're looking to replace billiard with something that is easier to understand and debug. Drop me an email :)

anna-hope commented 2 years ago

@ogrisel @colesbury Using an M1 MacBook Pro, I can successfully run ./configure --with-openssl=$(brew --prefix openssl) and generate a Makefile. However, when trying to actually run make -j, I get

<...>
gcc     -Wl,-stack_size,1000000  -framework CoreFoundation -o Programs/_testembed Programs/_testembed.o libpython3.9.a -ldl   -framework CoreFoundation
ld: warning: arm64 Linker Optimiztion Hint addresses are not in same atom: 0x000088E8 and 0x000088F8
ld: warning: arm64 Linker Optimiztion Hint addresses are not in same atom: 0x000088E8 and 0x000088F8
Undefined symbols for architecture arm64:
Undefined symbols for architecture arm64:
  "___llvm_profile_runtime", referenced from:
  "___llvm_profile_runtime", referenced from:
      ___llvm_profile_runtime_user in python.o
      ___llvm_profile_runtime_user in _testembed.o
      ___llvm_profile_runtime_user in libpython3.9.a(main.o)
      ___llvm_profile_runtime_user in libpython3.9.a(getargs.o)
      ___llvm_profile_runtime_user in libpython3.9.a(bytesobject.o)
      ___llvm_profile_runtime_user in libpython3.9.a(initconfig.o)
      ___llvm_profile_runtime_user in libpython3.9.a(errors.o)
      ___llvm_profile_runtime_user in libpython3.9.a(initconfig.o)
      ___llvm_profile_runtime_user in libpython3.9.a(pythonrun.o)
      ___llvm_profile_runtime_user in libpython3.9.a(ceval_gil.o)
      ___llvm_profile_runtime_user in libpython3.9.a(errors.o)
      ___llvm_profile_runtime_user in libpython3.9.a(pythonrun.o)
      ___llvm_profile_runtime_user in libpython3.9.a(exceptions.o)
      ...
      ___llvm_profile_runtime_user in libpython3.9.a(exceptions.o)
      ...
     (maybe you meant: ___llvm_profile_runtime_user     (maybe you meant: ___llvm_profile_runtime_user)
)
ld: symbol(s) not found for architecture arm64
ld: symbol(s) not found for architecture arm64
clangclang: : error: error: linker command failed with exit code 1 (use -v to see invocation)linker command failed with exit code 1 (use -v to see invocation)

make: *** [Programs/_testembed] Error 1
make: *** Waiting for unfinished jobs....
make: *** [python.exe] Error 1

Update: FWIW, running make clean and then re-running make -j produced different errors:

Python/ceval.c:1109:5: error: invalid symbol redefinition
    TARGET(RETURN_VALUE) {
    ^
Python/ceval.c:252:28: note: expanded from macro 'TARGET'
#define TARGET(name) name: DEBUG_LABEL(name);
                           ^
Python/ceval.c:57:44: note: expanded from macro 'DEBUG_LABEL'
#define DEBUG_LABEL(name) __asm__ volatile("." #name ":")
                                           ^
<inline asm>:1:2: note: instantiated into assembly here
        .RETURN_VALUE:
        ^
Python/ceval.c:1109:5: error: invalid symbol redefinition
    TARGET(RETURN_VALUE) {
    ^
Python/ceval.c:252:28: note: expanded from macro 'TARGET'
#define TARGET(name) name: DEBUG_LABEL(name);
                           ^
Python/ceval.c:57:44: note: expanded from macro 'DEBUG_LABEL'
#define DEBUG_LABEL(name) __asm__ volatile("." #name ":")
                                           ^
<inline asm>:1:2: note: instantiated into assembly here
        .RETURN_VALUE:
        ^
2 errors generated.
make: *** [Python/ceval.o] Error 1

ogrisel commented 2 years ago

Thanks @anna-hope. I did a git clean -xdf / git pull and then reran the configure script and this time it worked. Then I get the same 2 errors as you get when building Python/ceval.c.

ogrisel commented 2 years ago

For information, I am now running macOS 12.2.1 and the compiler env var evals to:

> ${CEVAL_CC:-gcc} --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: arm64-apple-darwin21.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

oraluben commented 2 years ago

I think the Python/ceval.c issue is the same as https://stackoverflow.com/questions/14506151/invalid-symbol-redefinition-in-inline-asm-on-llvm. I tried a debug build and the error disappeared, but there's a new segfault related to mimalloc:

Already fixed by mimalloc >= 1.7.2

...
./python.exe -E -S -m sysconfig --generate-posix-vars ;\
        if test $? -ne 0 ; then \
                echo "generate-posix-vars failed" ; \
                rm -f ./pybuilddir.txt ; \
                exit 1 ; \
        fi
/bin/sh: line 1: 56289 Segmentation fault: 11  ./python.exe -E -S -m sysconfig --generate-posix-vars

and

$ lldb -- ./python.exe -E -S -m sysconfig --generate-posix-vars
...
(lldb) down
frame #0: 0x000000010023956c python.exe`mi_tls_slot(slot=0) at mimalloc-internal.h:722:9
   719  #elif defined(__aarch64__)
   720      void** tcb; UNUSED(ofs);
   721      asm volatile ("mrs %0, tpidr_el0" : "=r" (tcb));
-> 722      res = tcb[slot];
   723  #endif
   724      return res;
   725  }
(lldb) p tcb
warning: could not find Objective-C class data in the process. This may reduce the quality of type information available.
(void **) $0 = 0x0000000000000004

After adopting the patch from mimalloc (microsoft/mimalloc#346), I got another error:

$ make -j
...
./python.exe -E -S -m sysconfig --generate-posix-vars ;\
        if test $? -ne 0 ; then \
                echo "generate-posix-vars failed" ; \
                rm -f ./pybuilddir.txt ; \
                exit 1 ; \
        fi
Fatal Python error: _Py_queue_object: _Py_queue_object called with unowned object
Python runtime state: preinitialized

Current thread 0x0000000100ff4580 (most recent call first):
<no Python frame>
/bin/sh: line 1: 76391 Abort trap: 6           ./python.exe -E -S -m sysconfig --generate-posix-vars

and

$ lldb -- ./python.exe -E -S -m sysconfig --generate-posix-vars
...
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xfffffffffffe7f68)
    frame #0: 0x0000000100371444 python.exe`find_thread_state(bucket=0x00000001005a1a18, thread_id=7) at pyrefcnt.c:34:24
   31       llist_for_each(node, &bucket->threads) {
   32           PyThreadStateImpl *ts;
   33           ts = llist_data(node, PyThreadStateImpl, brc.bucket_node);
-> 34           if (ts->tstate.fast_thread_id == thread_id) {
   35               return ts;
   36           }
   37       }
Target 0: (python.exe) stopped.

colesbury commented 2 years ago

Thanks for the bug report and for the help finding the underlying issues. I've pushed a branch "nogil-3.9.10" (https://github.com/colesbury/nogil/tree/nogil-3.9.10) that works on the Apple M1. It still needs some more testing, but I plan to eventually change the default "nogil" branch to point to it. For now, the "nogil-3.9.10" branch is unstable in the sense that I may push changes that don't preserve history.

The significant changes are:

Please try it out if you can and let me know if you run into any problems.

Aljutor commented 2 years ago

It looks like there is no cffi package available for this version. I am getting "Unable to find installation candidates for cffi (1.14.6)"

with this in pyproject.toml:

[[tool.poetry.source]]
name = "nogil"
url = "https://d1yxz45j0ypngg.cloudfront.net/"

colesbury commented 2 years ago

This is fixed now in the nogil branch. I've force-pushed to that branch, so you'll need to either re-clone the repository or, if you want to keep an existing checkout, run:

git fetch origin
git reset --hard origin/nogil

in order to get the changes in the nogil branch. Note that the fixed version of nogil Python should report as "Python 3.9.10".
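
One way to confirm that a rebuilt interpreter matches the version mentioned above is to run a small check with the freshly built binary. This is only a sketch: the helper names are hypothetical, and the `(3, 9, 10)` default is taken from the comment above rather than from anything the build itself guarantees.

```python
import platform
import sys


def describe_interpreter():
    """Collect basic facts about the running interpreter for a bug report."""
    return {
        "version": platform.python_version(),
        "machine": platform.machine(),  # e.g. "arm64" on Apple silicon
        "compiler": platform.python_compiler(),
    }


def is_expected_version(version_info=sys.version_info, expected=(3, 9, 10)):
    """Compare only major.minor.micro, ignoring any release-level suffix."""
    return tuple(version_info[:3]) == expected


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
    print("matches expected version:", is_expected_version())
```

Comparing `sys.version_info` rather than the version string avoids depending on any suffix a custom build might append.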