python / cpython

The Python programming language
https://www.python.org

multiprocessing.BoundedSemaphore of 32-bit python could not work while cross compiling on linux platform #75354

Open d6777f48-b034-4e93-9da9-929f1c88aab8 opened 7 years ago

d6777f48-b034-4e93-9da9-929f1c88aab8 commented 7 years ago
BPO 31171
Nosy @pitrou, @benjaminp, @jiahongxujia
PRs
  • python/cpython#10563
  • python/cpython#3054
  • python/cpython#13353
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['3.7', 'library', 'type-crash']
    title = 'multiprocessing.BoundedSemaphore of 32-bit python could not work while cross compiling on linux platform'
    updated_at =
    user = 'https://github.com/jiahongxujia'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'hongxu'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'hongxu'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31171
    keywords = ['patch']
    message_count = 5.0
    messages = ['300052', '300053', '300054', '334351', '334352']
    nosy_count = 4.0
    nosy_names = ['pitrou', 'benjamin.peterson', 'Barry Davis', 'hongxu']
    pr_nums = ['10563', '3054', '13353']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue31171'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']
    ```

    d6777f48-b034-4e93-9da9-929f1c88aab8 commented 7 years ago

    When building Python for embedded Linux systems with the Yocto Project (http://www.yoctoproject.org/docs/2.3.1/yocto-project-qs/yocto-project-qs.html), multiprocessing.BoundedSemaphore does not work in 32-bit Python.

    1. Prerequisite
      • Build 32-bit Python on a 64-bit host by adding '-m32' to gcc
    2. Reproduce with a simplified setup
    $ git clone https://github.com/python/cpython.git; cd ./cpython
    
    $ ac_cv_pthread=no CC="gcc -m32" ./configure
    
    $ make -j 10
    $ ./python -c "import multiprocessing; pool_sema = multiprocessing.BoundedSemaphore(value=1); pool_sema.acquire(); pool_sema.release()"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ValueError: semaphore or lock released too many times

    The same issue also causes put() on a 'multiprocessing.Queue' to hang:
    $ ./python -c "import multiprocessing; queue_instance = multiprocessing.Queue(); queue_instance.put(('install'))"
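The one-liner above can also be wrapped in a small reusable check. This is only a sketch: the function names `release_after_acquire_ok` and `check_multiprocessing` are made up for illustration, and the check assumes the failure always surfaces as the ValueError shown in the traceback.

```python
import multiprocessing


def release_after_acquire_ok(sema):
    """Return True if one acquire() followed by one release() succeeds."""
    sema.acquire()
    try:
        sema.release()
    except ValueError:
        # Affected builds raise "semaphore or lock released too many times" here.
        return False
    return True


def check_multiprocessing():
    """Run the check against a real multiprocessing.BoundedSemaphore."""
    return release_after_acquire_ok(multiprocessing.BoundedSemaphore(value=1))
```

Calling `check_multiprocessing()` with the freshly built `./python` should return False on an affected build and True otherwise.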

    d6777f48-b034-4e93-9da9-929f1c88aab8 commented 7 years ago
    3. Analysis

    1) multiprocessing uses the C library's named semaphores (sem_open/sem_post/sem_getvalue/sem_unlink in ./Modules/_multiprocessing/semaphore.c).

    2) glibc has defined two different versions of sem_getvalue since the following commit: https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/sem_getvalue.c;h=1432cc795ece84d5bf31c7e5cafe01cc1a09d98d;hb=042e1521c794a945edc43b5bfa7e69ad70420524

    `__new_sem_getvalue' is sem_getvalue@@GLIBC_2.1, which works with named semaphores (`sem_open').

    `__old_sem_getvalue' is the old version, sem_getvalue@GLIBC_2.0, which works with unnamed semaphores (`sem_init').

    The 32-bit C library provides both of them:
    $ nm -g /lib/i386-linux-gnu/libpthread-2.23.so
    0000df30 T sem_getvalue@GLIBC_2.0
    0000df10 T sem_getvalue@@GLIBC_2.1

    3) multiprocessing calls sem_open, so it is supposed to use sem_getvalue@@GLIBC_2.1.

    If neither `-pthread' nor `-lpthread' is passed to gcc, the compiled _multiprocessing dynamic library is not explicitly linked against sem_getvalue@@GLIBC_2.1:
    $ nm -g ./build/lib.linux-x86_64-3.7/_multiprocessing.cpython-37m-i386-linux-gnu.so
    U sem_getvalue
    U sem_open

    If `-pthread' or `-lpthread' is passed to gcc:
    $ nm -g ./build/lib.linux-x86_64-3.7/_multiprocessing.cpython-37m-i386-linux-gnu.so
    U sem_getvalue@@GLIBC_2.1
    U sem_open@@GLIBC_2.1.1

    4) On a 32-bit OS, multiprocessing was therefore incorrectly linked to sem_getvalue@GLIBC_2.0, which caused the issue.
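The `nm -g` comparison in step 3 can be automated. The sketch below only parses text that has already been captured from `nm -g`; the function name `unversioned_sem_symbols` is hypothetical, and it assumes undefined symbols appear as `U <name>` with any glibc version attached after an `@`, as in the output quoted above.

```python
def unversioned_sem_symbols(nm_output):
    """Return undefined sem_* symbols that carry no glibc version suffix."""
    found = []
    for line in nm_output.splitlines():
        fields = line.split()
        # `nm -g` prints undefined symbols as "U <name>" with no address column.
        if len(fields) == 2 and fields[0] == "U":
            name = fields[1]
            if name.startswith("sem_") and "@" not in name:
                found.append(name)
    return found
```

A non-empty result for the built `_multiprocessing` shared object would suggest that it was linked without `-pthread' or `-lpthread'.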

    d6777f48-b034-4e93-9da9-929f1c88aab8 commented 7 years ago
    4. Solution

    When cross compiling, `-pthread' is not passed, so we should explicitly add `-lpthread' when building multiprocessing.

    Peterson tried to do it in the following commit:
    ...
    commit e711cafab13efc9c1fe6c5cd75826401445eb585
    Author: Benjamin Peterson <benjamin@python.org>
    Date: Wed Jun 11 16:44:04 2008 +0000

    Merged revisions 64104,64117 via svnmerge from
    svn+ssh://pythondev@svn.python.org/python/trunk

    ...
    git show e711cafab13efc9c1fe6c5cd75826401445eb585 -- setup.py

    --- a/setup.py
    +++ b/setup.py
    @@ -1110,6 +1110,56 @@ class PyBuildExt(build_ext):
    
             # _fileio -- supposedly cross platform
             exts.append(Extension('_fileio', ['_fileio.c']))
    +        # Richard Oudkerk's multiprocessing module
    +        if platform == 'win32':             # Windows
    +            macros = dict()
    +            libraries = ['ws2_32']
    +
    +        elif platform == 'darwin':          # Mac OSX
    +            macros = dict(
    +                HAVE_SEM_OPEN=1,
    +                HAVE_SEM_TIMEDWAIT=0,
    +                HAVE_FD_TRANSFER=1,
    +                HAVE_BROKEN_SEM_GETVALUE=1
    +                )
    +            libraries = []
    +
    +        elif platform == 'cygwin':          # Cygwin
    +            macros = dict(
    +                HAVE_SEM_OPEN=1,
    +                HAVE_SEM_TIMEDWAIT=1,
    +                HAVE_FD_TRANSFER=0,
    +                HAVE_BROKEN_SEM_UNLINK=1
    +                )
    +            libraries = []
    +        else:                                   # Linux and other unices
    +            macros = dict(
    +                HAVE_SEM_OPEN=1,
    +                HAVE_SEM_TIMEDWAIT=1,
    +                HAVE_FD_TRANSFER=1
    +                )
    +            libraries = ['rt']
    +
    +        if platform == 'win32':
    +            multiprocessing_srcs = [ '_multiprocessing/multiprocessing.c',
    +                                     '_multiprocessing/semaphore.c',
    +                                     '_multiprocessing/pipe_connection.c',
    +                                     '_multiprocessing/socket_connection.c',
    +                                     '_multiprocessing/win32_functions.c'
    +                                   ]
    +
    +        else:
    +            multiprocessing_srcs = [ '_multiprocessing/multiprocessing.c',
    +                                     '_multiprocessing/socket_connection.c'
    +                                   ]
    +
    +            if macros.get('HAVE_SEM_OPEN', False):
    +                multiprocessing_srcs.append('_multiprocessing/semaphore.c')
    +
    +        exts.append ( Extension('_multiprocessing', multiprocessing_srcs,
    +                                 define_macros=list(macros.items()),
    +                                 include_dirs=["Modules/_multiprocessing"]))
    +        # End multiprocessing

    It defined the variable `libraries' and assigned it according to host_platform, but forgot to pass it to Extension.

    So we should correct that and add `-lpthread' for Linux.
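A minimal sketch of that correction, assuming the per-platform table from the quoted diff. The helper names `multiprocessing_libraries` and `multiprocessing_extension_kwargs` are hypothetical and do not exist in setup.py; they only illustrate carrying `libraries` through to the Extension call.

```python
def multiprocessing_libraries(host_platform):
    """Libraries to link _multiprocessing against: the quoted table plus pthread."""
    if host_platform == "win32":
        return ["ws2_32"]
    if host_platform in ("darwin", "cygwin"):
        return []
    # Linux and other unices: 'rt' comes from the original commit,
    # 'pthread' is the addition proposed in this issue.
    return ["rt", "pthread"]


def multiprocessing_extension_kwargs(host_platform, macros):
    """Keyword arguments for Extension(), including the previously dropped `libraries`."""
    return {
        "define_macros": list(macros.items()),
        "include_dirs": ["Modules/_multiprocessing"],
        "libraries": multiprocessing_libraries(host_platform),
    }
```

The resulting dictionary could then be unpacked into the existing call, for example `Extension('_multiprocessing', multiprocessing_srcs, **kwargs)`.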

    0cae0952-c943-4029-91ca-1fd8162b1fb1 commented 5 years ago

    I've just hit this issue using Python-2.7.9, gcc-8.1.0, glibc-2.23.

    The patch I made to fix it, based on the comments in this issue:

    --- Python-2.7.9/setup.py 2019-01-25 09:30:39.049501423 +0000
    +++ Python-2.7.9/setup.py 2019-01-25 09:30:55.369609646 +0000
    @@ -1670,7 +1670,7 @@

             else:                                   # Linux and other unices
                 macros = dict()
    -            libraries = ['rt']
    +            libraries = ['rt', 'pthread']
             if host_platform == 'win32':
                 multiprocessing_srcs = [ '_multiprocessing/multiprocessing.c',
    @@ -1690,6 +1690,7 @@
    
             if sysconfig.get_config_var('WITH_THREAD'):
                 exts.append ( Extension('_multiprocessing', multiprocessing_srcs,
    +                                    libraries = libraries,
                                         define_macros=macros.items(),
                                         include_dirs=["Modules/_multiprocessing"]))
             else:

    0cae0952-c943-4029-91ca-1fd8162b1fb1 commented 5 years ago

    The behaviour I saw (32-bit only) was a python process getting stuck.

    I got this from strace:
    ...
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0xb5acc000, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    ...

    And this from gdb:
    (gdb) bt
    #0  0xb76fbc5d in __kernel_vsyscall ()
    #1  0xb74af2a4 in sem_wait () from /lib/libpthread.so.0
    #2  0xb69a5429 in ?? () from /usr/lib/python2.7/lib-dynload/_multiprocessing.so
    #3  0xb75a262e in call_function (oparg=<optimized out>, pp_stack=0xbfb7f038) at Python/ceval.c:4033
    #4  PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2679
    #5  0xb75a2f7d in PyEval_EvalCodeEx (co=<optimized out>, globals=
        {'Queue': <type at remote 0xb69c43dc>, 'atexit': <module at remote 0xb6d369bc>, 'Semaphore': <type at remote 0xb69b978c>, 'Empty': <type at remote 0xb69b902c>, 'Full': <type at remote 0xb69b9204>, 'SimpleQueue': <type at remote 0xb69c478c>, 'assert_spawning': <function at remote 0xb69ba684>, '__all__': ['Queue', 'SimpleQueue', 'JoinableQueue'], '_multiprocessing': <module at remote 0xb6c16944>, '_sentinel': <object at remote 0xb726f4d0>, '__package__': 'multiprocessing', 'collections': <module at remote 0xb71bce3c>, '__doc__': None, 'Condition': <type at remote 0xb69c402c>, 'JoinableQueue': <type at remote 0xb69c45b4>, '__builtins__': {'bytearray': <type at remote 0xb76c27e0>, 'IndexError': <type at remote 0xb76c6900>, 'all': <built-in function all>, 'help': <_Helper at remote 0xb71f824c>, 'vars': <built-in function vars>, 'SyntaxError': <type at remote 0xb76c6c80>, 'unicode': <type at remote 0xb76d6c00>, 'UnicodeDecodeError': <type at remote 0xb76c6420>, 'memoryview': <type at remote 0xb76cdda0>, 'isinstance...(truncated), locals=0x0, args=0xb535f5cc, argcount=2, kws=0xb535f5d4, kwcount=0, defs=0xb69b6058, defcount=2, closure=0x0) at Python/ceval.c:3265
    #6  0xb75a0f3d in fast_function (nk=0, na=<optimized out>, n=2, pp_stack=0xbfb7f178, func=<function at remote 0xb69c51b4>) at Python/ceval.c:4129
    #7  call_function (oparg=<optimized out>, pp_stack=0xbfb7f178) at Python/ceval.c:4054
    #8  PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2679
    ...
    (gdb) py-bt
    #4 Frame 0xb536602c, for file /usr/lib/python2.7/multiprocessing/queues.py, line 101, in put (self=<Queue(_writer=<_multiprocessing.Connection at remote 0x96878c8>, _recv=<built-in method recv of _multiprocessing.Connection object at remote 0x967ddc8>, _thread=None, _sem=<BoundedSemaphore(release=<built-in method release of _multiprocessing.SemLock object at remote 0xb53427c0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0xb53427c0>, _semlock=<_multiprocessing.SemLock at remote 0xb53427c0>) at remote 0xb53425cc>, _buffer=<collections.deque at remote 0xb533f844>, _closed=False, _send=<built-in method send of _multiprocessing.Connection object at remote 0x96878c8>, _jointhread=None, _reader=None, _opid=3752, _rlock=<Lock(release=<built-in method release of _multiprocessing.SemLock object at remote 0xb5342420>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0xb5342420>, _semlock=<_multiprocessing.SemLock at remote 0xb5342420>) at remote 0xb534240c>, ...(truncated)

    ...