python / cpython

The Python programming language
https://www.python.org
Other
62.4k stars 29.96k forks source link

Bad assumption on thread stack size makes python crash with musl libc #76488

Open a50fdca5-f4ed-40e8-96d3-ab34f39a7284 opened 6 years ago

a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago
BPO 32307
Nosy @gpshead, @bitdancer
PRs
  • python/cpython#26224
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/gpshead' closed_at = None created_at = labels = ['interpreter-core', '3.10', '3.9', 'type-crash', '3.11'] title = 'Bad assumption on thread stack size makes python crash with musl libc' updated_at = user = 'https://bugs.python.org/NatanaelCopa' ``` bugs.python.org fields: ```python activity = actor = 'gregory.p.smith' assignee = 'gregory.p.smith' closed = False closed_date = None closer = None components = ['Interpreter Core'] creation = creator = 'Natanael Copa' dependencies = [] files = [] hgrepos = [] issue_num = 32307 keywords = ['patch'] message_count = 8.0 messages = ['308226', '308228', '308293', '308294', '308304', '308305', '308307', '403184'] nosy_count = 5.0 nosy_names = ['gregory.p.smith', 'r.david.murray', 'python-dev', 'Natanael Copa', '4-launchpad-kalvdans-no-ip-org'] pr_nums = ['26224'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'crash' url = 'https://bugs.python.org/issue32307' versions = ['Python 3.9', 'Python 3.10', 'Python 3.11'] ```

    a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago

    Python assumes that the system default thread stack size is big enough for python, except for OSX and FreeBSD where stack size is explicitly set. With musl libc the system thread stack size is only 80k, which leads to hard crash before sys.getrecursionlimit() is hit.

    See: https://github.com/python/cpython/blob/master/Python/thread_pthread.h#L22

    Testcase:

    import threading
    import sys
    
    def f(n=0):
        try:
            print(n)
            f(n+1)
        except Exception:
            print("we hit recursion limit")
            sys.exit(0)
    
    t = threading.Thread(target=f)
    t.start()

    This can be pasted into:

    docker run --rm -it python:2.7-alpine docker run --rm -it python:3.6-alpine

    bitdancer commented 6 years ago

    Well, from our point of view it isn't a bad assumption, it's that muslc needs to be added to the list of exceptions. (I know almost nothing about this...I assume there is some reason we can't determine the stack size programatically?)

    Would you care to propose a PR?

    a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago

    With "bad assumption" I mean that currently the code assumes that all unknown systems has big enough default thread stack size. This would not be a "bad assumption" if POSIX (or any other standard) would specify a requirement on the default stack size. To my understanding, there are no such requirement in the spec, and thus, applications can not really assume anything about the default thread stack size.

    musl libc does not provide any __MUSL__ macro by design, because they consider "it’s a bug to assume a certain implementation has particular properties rather than testing" so we cannot really make exception for musl.

    https://wiki.musl-libc.org/faq.html#Q:_Why_is_there_no_\<code>MUSL\</code>_macro?

    I think the options we have are:

    I can of course make a check for musl in the configure script and special case it, but it still does not fix the issue: There is no guarantee that stack size is big enough for python's use for every existing and unknown future system.

    a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago

    I ran a test of pthread_attr_getstacksize() on various systems. Here is what they return:

    Linux/glibc: 8388608 MacOS: 524288 FreeBSD 11.1: 0 (ulimit -s gives 524288) NetBSD 7.1: 4194304 OpenBSD 6.2: 524288

    I could not see any #ifdef for __OpenBSD__ so I would expect to see segfaults on OpenBSD as well. Google gave me this:

    https://bugs.python.org/issue25342 https://github.com/alonho/pytrace/issues/8

    a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago

    I suggest a runtime check like this:

    diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h
    index b7463c0ca6..e9d0f638c9 100644
    --- a/Python/thread_pthread.h
    +++ b/Python/thread_pthread.h
    @@ -34,6 +34,8 @@
     #undef  THREAD_STACK_SIZE
     #define THREAD_STACK_SIZE       0x400000
     #endif
    +/* minimum size of the default thread stack size */
    +#define THREAD_STACK_MIN_DEFAULT 0x100000
     /* for safety, ensure a viable minimum stacksize */
     #define THREAD_STACK_MIN        0x8000  /* 32 KiB */
     #else  /* !_POSIX_THREAD_ATTR_STACKSIZE */
    @@ -187,6 +189,14 @@ PyThread_start_new_thread(void (*func)(void *), void *arg)
         PyThreadState *tstate = PyThreadState_GET();
         size_t stacksize = tstate ? tstate->interp->pythread_stacksize : 0;
         tss = (stacksize != 0) ? stacksize : THREAD_STACK_SIZE;
    +    if (tss == 0 && THREAD_STACK_SIZE == 0) {
    +        if (pthread_attr_getstacksize(&attrs, &tss) != 0) {
    +            pthread_attr_destroy(&attrs);
    +            return PYTHREAD_INVALID_THREAD_ID;
    +        }
    +       if (tss != 0 && tss < THREAD_STACK_MIN_DEFAULT)
    +           tss = THREAD_STACK_MIN_DEFAULT;
    +    }
         if (tss != 0) {
             if (pthread_attr_setstacksize(&attrs, tss) != 0) {
                 pthread_attr_destroy(&attrs);

    The reasoning here is:

    Scripts that uses threading.stack_size() will still work as before, and you can still manually set the stack size down to 32k (THREAD_STACK_MIN) if you need that.

    This should keep existing behavior for known systems like glibc, OSX, FreeBSD, NetBSD but will raise the thread stack size for musl libc and OpenBSD to 1MB.

    What do you think?

    bitdancer commented 6 years ago

    I think we need people who do a lot of work at the C level to evaluate this, so I've added a couple to the nosy list :)

    a50fdca5-f4ed-40e8-96d3-ab34f39a7284 commented 6 years ago

    Exactly same as previous patch, but refactored to only have a single pthread_attr_destroy() call instead of 3.

    diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h
    index b7463c0ca6..1006a168fa 100644
    --- a/Python/thread_pthread.h
    +++ b/Python/thread_pthread.h
    @@ -34,6 +34,8 @@
     #undef  THREAD_STACK_SIZE
     #define THREAD_STACK_SIZE       0x400000
     #endif
    +/* minimum size of the default thread stack size */
    +#define THREAD_STACK_MIN_DEFAULT 0x100000
     /* for safety, ensure a viable minimum stacksize */
     #define THREAD_STACK_MIN        0x8000  /* 32 KiB */
     #else  /* !_POSIX_THREAD_ATTR_STACKSIZE */
    @@ -167,7 +169,7 @@ unsigned long
     PyThread_start_new_thread(void (*func)(void *), void *arg)
     {
         pthread_t th;
    -    int status;
    +    int status = -1;
     #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
         pthread_attr_t attrs;
     #endif
    @@ -187,11 +189,15 @@ PyThread_start_new_thread(void (*func)(void *), void *arg)
         PyThreadState *tstate = PyThreadState_GET();
         size_t stacksize = tstate ? tstate->interp->pythread_stacksize : 0;
         tss = (stacksize != 0) ? stacksize : THREAD_STACK_SIZE;
    +    if (tss == 0 && THREAD_STACK_SIZE == 0) {
    +        if (pthread_attr_getstacksize(&attrs, &tss) != 0)
    +           goto thread_abort;
    +       if (tss != 0 && tss < THREAD_STACK_MIN_DEFAULT)
    +           tss = THREAD_STACK_MIN_DEFAULT;
    +    }
         if (tss != 0) {
    -        if (pthread_attr_setstacksize(&attrs, tss) != 0) {
    -            pthread_attr_destroy(&attrs);
    -            return PYTHREAD_INVALID_THREAD_ID;
    -        }
    +        if (pthread_attr_setstacksize(&attrs, tss) != 0)
    +           goto thread_abort;
         }
     #endif
     #if defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
    @@ -209,6 +215,7 @@ PyThread_start_new_thread(void (*func)(void *), void *arg)
                                  );
    
     #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
    +thread_abort:
         pthread_attr_destroy(&attrs);
     #endif
         if (status != 0)
    gpshead commented 2 years ago

    I left some comments on the PR. We can at least make it detect musl libc and improve its default behavior there.

    FWIW CPython's recursionlimit and the C stack have long been a thorn in the side of many things.

    Even when Python's thread_pthread.h is changed to set a better default when running under musl libc, there are still plenty of ways to run into problems. ex: Extension modules can create threads on their own that then call back into CPython. So can C/C++ code that embeds a Python interpreter. Those choose their own stack sizes. When they choose low values, surprising crashes ensue for people working on the Python side...

    As a feature (beyond just this issue): It'd be ideal to actually be able to introspect the C stack space remaining and issue a RecursionError whenever it falls below a threshold. Rather than just blindly using a runtime adjustable number that any Python code can tweak to allow crashes via sys.setrecursionlimit(). Other languages like Golang use a dynamic stack and allocate more on SIGSEGV.

    [meta/off-topic: Why does anyone use small thread stack sizes? This is often done to save virtual address space in applications that expect to have thousands of threads. It matters a lot less on 64-bit address space. But there are reasons...]