JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.56k stars 5.47k forks source link

`jl_record_backtrace` may miss tasks? #56046

Open d-netto opened 1 day ago

d-netto commented 1 day ago

We've been using jl_record_backtrace extensively in production to print backtraces when one of our servers hits a state of degraded performance or deadlock.

We're afraid, however, that there could be a ABA-like bug in jl_record_backtrace which may lead us to miss a task for which we're attempting to record the backtrace.

Here is the code for reference:

JL_DLLEXPORT size_t jl_record_backtrace(jl_task_t *t, jl_bt_element_t *bt_data, size_t max_bt_size) JL_NOTSAFEPOINT
{
    jl_task_t *ct = jl_current_task;
    jl_ptls_t ptls = ct->ptls;
    if (t == ct) {
        return rec_backtrace(bt_data, max_bt_size, 0);
    }
    bt_context_t *context = NULL;
    bt_context_t c;
    int16_t old = -1;
    while (!jl_atomic_cmpswap(&t->tid, &old, ptls->tid) && old != ptls->tid) {
        int lockret = jl_lock_stackwalk();
        // if this task is already running somewhere, we need to stop the thread it is running on and query its state
        if (!jl_thread_suspend_and_get_state(old, 1, &c)) {
            jl_unlock_stackwalk(lockret);
            if (jl_atomic_load_relaxed(&t->tid) != old)
                continue;
            return 0;
        }
        jl_unlock_stackwalk(lockret);
        if (jl_atomic_load_relaxed(&t->tid) == old) {
            jl_ptls_t ptls2 = jl_atomic_load_relaxed(&jl_all_tls_states)[old];
            if (ptls2->previous_task == t || // we might print the wrong stack here, since we can't know whether we executed the swapcontext yet or not, but it at least avoids trying to access the state inside uc_mcontext which might not be set yet
                (ptls2->previous_task == NULL && jl_atomic_load_relaxed(&ptls2->current_task) == t)) { // this case should be always accurate
                // use the thread context for the unwind state
                context = &c;
            }
            break;
        }
        // got the wrong thread stopped, try again
        jl_thread_resume(old);
    }
    if (context == NULL && (!t->ctx.copy_stack && t->ctx.started && t->ctx.ctx != NULL)) {
        // need to read the context from the task stored state
        jl_jmp_buf *mctx = &t->ctx.ctx->uc_mcontext;
#if defined(_OS_WINDOWS_)
        memset(&c, 0, sizeof(c));
        if (jl_simulate_longjmp(*mctx, &c))
            context = &c;
#elif defined(JL_HAVE_UNW_CONTEXT)
        context = t->ctx.ctx;
#elif defined(JL_HAVE_UCONTEXT)
        context = jl_to_bt_context(t->ctx.ctx);
#elif defined(JL_HAVE_ASM)
        memset(&c, 0, sizeof(c));
        if (jl_simulate_longjmp(*mctx, &c))
            context = &c;
#else
     #pragma message("jl_record_backtrace not defined for unknown task system")
#endif
    }
    size_t bt_size = 0;
    if (context)
        bt_size = rec_backtrace_ctx(bt_data, max_bt_size, context, t->gcstack);
    if (old == -1)
        jl_atomic_store_relaxed(&t->tid, old);
    else if (old != ptls->tid)
        jl_thread_resume(old);
    return bt_size;
}

The case which I think may be pathological is the following:

Is the case I'm describing here indeed pathological? Are there any invariants that I could be missing that will make this scenario impossible to occur?

Thanks in advance.

CC: @vtjnash.

vtjnash commented 1 day ago

Yes, good catch, there appears to be a missing re-assignment of old = -1; at the end of that loop which means in the ABA case, we accidentally actually acquire the lock on the thread despite not actually having stopped the thread; or in the counter-case, we try to run through this logic with old==-1 on the next iteration, and that isn't valid either (jl_thread_suspend_and_get_state should return failurea and the loop with abort too early)