microsoft / mimalloc

mimalloc is a compact general purpose allocator with excellent performance.
MIT License
9.74k stars 793 forks source link

2.1.2: Assertion failure on arm64 (16 vCPU graviton2) #851

Closed jkriegshauser closed 2 months ago

jkriegshauser commented 4 months ago

This seems similar to https://github.com/microsoft/mimalloc/issues/354 and I suspect the solution is similar.

Mimalloc v2.1.2

Assertion:

mimalloc: assertion failed: at "/builds/omniverse/externals/conan-transition/_build/conancache/67342464/.conan/data/mimalloc/2.1.2/_/_/build/b4cf5d3864c7b8c3fbf572c574794ebe20e5ce13/src/src/alloc.c":109, mi_heap_malloc_small_zero
  assertion: "heap->thread_id == 0 || heap->thread_id == tid"

This is running on a m6g.4xlarge AWS instance with a graviton2 CPU.

At application startup, _mi_heap_main.thread_id is initialized to 0xfffff7fefcd0 and that is what _mi_thread_id() returns as well.

However, at some point the return value of _mi_thread_id() changes:

(gdb) p /x _mi_thread_id()
$12 = 0x2995c90

It's not immediately apparent what triggers it.

Worth pointing out that neither value matches what GDB thinks the thread ID or the LWP is:

(gdb) i thr
  Id   Target Id         Frame
* 1    Thread 0xfffff7fee880 (LWP 27765) "test.unit" __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51

The library is built on a different host using Conan, and I'm not sure what the params of that host are.

Kernel is 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP and GLIBC is 2.27-3ubuntu1.5 on the host that it's running on.

daanx commented 4 months ago

That assertion should not fail -- I pushed a possible fix (to dev for now). Can you test it?

jkriegshauser commented 4 months ago

No dice. Seems like for whatever reason __has_builtin(__builtin_thread_pointer) is never evaluated on my compiler (g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0) even though __builtin_thread_pointer() absolutely compiles and works (see godbolt links below).

When i remove the defined(__has_builtin) && __has_builtin(__builtin_thread_pointer) part of that conditional it builds on this platform, and runs... kinda. I get to a point where mi_free() is called and crashes because segment is 0x0. Looks like I'm calling free() and mi_free() is being called instead somehow. I'm not trying to override malloc/free, in fact I don't want to, so i'm not sure what's going on there, but that's a separate issue.

I did a quick test on godbolt with variations of:

#include <stdio.h>

#if defined(__has_builtin) && !__has_builtin(__builtin_thread_pointer)
#error "Don't have"
#endif

int main()
{
    printf("%p\n", __builtin_thread_pointer());
}

[godbolt] x86-64, gcc 7.1 - 9.5 don't like the compound preprocessor statement #if defined(__has_builtin) && !__has_builtin(__builtin_thread_pointer), but this part succeeds if these are split to separate #if statements. Also odd that it succeeds, but then gives an error on the printf line: error: 'builtin_thread_pointer' is not supported on this target (even though the has_builtin succeeds)

[godbolt] x86-64 gcc 10.x are okay with the compound preprocessor statement, but still gives the error on the printf line: error: 'builtin_thread_pointer' is not supported on this target (even though the has_builtin succeeds)

[godbolt] x86-64 gcc 11.1 - 13.2 built succeesfully.

[godbolt] ARM64 gcc 7.5 still doesn't like the compound statement, but will compile separated out, but oddly the interior of the #if __has_builtin(__builtin_thread_pointer) is never reached. In the linked example it should error but it doesn't. So I can call it but it doesn't report that it has it.

daanx commented 3 months ago

Apologies for the late followup; wow-- thanks for trying all these variations .. but yikes, not sure how to deal with such inconsistent behaviour. Even without thread_pointer the assembly should work I think.. or otherwise the portable fall back to a thread local solution; can you try to run with that option in include/prim.h:248. ? (you would need to disable some of the preceding #ifs.

jkriegshauser commented 3 months ago

I’m on vacation this week and most of next but will try to spare some time to take a look when I’m back.

On Tue, Mar 26, 2024 at 9:51 AM Daan @.***> wrote:

Apologies for the late followup; wow-- thanks for trying all these variations .. but yikes, not sure how to deal with such inconsistent behaviour. Even without thread_pointer the assembly should work I think.. or otherwise the portable fall back to a thread local solution; can you try to run with that option in include/prim.h:248. ? (you would need to disable some of the preceding #ifs.

— Reply to this email directly, view it on GitHub https://github.com/microsoft/mimalloc/issues/851#issuecomment-2019119358, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAM6MXYFKV4LRGGXJJH65XTY2C2ABAVCNFSM6AAAAABDIYR5QKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJZGEYTSMZVHA . You are receiving this because you authored the thread.Message ID: @.***>

daanx commented 2 months ago

I think release v1.8.4 and v2.1.4 should work now without troubles? However, it seems the fix caused build errors on other platforms (issue #883) so I just changed the default to use TLS slots again --@jkriegshauser if you can, please try with the latest dev or dev-slice if this works for you.