ysbaddaden / gc

A garbage collector for Crystal

Error when building my program #11

Closed Dan-Do closed 2 years ago

Dan-Do commented 2 years ago
immix.cr:44:5
44 | Scheduler.reschedule
      ^--------
Error: undefined constant Scheduler

I am using crystal 1.2.1

Dan-Do commented 2 years ago

After changing it to Crystal::Scheduler the program builds fine, but it crashes with a segmentation fault when run.

ysbaddaden commented 2 years ago

I last worked on this GC years ago, and Crystal has changed a lot since then.

[0] from 0x00005555556214e1 in checkout+17 at /usr/share/crystal/src/fiber/stack_pool.cr:27
[1] from 0x00005555556235be in initialize+126 at /usr/share/crystal/src/fiber.cr:90
[2] from 0x0000555555623528 in new+168 at /usr/share/crystal/src/fiber.cr:88
[3] from 0x00005555555d205c in spawn:name+44 at /usr/share/crystal/src/concurrent.cr:61
[4] from 0x000055555560fd9f in init+63 at /home/github/immix.cr/src/immix.cr:13
[5] from 0x00005555556c8e7e in main+30 at /usr/share/crystal/src/crystal/main.cr:35
[6] from 0x00005555555ccef6 in main+6 at /usr/share/crystal/src/crystal/main.cr:119

Immix spawns a Fiber at startup to run the GC collector loop. That spawn fails while trying to access @mutex.

Delaying the spawn of the Fiber fixes that segfault (see 9fb592c6e86a67e3a937abb6a957997f60a74462), which suggests the GC allocator is working. But then I get a segfault when collecting in a larger program (samples/http_server.cr):

[0] from 0x00005555555621d4 in current+20 at /usr/share/crystal/src/crystal/system/unix/pthread.cr:62
[1] from 0x00005555555840fa in current_fiber+10 at /usr/share/crystal/src/crystal/scheduler.cr:16
[2] from 0x00005555555840fa in current+10 at /usr/share/crystal/src/fiber.cr:161
[3] from 0x00005555555840fa in collect+10 at /home/github/immix.cr/src/immix.cr:38
[4] from 0x00005555555840fa in gc_collect+10 at /home/github/immix.cr/src/immix.cr:4
[5] from 0x00005555555edab7 in GlobalAllocator_tryCollect+294 at src/global_allocator.c:147
[6] from 0x00005555555edab7 in GC_GlobalAllocator_nextBlock+359 at src/global_allocator.c:164
[7] from 0x00005555555ee59d in LocalAllocator_initCursor+8 at src/local_allocator.c:10
[8] from 0x00005555555ee59d in GC_LocalAllocator_allocateSmall+125 at src/local_allocator.c:135
[9] from 0x000055555557d47b in malloc+10 at /home/github/immix.cr/src/immix.cr:22

I tried to remove Fibers altogether, but it still crashes.

I guess there is something wrong. Maybe in the C library, maybe in the Crystal integration.

It would be interesting to use the GC in a C program to validate the library's integrity.

Dan-Do commented 2 years ago

Thank you for looking into this. I am eager to test this promising GC. I need speed 🐎

ysbaddaden commented 2 years ago

If you really need to optimize memory or speed, start by tuning BDWGC through its environment variables. This GC is experimental. It worked to some extent at one point (against Crystal ~0.24), performing acceptably better than BDWGC with a simple HTTP::Server and with the Crystal compiler itself, though using more memory. Anyway, it's currently broken :sob:

It's also not thread-safe, hence not compatible with MT. The paper does take thread safety into account, but I'm missing some mutexes here and there (e.g. in GlobalAllocator).
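
To illustrate the kind of locking that is missing, here is a minimal sketch of a shared block allocator guarded by a pthread mutex. It is not this library's code: `toy_global_allocator`, `toy_next_block`, `BLOCK_SIZE` and `BLOCK_COUNT` are hypothetical names and values chosen for the example, and the real GlobalAllocator has more state to protect.

```c
#include <assert.h>   /* assert() for callers that check the results */
#include <pthread.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define BLOCK_SIZE  32768
#define BLOCK_COUNT 8

/* Toy stand-in for a global allocator: a fixed heap carved into blocks,
 * handed out one at a time. All mutable state lives behind `lock`. */
typedef struct {
    pthread_mutex_t lock;
    size_t next_free;                           /* index of the next unused block */
    unsigned char heap[BLOCK_COUNT][BLOCK_SIZE];
} toy_global_allocator;

static toy_global_allocator toy_ga = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Returns the next unused block, or NULL once the heap is exhausted.
 * The mutex makes the read-and-increment of `next_free` atomic, so two
 * threads can never be handed the same block. */
void *toy_next_block(toy_global_allocator *ga) {
    void *block = NULL;

    pthread_mutex_lock(&ga->lock);
    if (ga->next_free < BLOCK_COUNT) {
        block = ga->heap[ga->next_free];
        ga->next_free++;
    }
    pthread_mutex_unlock(&ga->lock);

    return block;
}
```

Each thread-local allocator would call `toy_next_block(&toy_ga)` when it runs out of space. A real collector would also need to hold the lock (or stop the other threads) while it collects.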

Dan-Do commented 2 years ago

Thank you @ysbaddaden. What do you think about tweaking BDWGC to use mimalloc or rpmalloc? Would it gain any speed? Is it doable for a C novice?

ysbaddaden commented 2 years ago

Huh? You can't change the allocator of a GC... the allocator is built into its design :confused:

ysbaddaden commented 2 years ago

I investigated a bit, and it broke at different points during Crystal's development. I found and quickly patched the problematic changes in Crystal, but at some point the Crystal binaries went from LLVM 4 to LLVM 8 and the generated binaries started crashing (on Linux glibc).

For example, if I compile using the 0.32.0 stdlib with the official Crystal 0.30.1 binary (and a patch to thread.cr), then programs run. If I use the official 0.31.0 binary, I get an immediate segfault:

$ make -B CUSTOM="-DGC_DEBUG"
$ ~/src/crystal/bin/crystal spec -Dgc_none
Using compiled compiler at `.build/crystal'
GC: heap size=4194304 start=0x7f469a4b0000 stop=0x7f469a8b0000 large_start=0x7f42b7cd8000 large_stop=0x7f42b80d8000
GC: malloc object=0x7f469a4b0100 size=120 actual=136 atomic=0 ptr=0x7f469a4b0110
Program exited because of a segmentation fault (11)

Yeah, it crashes right after the first GC.malloc :scream:

Either something changed in Crystal's LLVM codegen between 0.30.1 and 0.31.0, or LLVM itself changed something in its codegen, and this is impacting this GC library.

Dan-Do commented 2 years ago

@ysbaddaden Did you try to compile using crystal/stdlib 1.4 and LLVM 14?

ysbaddaden commented 2 years ago

No, I can't install the LLVM 14 packages for Ubuntu Bionic on... Ubuntu Bionic. It assumes libgcc-s1 which is only available since Focal.

Still, I think there is an issue returning the pointer from GC_malloc() to GC.malloc(), or something along those lines. The GC is able to malloc, and the C tests seem to prove it, though maybe those tests could try to allocate a struct and manipulate its contents. Here, as soon as the GC does a malloc, Crystal crashes trying to access an ivar, so it got a wrong pointer. A quick gdb session shows none of the registers hold a pointer to the allocated memory, so :shrug:
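
A C test that allocates a struct and manipulates its contents could look roughly like the sketch below. It takes the allocator as a function pointer so it can be tried against plain `malloc` first. The `alloc_fn` signature, `node_t` and `check_struct_roundtrip` are assumptions made for this example; the library's real allocation entry point may take different arguments and would need a small wrapper.

```c
#include <assert.h>   /* assert() for callers that check the result */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed allocator shape: size in, pointer out (same as malloc). */
typedef void *(*alloc_fn)(size_t size);

/* A struct mixing a pointer, an integer and a byte array, so a wrong or
 * overlapping allocation is likely to corrupt at least one field. */
typedef struct node {
    struct node *next;
    uint64_t id;
    char tag[16];
} node_t;

/* Allocates `count` nodes, links them into a list, then walks the list
 * and verifies every field. Returns 0 on success, a negative code on the
 * first failure. Nothing is freed, as would be the case under a GC. */
int check_struct_roundtrip(alloc_fn alloc, size_t count) {
    node_t *head = NULL;

    for (size_t i = 0; i < count; i++) {
        node_t *n = alloc(sizeof *n);
        if (n == NULL) return -1;                           /* allocation failed */
        if ((uintptr_t)n % sizeof(void *) != 0) return -2;  /* misaligned pointer */
        n->next = head;
        n->id = i;
        memset(n->tag, (int)('a' + i % 26), sizeof n->tag);
        head = n;
    }

    /* Nodes were pushed at the head, so ids come back in descending order. */
    size_t expected = count;
    for (node_t *n = head; n != NULL; n = n->next) {
        if (expected == 0) return -5;                       /* list longer than count */
        expected--;
        if (n->id != expected) return -3;                   /* id was overwritten */
        for (size_t j = 0; j < sizeof n->tag; j++) {
            if (n->tag[j] != (char)('a' + expected % 26)) return -4;
        }
    }
    return expected == 0 ? 0 : -5;                          /* list shorter than count */
}
```

Running `check_struct_roundtrip(malloc, 1000)` first confirms the check itself is sound. Passing a wrapper around the GC's allocator then exercises the same path without Crystal involved.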

ysbaddaden commented 2 years ago

I investigated further (debugging with gdb), and the current HEAD is working with Crystal. Delaying the collector fiber spawn did work around the above issue correctly. Immix GC can allocate and collect memory. This is made evident by running spec/gc_spec.cr with the Immix GC built with -DGC_DEBUG.

Dan-Do commented 2 years ago

I have actually modified many things in my own app code since then, so I don't know whether the problem was in my source or not. At least it works now. Thank you for looking into this.