Open hugopl opened 4 years ago
Ah, additional info: I was using the experimental branch of jhass/crystal-malloc_pthread_shim:
malloc_pthread_shim:
  github: jhass/crystal-malloc_pthread_shim
  branch: track_alignments
If I use the master branch it works... and with this information I'm starting to believe that this issue should be filed against jhass/crystal-malloc_pthread_shim, not here, hehe
Uh. How exhausting. I'll have to see when I have more motivation to look into this to be honest :P
I understand, this isn't simple stuff and can easily keep someone busy for days.
~I'm going to try the approach of compiling a patched GLib and linking against it, replacing g_malloc, g_realloc, etc. with GC_MALLOC_ATOMIC and friends to see if I have some success.~
IMO this isn't even a bug in your generator, but a lack of support on the Crystal side for C libraries that use their own allocators, etc.
I spent some time today trying to fix this. My approach was something like:
// INIT_FIX just initializes all the real_* pointers and stores them in thread-local variables.
int GC_pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *args) {
    INIT_FIX;
    inside_gc = 1;  // mark that this thread is now inside the GC wrapper
    int res = real_GC_pthread_create(thread, attr, start_routine, args);
    inside_gc = 0;
    return res;
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *args) {
    INIT_FIX;
    if (inside_gc) {
        // Called from within the GC wrapper: go straight to the real function.
        return real_pthread_create(thread, attr, start_routine, args);
    } else {
        // Called from anywhere else (e.g. GLib): route through the GC so the thread gets registered.
        return GC_pthread_create(thread, attr, start_routine, args);
    }
}
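To make the idea behind the guard easier to follow, here is a minimal, self-contained sketch of the same pattern. It does not touch pthreads or bdwgc at all: `real_create`, `gc_create` and `shim_create` are hypothetical stand-ins for `real_pthread_create`, `GC_pthread_create` and the interposed `pthread_create`, and it assumes the GC wrapper ends up calling the public entry point again. Unlike the snippet above, the flag is set in the public entry point, which is just one possible arrangement.

```c
#include <assert.h>

/* Hypothetical counters, only here to observe which path was taken. */
static int real_calls = 0; /* times the "real" creator ran */
static int gc_calls = 0;   /* times the GC wrapper ran */

/* Per-thread flag: nonzero while this thread is inside the GC wrapper. */
static _Thread_local int inside_gc = 0;

int shim_create(int arg); /* stand-in for the interposed pthread_create */

/* Stand-in for the real pthread_create. */
static int real_create(int arg) {
    real_calls++;
    return arg;
}

/* Stand-in for GC_pthread_create: does its bookkeeping, then calls the
   public entry point again (assumed behaviour, see the lead-in). */
static int gc_create(int arg) {
    assert(inside_gc); /* only ever entered with the guard set */
    gc_calls++;
    return shim_create(arg);
}

int shim_create(int arg) {
    if (inside_gc) {
        /* Re-entered from the GC wrapper: break the recursion. */
        return real_create(arg);
    }
    inside_gc = 1;
    int res = gc_create(arg);
    inside_gc = 0;
    return res;
}
```

Without the flag, `shim_create` and `gc_create` would call each other forever. Keeping the flag thread-local means one thread being inside the wrapper does not send another thread's call down the unregistered path.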
I was happy for a few moments because it seemed to work... but no, I'm still getting crashes.
It seems GCJ (the GCC Java compiler) had this very same problem 14 years ago: https://www.redhat.com/archives/fedora-devel-java-list/2006-January/msg00002.html
Another approach I tried was to start from the GLib functions and set a flag to know whether I was inside a GLib function that calls pthread, so I could then call the GC version. I got fewer crashes, but it didn't work anyway.
So my question is: why does the shim need to touch malloc and friends? Isn't this just a problem of GLib not telling the GC that there's a new thread in town?
BTW, the shim doesn't work with -Dpreview-mt.
Enough for today, but I found something that seems interesting for this issue:
https://github.com/kubo/plthook
I'm going to try this the next time I sit down to work on this issue.
So my question is: why does the shim need to touch malloc and friends? Isn't this just a problem of GLib not telling the GC that there's a new thread in town?
As I understand it, bdwgc allocates using sbrk, so it needs to be made aware of any malloc'ed memory to not clobber it. Wrapping the pthread stuff is merely for registering the new stack to scan for live objects.
While https://github.com/ivmai/bdwgc/blob/6d4517c20d33ee6924dbc0d36ba3f8b358d1703e/doc/README.linux#L31-L47 does not mention malloc and I cannot find an explicit statement that all malloc calls should be GC_malloc calls, there are several hints towards that:
The README has this snippet quite prominently:
#define malloc(n) GC_malloc(n)
#define calloc(m,n) GC_malloc((m)*(n))
There's this note: https://github.com/ivmai/bdwgc/blob/6d4517c20d33ee6924dbc0d36ba3f8b358d1703e/include/gc_pthread_redirects.h#L27-L34
There's this macro: https://github.com/ivmai/bdwgc/blob/b6d93860a38275f4251929d3da7361b5e1419655/doc/README.macros#L223-L238
So yeah, I'm not sure about that, but IIRC I was still getting crashes or clobbered memory when just redirecting the pthread functions in our original case.
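One thing worth spelling out about the README macro quoted above: a `#define malloc(n) ...` redirect only affects code compiled with that define in scope, so a prebuilt library such as GLib would keep calling the system allocator. The sketch below illustrates just that point. `tracked_malloc` is a hypothetical stand-in for `GC_malloc` (it only counts calls), and `library_alloc` / `app_alloc` are made-up names modelling code compiled without and with the redirect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static size_t tracked_calls = 0;

/* Hypothetical stand-in for GC_malloc: counts calls, then defers to the
   system allocator. The macro is not defined yet, so this is plain malloc. */
static void *tracked_malloc(size_t n) {
    tracked_calls++;
    return malloc(n);
}

/* Compiled before the macro exists: models a prebuilt library. */
static void *library_alloc(size_t n) { return malloc(n); }

#define malloc(n) tracked_malloc(n)

/* Compiled with the macro in scope: models application code. */
static void *app_alloc(size_t n) { return malloc(n); }

#undef malloc

/* Returns how many tracked allocations the "application" call added. */
static size_t redirect_demo(void) {
    void *a = library_alloc(16);
    size_t after_library = tracked_calls; /* unchanged by the library call */
    void *b = app_alloc(16);
    size_t after_app = tracked_calls;
    assert(a != NULL && b != NULL);
    free(a);
    free(b);
    return after_app - after_library;
}
```

Only the allocation made through `app_alloc` is counted. Catching the library's allocations as well needs something that works at link or load time, which is presumably why a shim is involved here at all.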
I just released my pet project. I'm posting just to say thank you very much for the fast responses and fixes on all the issues I reported; this made my project possible... and in Crystal :-).
The 0.1.0 release is shipping with GC disabled by default (sic). I plan to implement just a few more things and then get back to this issue. Last time I tried to mimic what Inkscape does before initializing the GC, but I just got some beautiful crashes at startup...
Inkscape sets these flags:
LibGC.set_no_dls(1)
LibGC.set_all_interior_pointers(1)
I thought about creating this issue at the malloc_pthread_shim shard, but as it's a fix for this one, I created it here; please correct me if this isn't the right place.
To reproduce the crash:
The crash doesn't happen with GC disabled. I tried to reduce the code to create a minimal example, but it seems there wasn't enough entropy left to trigger the bug.
If I remove the require "malloc_pthread_shim" at src/main.cr it works... but crashes later on GtkSource::Language.set_language, as expected.
If I comment out lines 66 and 68 (some signal connections) at src/application.cr it works... but that's probably just a coincidence. Anyway, the crash happens inside setup_actions.
Here's the output with the following libraries compiled as debug:
exitcode 139.
GDB information about the crash:
Thread 9 backtrace:
Thread 1 backtrace:
The other threads were just waiting...
Environment: Arch Linux, Crystal 0.35, and debug versions of the following Gtk packages: glib2 2.64.2-1, gtk3 1:3.24.20-1, gtksourceview4 4.6.0-1.