ivmai / bdwgc

The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)
https://www.hboehm.info/gc/

Marking regions as uncollectable #158

Closed: ckaran closed this issue 7 years ago

ckaran commented 7 years ago

I'm having an issue involving battling garbage collectors. Specifically, I'm binding Python to bdwgc via ctypes. Python is garbage collected internally, but by its own garbage collector (not bdwgc), and the C types that ctypes creates are managed by the Python garbage collector. This becomes an issue when bdwgc sees those regions and walks them during a collection: either the Python garbage collector or bdwgc ends up calling free() on a region a second time, and in either case I get a segfault. Is it possible to mark regions of memory as uncollectable? Those regions may hold pointers to memory that is managed by bdwgc, so they need to be scanned, but bdwgc must not deallocate the regions allocated by Python.

ivmai commented 7 years ago

If I understand correctly, you should use GC_MALLOC_UNCOLLECTABLE. Sorry for the late response.

PS. It is better to ask such questions on Stack Overflow: https://stackoverflow.com/questions/tagged/boehm-gc

ckaran commented 7 years ago

It turned out that the issue wasn't what I thought it was. The problem is that the garbage collectors don't walk each other's heaps, so if I had a pointer on the Python heap pointing to a block of memory on the BDWGC heap, the BDWGC collector would never see that pointer and would eventually collect the memory. Changing the C library to use GC_MALLOC_UNCOLLECTABLE() everywhere would have been difficult and would likely have led to memory leaks, so I did the following:

1) Turn off the BDWGC collector via GC_disable().
2) Allocate a block of memory using GC_MALLOC(), and store the pointer in a variable p.
3) Allocate a small uncollectable block via GC_MALLOC_UNCOLLECTABLE(), and store p in there.
4) Tie the lifetime of the uncollectable block to some Python object's lifetime.
5) Turn BDWGC back on via GC_enable().

BDWGC scans uncollectable blocks for pointers, so by storing p in those blocks, I'm guaranteed that the objects they point to in BDWGC-controlled memory won't be reaped. Tying the uncollectable blocks to the lifetime of the Python objects meant p was valid for at least as long as the Python objects were, which ensured that I never encountered stale pointers on the Python side.

I know that this is probably not the best way of doing things, but it works reasonably well for my purposes.
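
For illustration, the five steps above could be sketched from the Python side roughly as follows. This is a minimal sketch under assumptions, not the code actually used in this issue: `load_bdwgc` and `Pinned` are hypothetical names, the library is assumed to be discoverable as `gc` by `ctypes.util.find_library`, and the lifetime link in step 4 is assumed to be done with `weakref.finalize`. Because `GC_MALLOC` and `GC_MALLOC_UNCOLLECTABLE` are C macros, the sketch calls the exported functions `GC_malloc` and `GC_malloc_uncollectable` instead.

```python
import ctypes
import ctypes.util
import weakref


def load_bdwgc():
    """Return a ctypes handle to libgc, or None if it is not installed.

    Hypothetical helper: the library name "gc" is an assumption.
    """
    name = ctypes.util.find_library("gc")
    if name is None:
        return None
    lib = ctypes.CDLL(name)
    # Declare signatures so pointers are not truncated to a C int.
    for fn in ("GC_malloc", "GC_malloc_uncollectable"):
        getattr(lib, fn).restype = ctypes.c_void_p
        getattr(lib, fn).argtypes = [ctypes.c_size_t]
    lib.GC_free.restype = None
    lib.GC_free.argtypes = [ctypes.c_void_p]
    for fn in ("GC_init", "GC_disable", "GC_enable"):
        getattr(lib, fn).restype = None
    lib.GC_init()
    return lib


class Pinned:
    """Hypothetical wrapper: keeps one BDWGC block alive while it exists."""

    def __init__(self, lib, size):
        lib.GC_disable()                      # step 1: no collections for now
        try:
            ptr = lib.GC_malloc(size)         # step 2: the collectable block p
            if not ptr:
                raise MemoryError("GC_malloc failed")
            # step 3: a pointer-sized uncollectable block that holds p
            root = lib.GC_malloc_uncollectable(ctypes.sizeof(ctypes.c_void_p))
            if not root:
                raise MemoryError("GC_malloc_uncollectable failed")
            ctypes.c_void_p.from_address(root).value = ptr
        finally:
            lib.GC_enable()                   # step 5: collections allowed again
        self.ptr = ptr
        # step 4: free the uncollectable block when this Python object dies;
        # after that, BDWGC is free to reclaim p on a later collection.
        self._finalizer = weakref.finalize(self, lib.GC_free, root)
```

With the library installed, usage would be along the lines of `lib = load_bdwgc()` followed by `obj = Pinned(lib, 64)`, passing `obj.ptr` to the C library for as long as `obj` is alive. The finalizer deliberately captures only `lib.GC_free` and the block's address, not `self`, so it does not keep the Python object alive.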