ivmai / bdwgc

The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)
https://www.hboehm.info/gc/
Other
2.98k stars 407 forks source link

disclaim_weakmap_test fails if FIND_LEAK #252

Closed ivmai closed 5 years ago

ivmai commented 5 years ago

Source: latest master Host: Ubuntu/x64 How to reproduce: ./configure --enable-werror --enable-gc-assertions --disable-parallel-mark --disable-thread-local-alloc && make -j check CFLAGS_EXTRA="-D FIND_LEAK"

The error in disclaim_weakmap_test.log is either Assertion failure, line 407: pair_new(p0, p1) == pop[t] Or pthread_mutex_trylock: Invalid argument

The issue is also reported in #250. The test was added in pull #239.

+ @paurkedal

paurkedal commented 5 years ago

The assertion failure is most likely due to GC_print_all_errors calling GC_free on objects which should have been kept according to the disclaim callback. I haven't though about how to deal with it yet.

The EINVAL issue is more obscure. The failures seem to happen only from the main thread, while subthreads are happily locking and unlocking. According the the pthread_mutex_lock POSIX man page, the cause of EINVAL is related to thread priority when using the PTHREAD_PRIO_PROTECT mutex protocol attribute. However, the default is PTHREAD_PRIO_NONE, and we're not setting it. The man page also suggests that EINVAL may be returned if the argument is invalid, as one might expect. But if that was the case, the calls in subthreads should have failed, as well.

ivmai commented 5 years ago

The issue is caused by GC_gcollect invoked at-exit (after weakmap is destroyed). It's easy to fix, I'll do it.

ivmai commented 5 years ago

Fixed by commit 20cfa75

paurkedal commented 5 years ago

Thanks!

ivmai commented 5 years ago

It looks like it is not fixed completely: https://travis-ci.org/ivmai/bdwgc/jobs/458917783 failed with: Checksum failure for 0x2af5d4fff1e8 = (0x2af5d4fff1e8, 0x2af5d4fdd6c8) at 413

paurkedal commented 5 years ago

Also I still get the assertion failure occasionally, so yes, seems like only the pthread error was solved. I think the cause for the remaining errors are more clear though:

GC_free on objects from the weak map will invalidate objects referring to them, and this is what happens in leak detection mode, as the disclaim notifier is not consulted. This can cause two identically hash-constructed objects ending up two different live pointers, as directly apparent from the assert failure. It can also cause the checksum failure of a referrer when the referred object is re-initialized after GC_free.

A solution would be 1) to check the disclaim callback also in GC_reclaim_check. Though I don't think using disclaim notifiers in conjunction with FIND_LEAK is useful, so we could also 2) point out in the documentation that explicit calls to GC_free and leak detection mode will not consult disclaim notifiers. For the latter solution we would skip over this unit test for leak detection mode.

ivmai commented 5 years ago

Got it. I think we could choose the 2nd variant or a kind of this (this seems to be consistent with the fix in #162). Probably GC_finalized_malloc could return null in this case. Do you have time to make a patch, please? (if not, I think I could do it).

paurkedal commented 5 years ago

I also preferred the 2nd option. I'll send a PR shortly.

ivmai commented 5 years ago

Fixed in master in somewhat different way - skipping registration of proc (in GC_register_disclaim_proc) and skipping hiding of pointers in the test.