Closed ta0kira closed 2 years ago
Calling union_.value_.as_pointer_->object_->~TypeValue();
in WeakValue::Cleanup
doesn't seem to fix the problem. The issue is therefore probably due to BoxedValue::FromPointer
getting called with a value that isn't the most-derived class.
Looks like AnonymousOrder
is at least part of the problem, since it uses VAR_SELF
but isn't the most-derived class.
Still a problem after 3bd6837c0fc4e65c00068129834aaa8ecfd18537.
This actually was also a problem in 6f4a65d56c5f8add68ac88f6c7df21dbb7224e7a.
Looks like it was also an issue in 0.18.0.0
.
One possibility is that multiple threads are accessing the same BoxedValue
rvalue, rather than creating a new copy of the reference.
Capturing variables by value and excluding this
and Var_self
in ExtValue_ProcessThread::Call_start
doesn't seem to change anything. (Making changes to 0.18.0.0
for testing.)
After adding reference validation to LeakTest
:
tests/leak-check/LeakTest: Failed condition: Leaked LeakTest checkedValue at 0x1e64680 (S: 0 W: 2)
From LeakTest.doIteration at line 71 column 5 of "tests/leak-check/leak-check.0rx"
From LeakTest.runWith at line 35 column 7 of "tests/leak-check/leak-check.0rx"
From LeakTest.run at line 19 column 7 of "tests/leak-check/leak-check.0rx"
From main
One weak reference is from checkedValue
and the other is (probably) from BoxedValue
skipping the cleanup step.
So far all of the errors from SimulateRefs
have this exact order of operations:
Error in final state: object not cleaned up after last reference
Final state: S: 0 W: 1 AC: 0 WC: 0
weak1[ToStrongEnter]: S: 1 W: 3 AC: 0 WC: 0
shared1[DropSharedEnter]: S: 4097 W: 3 AC: 0 WC: 0
weak2[ToStrongEnter]: S: 4096 W: 3 AC: 0 WC: 0
weak1[ToStrongAlive]: S: 8192 W: 3 AC: 0 WC: 0
weak1[ToStrongDiscardWeak]: S: 4097 W: 3 AC: 0 WC: 0
weak1[DropSharedEnter]: S: 4097 W: 2 AC: 0 WC: 0
weak2[ToStrongDead]: S: 4096 W: 2 AC: 0 WC: 0
weak2[ToStrongCheckWeak]: S: 0 W: 2 AC: 0 WC: 0
The obvious problem seems to be that cleanup on the strong reference starts before the lock is subtracted for the unsuccessful second strong
operation. (I don't think that would have ever occurred to me.)
The solution was to use a proper lock for the two places that the fake lock was being used, and actually make it a spinlock.
WeakValue
constructor for BoxedValue
.BoxedValue::Cleanup
.Therefore, it's sufficient to just have the counters be atomic for all other uses, and use a spinlock for both cases above.
I went with the fake lock initially because I'd assumed that a spinlock would need to be used everywhere, rather than just in those two places. Using a spinlock everywhere made performance worse than just using std::shared_ptr
, and the fake lock fixed that performance problem.
tests/leak-check/LeakTest forever
shows that there's a slow memory leak that didn't exist inv0.18.1.0
. The rate seems to be about 1 object per ~300 iterations, which means that the leak is unlikely to actually occur in programs that aren't specifically designed to cause race conditions.The issue seems to not be present as of 51f75bd087f60b640893f37c17da0097d590588f (single allocation for
BoxedValue
counters and object) but present as of e058732b249d8b5e786eac89ece07af9455da067 (switch fromVar_self
toVAR_SELF
.)It's almost certainly a race condition because
valgrind
doesn't detect any memory leaks.