Closed elcritch closed 1 year ago
Here's an example failure mode I think
But how could 3 threads access the same object with an RC == 1? Your example makes no sense.
D'oh! That's why I shouldn't try creating an example late at night. I'll try and make another. Or maybe just add a test to verify it works.
Ok, I re-worked an example and remembered to keep the total count 😆
1. count = 2
2. Both threads call `p.val.counter.load(Acquire)` and see a non-zero count.
3. Both then call `fetchSub(p.val.counter, 1, Release)` in parallel.
4. The `fetchSub` calls will be strictly ordered, so count will be 1, then 0.
So not a double free, but no one ever frees the SharedPtr.
Remember the count is stored as "natural count - 1" so if both threads use a sharedptr the count should be 1, not 2. So no wonder it leaks in your scenario as you still don't count like the code does.
It doesn't matter if it's the natural count - 1, as I used a "logical count". In the actual code base they'll both still just get `count == 1` and still leak.
Using natural count - 1:
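The interleaving can be replayed on a single thread to check the arithmetic. This is only a sketch: it assumes the destructor frees when `load(Acquire) == 0` and otherwise decrements, as the steps in this thread suggest. It uses `std/atomics` (so `moAcquire`/`moRelease` rather than `Acquire`/`Release`), and `replayLeak` and its local names are hypothetical, not part of the library.

```nim
import std/atomics

proc replayLeak(): tuple[freed, finalCount: int] =
  ## Replays the racy interleaving step by step on one thread.
  var counter: Atomic[int]   # hypothetical stand-in for p.val.counter
  counter.store(1)           # two owners, stored as "natural count - 1"

  # Both owners load before either one decrements.
  let seenByA = counter.load(moAcquire)
  let seenByB = counter.load(moAcquire)

  # Neither saw 0, so both take the "just decrement" branch.
  for seen in [seenByA, seenByB]:
    if seen == 0:
      inc result.freed       # the real code would destroy and deallocate here
    else:
      discard counter.fetchSub(1, moRelease)

  result.finalCount = counter.load(moRelaxed)

let r = replayLeak()
echo "freed: ", r.freed, ", final stored count: ", r.finalCount
```

With this ordering nothing is freed and the stored count ends at -1, which matches the leak described above.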
Ok, created an example (#46) that on my machine gives a few cases where the SmartPtr isn't freed. You can run it without the `os.sleep`, but then it requires higher numbers. It still happens 1 out of a million or so.
└─(15:53:58 on test-smartptrsleak)──> nim c -r "/Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak.nim"
Hint: used config file '/Users/jaremycreechley/.asdf/installs/nim/2.0.0/config/nim.cfg' [Conf]
Hint: used config file '/Users/jaremycreechley/.asdf/installs/nim/2.0.0/config/config.nims' [Conf]
Hint: used config file '/Users/jaremycreechley/projs/nims/nim.cfg' [Conf]
Hint: used config file '/Users/jaremycreechley/projs/nims/figuro/nim.cfg' [Conf]
Hint: used config file '/Users/jaremycreechley/projs/nims/figuro/config.nims' [Conf]
Hint: used config file '/Users/jaremycreechley/projs/nims/figuro/vendor/nim.cfg' [Conf]
Hint: used config file '/Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/nim.cfg' [Conf]
Hint: mm: arc; threads: on; opt: none (DEBUG BUILD, `-d:release` generates faster code)
38527 lines; 0.022s; 46.832MiB peakmem; proj: /Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak.nim; out: /Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak [SuccessX]
Hint: /Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak [Exec]
freeCounts: got: 9990 expected: 10000
/Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak.nim(51) tsmartptrsleak
/Users/jaremycreechley/.asdf/installs/nim/2.0.0/lib/std/assertions.nim(41) failedAssertImpl
/Users/jaremycreechley/.asdf/installs/nim/2.0.0/lib/std/assertions.nim(36) raiseAssert
/Users/jaremycreechley/.asdf/installs/nim/2.0.0/lib/system/fatal.nim(53) sysFatal
Error: unhandled exception: /Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak.nim(51, 1) `freeCounts.load(Acquire) == N` [AssertionDefect]
Error: execution of an external program failed: '/Users/jaremycreechley/projs/nims/figuro/vendor/threading/tests/tsmartptrsleak'
This bug also affects `mm:atomicArc`!
That's a great memory leak to squash!
Should I create a PR with a fix for it?
Ok, updated my PR to include the fix. I'll check out the Nim atomicArc too.
edit: Ok jk I'll leave the atomicArc to the compiler folks -- I'd need to understand the "destructorLift" stuff it seems. 😓
@elcritch hello, is it fixed by https://github.com/nim-lang/threading/pull/46?
Yep!
I'm not sure but the destructor for `SharedPtr[T]` seems off:
Here's an example failure mode I think:
1. thrA calls `fetchSub`, setting the count to zero.
2. thrB and thrC both do a `load(Acquire)` at the same time.
3. thrB and thrC proceed to free the shared pointer.
Using a `p.val.counter.fetchSub(1, Release) == 0` seems to work and match various examples. Though you need to set the initial count to 1. The C++ shared_ptr's all appear to use `swap` or `compareSwap` though.
Also, I was having failures with `fetchSub(1, Acquire)` giving me inconsistent results on my M1. From what examples and docs I've seen, you'd need to use `fetchSub(1, Release)` instead. I haven't read up enough on the memory semantics to confirm it. However, I was looking into it because I kept getting what looked like incorrect atomic counts on my M1. Switching to `Release` made the issue go away.
Here's what I ended up using in my own implementation:
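The approach described above, a single `fetchSub` whose return value decides who frees, could look roughly like the following minimal sketch. It is not the code from the original post. `Block`, `release`, and `freed` are hypothetical names, the count follows the "natural count - 1" convention mentioned in this thread, and the sketch uses `std/atomics` with `moAcquireRelease` rather than the plain `Release` discussed above, on the assumption that the freeing thread should also see the other owners' writes.

```nim
import std/atomics

type
  Block = object            # hypothetical control block, not the library's type
    counter: Atomic[int]    # stored as "natural count - 1"
    freed: bool             # stand-in for "value destroyed and memory released"

proc release(b: var Block): bool =
  ## Drops one reference. Returns true only for the caller that dropped
  ## the last one.
  # fetchSub returns the value *before* the subtraction, so the check and
  # the decrement are one atomic step and only one caller can observe 0.
  if b.counter.fetchSub(1, moAcquireRelease) == 0:
    b.freed = true          # real code would call `=destroy` and deallocShared
    result = true

var b: Block
b.counter.store(1)          # two owners
let first = release(b)      # not the last owner
let second = release(b)     # last owner, performs the free
echo "first: ", first, ", second: ", second, ", freed: ", b.freed
```

Because the decrement and the "am I last?" check are the same atomic operation, there is no window in which two owners can both skip the free or both perform it.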