Open qinsoon opened 2 weeks ago
`donenotify` is a `GenericCondition{SpinLock}`. It is moved by the GC, but the code still accesses the old reference, which now reads as all zeros.
[2] unlock
@ ./locks-mt.jl:66 [inlined]
[3] unlock(c::Base.GenericCondition{SpinLock})
@ Base ./condition.jl:74
[4] _wait(t::Task)
@ Base ./task.jl:311
[5] wait
@ ./task.jl:347 [inlined]
So if we pin `donenotify` in `jl_new_task`, this issue disappears. I am not sure where the invalid reference comes from, though.
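To illustrate why pinning hides the problem, here is a minimal sketch of a toy copying step in C. It is not the Julia runtime or MMTk code: `pin_obj`, `pin_collect`, and `pin_read_after_gc` are hypothetical names, and the zeroing of the old copy only mimics what the debug build is described as doing.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical object layout: a pin flag plus one payload word. */
typedef struct {
    bool pinned;
    int  payload;
} pin_obj;

/* Toy "collection" of a single object: returns the object's address
 * after the cycle. Pinned objects stay put; others are copied and the
 * old copy is zeroed (an assumption modelled on the debug build). */
static pin_obj *pin_collect(pin_obj *obj, pin_obj *to_space) {
    if (obj->pinned)
        return obj;
    *to_space = *obj;
    memset(obj, 0, sizeof *obj);
    return to_space;
}

/* Reads the payload through a reference captured before collection,
 * i.e. a reference that the collector never updated. */
int pin_read_after_gc(bool pin) {
    pin_obj from = { .pinned = pin, .payload = 1 };
    pin_obj to = { 0 };
    pin_obj *stale = &from;        /* captured before the GC runs */
    (void)pin_collect(&from, &to); /* new address deliberately ignored */
    return stale->payload;
}
```

In this model the stale reference only stays valid when the object is pinned, which matches the observation above but does not explain which reference in the real code is left un-updated.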
First seen in https://github.com/mmtk/mmtk-julia/actions/runs/11097983519/job/30830015431?pr=170 in https://github.com/mmtk/mmtk-julia/pull/170 when running `test/threads`.

The actual error message is `unlock count must match lock count`, raised when we try to set an atomic int in `SpinLock` to 0 but find that the old value was already 0.

As this is related to copying, it is likely that the `SpinLock` accessed was moved. In our debug build, we zero the memory of a moved object, so if we follow the old reference, we find an object filled with 0s.

I am not sure how this is possible, as the `SpinLock` is reached through the `donenotify` field of `Task`, and it should be traced.

This is nondeterministic.
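The failure mode described above can be sketched in C11 as follows. This is a toy model, not the Julia or MMTk implementation: `toy_spinlock`, `toy_lock`, `toy_unlock`, `toy_move`, and `stale_unlock_fails` are hypothetical names, and the only behaviour taken from the report is that unlock swaps the atomic word to 0 and treats an old value of 0 as an error.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spin lock: one atomic word, 1 = held. */
typedef struct { atomic_int owned; } toy_spinlock;

static void toy_lock(toy_spinlock *l) {
    /* Spin until we are the one who flips 0 -> 1. */
    while (atomic_exchange(&l->owned, 1) != 0) { }
}

/* Returns false in the case where the report sees
 * "unlock count must match lock count": the old value was already 0. */
static bool toy_unlock(toy_spinlock *l) {
    return atomic_exchange(&l->owned, 0) != 0;
}

/* Toy move: copy the lock word to the new location and zero the old
 * one (assumption: mirrors the zeroing done in the debug build). */
static toy_spinlock *toy_move(toy_spinlock *from, toy_spinlock *to) {
    assert(from != to);
    atomic_init(&to->owned, atomic_load(&from->owned));
    atomic_store(&from->owned, 0);
    return to;
}

/* Lock, move the lock, then unlock through both references. */
bool stale_unlock_fails(void) {
    toy_spinlock from_space, to_space;
    atomic_init(&from_space.owned, 0);
    toy_lock(&from_space);

    toy_spinlock *stale = &from_space; /* reference not updated by the GC */
    toy_spinlock *fresh = toy_move(&from_space, &to_space);

    bool stale_ok = toy_unlock(stale); /* sees 0: the error case */
    bool fresh_ok = toy_unlock(fresh); /* sees 1: normal unlock */
    return !stale_ok && fresh_ok;
}
```

Under these assumptions, unlocking through the stale reference hits the error path while the moved copy unlocks normally, which is consistent with some reference to the lock not being updated after the move.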