Open kpamnany opened 6 days ago
Simple example from the original issue report in July 2023 (would have been an older julia version)
function example1()
lock = ReentrantLock()
@sync begin
for i in 1:10000
Threads.@spawn begin
@lock lock begin
sleep(0.001)
end
end
end
end
end
On an XL/32 this runs for 25s at 3200% cpu for the duration and does 535k allocations (20MiB). On my macbook (8 cores, running with julia -t 8) it runs for 25 s at 400% - 750% cpu and does 39.8M allocations (1.2GiB !!) (Note: if you comment out the
@lock
but leave the sleep, on my macbook it runs in 0.03s and does 100k allocations).
unlock(rl::ReentrantLock) does a notify(cond) which wakes all tasks waiting on the lock. As this is a lock, only one of these tasks can succeed, so it is unnecessary to schedule all of them. When there are a large number of tasks, the current approach wastes CPU time and can cause a quadratic amount of allocations for waitlist nodes.
It isn't straightforward to trivially fix this by changing the
notify
call to setall=false
--unlock
will only callnotify
ifReentrantLock.havelock
is0x02
(see here) and that only happens after the firstnotify
if all waiting tasks are woken up to race for the lock again (see here).Broken out from https://github.com/JuliaLang/julia/issues/50425 as this is a separate issue.