timholy / Revise.jl

Automatically update function definitions in a running Julia session
https://timholy.github.io/Revise.jl/stable
Other
1.21k stars 110 forks source link

Possibility of lost events due to use of edge-triggered Condition in entr(...) #837

Open frankier opened 2 months ago

frankier commented 2 months ago

@JanisErdmanis caught a possible race-condition in Revise.entr(...)

In particular in these lines:

https://github.com/timholy/Revise.jl/blob/13a5eb7986ee1239ff938a92043c13fee04579cc/src/callbacks.jl#L163-L166

It looks like any event fired while Revise.revise() is running will be lost. I guess something level-triggered like https://docs.julialang.org/en/v1/base/parallel/#Base.Event could be used. This would assume that there is only one listener, which is probably reasonable.

frankier commented 2 months ago

I believe this is still a problem even though the file watching tasks this revision_event watching loop should typically end up running in the same thread (the file watching tasks are sticky). The reason is is that the Julia interpreter can choose to yield at any time and particularly it may yield during IO performed in Revise.revise(...).

timholy commented 1 month ago

I'm not sure there's a great solution for this short of file-locking. xref #841

frankier commented 1 month ago

I'm not sure if locking one of the places the event is fired with the revision_lock solves the issue. The problem is that for Condition, notify(...) will only wake up all blocking tasks that are blocking at that moment in time.

Imagine we have two tasks, and task 1 is running entr(...), then:


          ----------------------------------------------------------------------------------------------
Task 1    |  wait(revision_event)  |      Revise.revise()     | wait(revision_event)
          ----------------------------------------------------------------------------------------------

Task 2                             |                       |
                             notify(revision_event)  notify(revision_event)

                                                           ^ Event is lost  ^ Waiting despite changes

Another issue is that it appears Base.Condition is being used which is not threadsafe, apparently https://docs.julialang.org/en/v1/base/parallel/#Base.Threads.Condition should be used in this case.

The solution is to use something level triggered. https://docs.julialang.org/en/v1/base/parallel/#Base.Event together with autoreset seems like it should work, although requires Julia 1.8. Note that in this case we need to have only a single "listener", which should probably be fine I guess? Or are there situations where there need to be multiple listeners? In the latter case, I guess some kind of pub/sub pattern would be needed where each listener adds its own autoresetting event, and then notify notifies all of them.

timholy commented 1 month ago

Ah, I assumed it was because files were being added to revision_queue in the middle of revise running (which would also be a way for this to happen). So #846 should help, but you're right this doesn't completely solve the problem.

Soon I'll ditch support for anything less than Julia 1.10, so perhaps this can be fully fixed then.

timholy commented 1 month ago

https://github.com/JuliaLang/julia/pull/55877 should fix this, with a couple of changes to Revise. Too many deadlines now but it's on the agenda.