guard / listen

The Listen gem listens to file modifications and notifies you about the changes.
https://rubygems.org/gems/listen
MIT License
1.92k stars, 246 forks

Linux backend can hang at startup on sleep in Listen::Event::Loop#_wait_until_resumed #481

Closed ColinDKelley closed 3 years ago

ColinDKelley commented 4 years ago

We have been running listen in production for ~6 months across hundreds of k8s pods, each watching a file system that changes frequently. Occasionally the backend loop hangs and never processes changes. We have traced the hang to this code in Listen::Event::Loop:

def _wait_until_resumed(ready_queue)
  self.state = :paused
  ready_queue << :ready
  sleep
  self.state = :processing
end

That call to sleep is expecting to be awakened by this state transition:

state :processing_events, to: [:paused, :stopped] do
  processor.resume
end

where processor.resume is defined as:

def resume
  fail Error::NotStarted if stopped?
  return unless wait_thread
  _wakeup(:resume)
end
...
def _wakeup(reason)
  @reasons << reason
  wait_thread.wakeup
end

The bug here is a race condition. If the wakeup runs before the thread calls sleep, the wakeup is not remembered, so the sleep blocks forever in the :paused state.
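The lost wakeup can be reproduced in isolation with a sketch like the one below. It is not Listen code: the busy-wait is a hypothetical stand-in for whatever delays the thread between `ready_queue << :ready` and `sleep`, and a timed `sleep(0.5)` replaces the plain `sleep` so the script terminates instead of hanging. The behaviour shown is what I would expect on MRI.

```ruby
# Standalone illustration of a wakeup that arrives before sleep.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

ready_queue = Queue.new
slept_for = nil

worker = Thread.new do
  ready_queue << :ready

  # Busy-wait (deliberately not sleep) so the wakeup below lands
  # while this thread is still running.
  spin_until = monotonic_now + 0.5
  nil while monotonic_now < spin_until

  started = monotonic_now
  sleep(0.5) # the real code calls plain `sleep`, which would block forever
  slept_for = monotonic_now - started
end

ready_queue.pop  # the worker has announced it is "ready"...
worker.wakeup    # ...but it is not sleeping yet, so this wakeup is lost
worker.join

puts format("worker slept for %.2fs despite the earlier wakeup", slept_for)
```

If the wakeup had been remembered, the sleep would return almost immediately. Instead the worker sleeps for the full timeout.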

I believe the proper way to fix this is to have a ::Mutex protecting the state, with associated ::ConditionVariable object(s) that fire on the state transitions that need to be waited for. With that in place there is no need for the ::Queue objects ready_queue and @reasons, nor for any of the calls to sleep or Thread#wakeup.
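A minimal sketch of that approach, using Ruby's Mutex and ConditionVariable. `PausableLoop`, `wait_until_resumed` and the `timeout:` option are hypothetical names for illustration, not the Listen API or the code that was eventually merged. The key point is that the waiter re-checks the state under the mutex, so a resume that happens before the wait is still observed.

```ruby
# Hypothetical stand-in for the loop's state handling; not Listen's code.
class PausableLoop
  def initialize
    @mutex = Mutex.new
    @state_changed = ConditionVariable.new
    @state = :paused
  end

  def state
    @mutex.synchronize { @state }
  end

  def resume
    transition(:processing)
  end

  def pause
    transition(:paused)
  end

  # Blocks while the state is :paused. Returns the new state, or nil if
  # the optional timeout (in seconds) expires first.
  def wait_until_resumed(timeout: nil)
    @mutex.synchronize do
      deadline = timeout && (now + timeout)
      while @state == :paused
        remaining = deadline && (deadline - now)
        return nil if remaining && remaining <= 0
        # Releases the mutex while waiting and reacquires it on wakeup.
        @state_changed.wait(@mutex, remaining)
      end
      @state
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  def transition(new_state)
    @mutex.synchronize do
      @state = new_state
      @state_changed.broadcast
    end
  end
end

# Resume *before* the waiter starts: the transition is recorded in
# @state, so the waiter returns instead of hanging.
early = PausableLoop.new
early.resume
early_result = Thread.new { early.wait_until_resumed(timeout: 2) }.value

# Resume *after* the waiter may have started waiting: also fine.
late = PausableLoop.new
waiter = Thread.new { late.wait_until_resumed(timeout: 2) }
late.resume
late_result = waiter.value
```

Because the state itself is the source of truth, the order in which `resume` and `wait_until_resumed` run no longer matters. The loop around `wait` also handles spurious wakeups.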

ColinDKelley commented 3 years ago

Merged and ready to release in v3.3.0.