socketry / async

An awesome asynchronous event-driven reactor for Ruby.
MIT License
2.04k stars 85 forks source link

Sometimes @selector is not set in Async::Scheduler::run_once!, causing NoMethodError #324

Closed paddor closed 2 days ago

paddor commented 3 days ago
NoMethodError: undefined method `idle_duration' for nil, Backtrace:
[
    [ 0] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/async/scheduler.rb:319:in `run_once!'",
    [ 1] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/async/scheduler.rb:283:in `run_once'",
    [ 2] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/async/scheduler.rb:359:in `block in run'",
    [ 3] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/async/scheduler.rb:353:in `handle_interrupt'",
    [ 4] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/async/scheduler.rb:353:in `run'",
    [ 5] "/home/user/.gem/ruby/3.3.1/gems/async-2.12.1/lib/kernel/sync.rb:26:in `Sync'",
    [ 6] "/home/user/dev/foo/lib/foo/actors/base.rb:65:in `run_reactor'",
ioquatix commented 3 days ago

Hmm, that's weird. I'll take a look.

paddor commented 3 days ago

I'm porting our core library from EventMachine to Async. Maybe Async is not the problem. I only see this error after sending SIGINT.

ioquatix commented 3 days ago

It should be robust but this is a tricky edge case. Are you able to share more of your code?

paddor commented 3 days ago

I don't think so. For now I've wrapped the load computation in that method in if @selector. I've just started porting, so there are still other problems. I'll keep porting/investigating, and come back.

ioquatix commented 3 days ago

Okay great... I'll see if I can figure out what's going on.

paddor commented 3 days ago

The port of the core library is mostly done. Apps using it can start, but the patch I made in async is still necessary.

ioquatix commented 2 days ago

Are you closing the scheduler from within an async block?

ioquatix commented 2 days ago
            begin
                @selector.select(interval)
            rescue Errno::EINTR
                # Ignore.
            end

            @timers.fire

            # Compute load:
            end_time = Async::Clock.now
            total_duration = end_time - start_time
            idle_duration = @selector.idle_duration

Between @selector.select(interval) and @selector.idle_duration you must have some how caused #close to be invoked... is that possible in your code?

paddor commented 2 days ago

Yeah, an Async-block was used to Async::Task#stop the final task.

So in this multi-process, actor model app, I used to do Async { actor.shutdown } in the SIGTERM handler of the child processes. This used to be wrapped in EM.add_timer(0) to avoid mutex interaction in trap context from Ruby's Logger (used in actor.shutdown). I've now moved away from the stdlib Logger, which means it works in trap context. That means there's no more Async { ... } block in the signal handler. It works fine now.

I'm still getting used to the very different thought model. I'm still too used to EM.

ioquatix commented 1 day ago

Generally speaking, you should avoid using traps in Async code. The event loop uses Ruby's normal trap mechanisms to convert them into exceptions which are raised at predictable times.