socketry / io-event

Open fifo blocks other fibers (macOS) #86

Open jpcamara opened 8 months ago

jpcamara commented 8 months ago

A core Ruby test in make test-all is failing with MN threads on macOS, so I tried the same scenario on async + io-event to see how it behaves there. It gets stuck in a (seemingly) identical way. The following script hangs indefinitely. It is based on the test_open_fifo_does_not_block_other_threads test found in test/ruby/test_io.rb (https://github.com/ruby/ruby/blob/master/test/ruby/test_io.rb).

require "tempfile"
require "async"

def mkcdtmpdir
  Dir.mktmpdir {|d|
    Dir.chdir(d) {
      yield
    }
  }
end

mkcdtmpdir {
  File.mkfifo("fifo")

  Async do |task|
    t1 = task.async {
      open("fifo", "r") {|r|
        r.read
      }
    }
    t2 = task.async {
      open("fifo", "w") {|w|
        w.write "foo"
      }
    }
    # Hangs indefinitely
    puts t1.wait
  end
}

The "fix" I've found so far for MN threads is to change the rb_io_read_memory call:

// return (ssize_t)rb_thread_io_blocking_call(internal_read_func, &iis, fptr->fd, RB_WAITFD_IN);
return (ssize_t)rb_thread_io_blocking_call(internal_read_func, &iis, fptr->fd, RB_WAITFD_IN | RB_WAITFD_OUT);

Once the last argument goes from RB_WAITFD_IN to RB_WAITFD_IN | RB_WAITFD_OUT, it starts "working" correctly. I haven't had a chance to dig into it much yet, but since the fiber scheduler gets stuck in the same way, I'm also opening an issue here.
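
For comparison, here is a thread-based sketch of the same scenario. It is modeled on the core test named above, not copied from it. The helper name fifo_roundtrip_with_threads is made up for illustration, and it uses Dir.mktmpdir with an absolute path instead of mkcdtmpdir. With regular threads, the blocking open on one side is not expected to stall the other thread, so this should finish rather than hang.

```ruby
require "tmpdir"

# Hypothetical helper (not part of Ruby, async, or io-event): sends `message`
# through a FIFO using one reader thread and one writer thread.
def fifo_roundtrip_with_threads(message)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "fifo")
    File.mkfifo(path)

    # The read-side open blocks until the write side has been opened.
    reader = Thread.new { File.open(path, "r") { |r| r.read } }
    # The write-side open blocks until the read side has been opened.
    writer = Thread.new { File.open(path, "w") { |w| w.write(message) } }

    writer.join
    reader.value # the string read from the FIFO
  end
end

puts fifo_roundtrip_with_threads("foo")
```

Swapping Thread.new for task.async inside an Async block gives the shape of the script that hangs.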

ioquatix commented 8 months ago

Okay, I will take a look. Thanks for the detailed report and reproduction.

jpcamara commented 8 months ago

You're welcome! I'm noticing now that it never even makes it to t1.wait:

mkcdtmpdir {
  File.mkfifo("fifo")

  puts "async"
  Async do |task|
    puts "t1.async"
    t1 = task.async {
      open("fifo", "r") {|r|
        r.read
      }
    }
    puts "t2.async"
    t2 = task.async {
      open("fifo", "w") {|w|
        w.write "foo"
      }
    }
    # Hangs indefinitely
    puts "time to wait"
    puts t1.wait
  end
}

All I see is

async
t1.async

and then it hangs, so it seems to get stuck just trying to run t1 at all.
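
That would be consistent with how FIFOs behave on POSIX systems: a blocking read-only open does not return until something opens the FIFO for writing. If that open stalls the whole event loop thread, t2 never gets a turn to open the write side. The sketch below only illustrates the open semantics. It is not a fix for io-event and says nothing about what the scheduler actually does. It assumes that opening the read side with File::NONBLOCK returns immediately, after which the write side can be opened on the same thread. The helper name is hypothetical.

```ruby
require "tmpdir"

# Hypothetical helper (illustration only): opens both ends of a FIFO on a
# single thread by making the read-side open non-blocking.
def fifo_roundtrip_nonblocking_open(message)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "fifo")
    File.mkfifo(path)

    # With File::NONBLOCK, a read-only open returns without waiting for a writer.
    File.open(path, File::RDONLY | File::NONBLOCK) do |reader|
      # A reader already exists, so this write-only open does not block.
      File.open(path, File::WRONLY) { |writer| writer.write(message) }
      # The writer is closed, so read returns the buffered data and then hits EOF.
      reader.read
    end
  end
end

puts fifo_roundtrip_nonblocking_open("foo")
```

Without File::NONBLOCK on the first open, the same single-threaded code would block on that line, which matches where the output above stops.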

ioquatix commented 8 months ago

https://github.com/socketry/io-event/pull/91

Trying to add some tests that reproduce the issue.