$stdin reads are difficult to interrupt.

jordansissel commented 9 years ago

I can't seem to get $stdin to interrupt a read after being closed (closed from a separate thread). Under MRI, this works fine, but under JRuby 1.7.16 (and 1.7.13) it does not.

Basically, I have a program with two threads. One interrupts another. In this case, one thread is blocked reading from stdin and the other is attempting to interrupt that read. I can't figure out how to do this in JRuby, but was able to easily under MRI and this seems like a bug.

More specifically, this is for Logstash. Logstash has an input plugin that reads from stdin and needs to be, in some way, interrupted in order to perform a clean shutdown of the program. To date, I have found no way to do this with any combination of IO read, select, close methods and also any Thread#raise, java.lang.Thread.interrupt, etc. Bug? Feature? Not sure!

I've got no success here. Any thoughts?

I haven't tried java nio on stdin yet, nor System.stdin, nor having a special thread read from stdin and feed a queue read by the reader thread (so I can interrupt the 'reader thread' and ignore stdin's failure to quit)

Here's what I tried w/ reproducing ruby invocations:

Strategy: IO#read(1) and IO#close

# Ruby 2.1.2; 
% rvm 2.1.2 do ruby -e 't = Thread.new do p "#{RUBY_ENGINE} #{RUBY_VERSION}" => begin $stdin.read(1); rescue => e; e; end; end; sleep 1; $stdin.close; t.join'
{"ruby 2.1.2"=>#<IOError: stream closed>}

# JRuby 1.7.16
% rvm 1.7.16 do ruby -e 't = Thread.new do p "#{RUBY_ENGINE} #{RUBY_VERSION}" => begin $stdin.read(1); rescue => e; e; end; end; sleep 1; $stdin.close; t.join'
< mental timeout after 20 seconds I assumed this was unsuccessful >

I don't know what I expected (a blocked buffered read via IO#read was interrupted under MRI but not JRuby)

Strategy: IO::select loop + IO#close

# MRI 2.1.2
% rvm 2.1.2 do ruby -e 't = Thread.new do p "#{RUBY_ENGINE} #{RUBY_VERSION}" => begin 3.times { p :select => IO::select([$stdin], nil, [$stdin], 1) } ; rescue => e; e; end; end; sleep 1; $stdin.close; t.join'
{:select=>nil}
{"ruby 2.1.2"=>#<IOError: closed stream>}

# JRuby 1.7.16
% rvm 1.7.16 do ruby -e 't = Thread.new do p "#{RUBY_ENGINE} #{RUBY_VERSION}" => begin 3.times { p :select => IO::select([$stdin], nil, [$stdin], 1) } ; rescue => e; e; end; end; sleep 1; $stdin.close; t.join'
{:select=>[[#<IO:fd 0>], [], []]}
{:select=>[[#<IO:fd 0>], [], []]}
{:select=>[[#<IO:fd 0>], [], []]}
{"jruby 1.9.3"=>3}

MRI behaves correctly if we assume a close() on an fd doesn't interrupt a select() nor makes it immediately available for reading.

JRuby asserts incorrectly that $stdin is always ready for reading.

Strategy: IO#read + Thread#raise

% rvm 2.1.2 do ruby -e 't = Thread.new do p :read => begin $stdin.read(1); rescue => e; e; end; end; sleep 1; t.raise "please stop"; t.join'
{:read=>#<RuntimeError: please stop>}

% rvm 1.7.16 do ruby -e 't = Thread.new do p :read => begin $stdin.sysread(1); rescue => e; e; end; end; sleep 1; t.raise "please stop"; t.join'
< after 10 seconds I stopped waiting and killed this test >

MRI seems to allow reads to be interrupted via Thread#raise, but JRuby does not. I would propose that Thread#raise is probably not the best way to signal such an interrupt.

Strategy: IO#read + java.lang.Thread.interrupt

MRI ignored for this test. I Tried both #sysread and #read

# IO#sysread
% rvm 1.7.16 do ruby -e 'q = Queue.new; t = Thread.new do q << java.lang.Thread.currentThread; p :read => begin $stdin.sysread(1); rescue => e; e; end; end; sleep 1; p :trying; q.pop.interrupt; t.join'
:trying
<nothing, gave up>

# IO#read
% rvm 1.7.16 do ruby -e 'q = Queue.new; t = Thread.new do q << java.lang.Thread.currentThread; p :read => begin $stdin.sysread(1); rescue => e; e; end; end; sleep 1; p :trying; q.pop.interrupt; t.join'
:trying
<nothing, gave up>

I didn't have much hope here :P

Non-bugs (in my opinion)

At some point, I started wildly guessing to see what might work. Expectations were nothing and I was simply trying anything. ;P

IO#sysread(1) + IO#close. Both rubies did not interrupt sysread() after a close()
IO::select([$stdin], nil, nil, nil) + IO#close. 'select' was not interrupted on close. Maybe correct.

jordansissel commented 9 years ago

enebo commented 7 years ago

@jordansissel Is this an issue on 9.x. I imagine on linux/macos we are native and this behaves better but on windows it may still be bad?

kares commented 4 years ago

gave it a quick go, specifically (this is what LS relies upon to shutdown std input): ruby -e 't = Thread.new do p "#{RUBY_ENGINE} #{RUBY_VERSION}" => begin $stdin.read(1); rescue => e; e; end; end; sleep 1; $stdin.close; t.join'

does not behave as MRI in JRuby 9.2 ... read will hang, despite close, till a char is read from the stream (on Linux).

colinsurprenant commented 4 years ago

@kares I figured where my confusion (per my own tests $stdin was now interruptible) in logstash-plugins/logstash-input-stdin/pull/19 comes from:

In your repro above you use $stdin.read whereas in logstash we use $stdin.sysread and changing your repro to use sysread works and calling close correctly interrupts the read op.

I am trying to figure when that change actually occurred.

colinsurprenant commented 4 years ago

The change in behaviour for $stdin.sysread becoming interruptible on close occured in 9.1.15.0. I see there has been work done on IO but haven't looked for the exact change responsible for that.

headius commented 4 years ago

@colinsurprenant Thank you for the extra sleuthing!

I grabbed a thread dump from the read version, and it is indeed still sitting there happily waiting:

   java.lang.Thread.State: RUNNABLE
    at com.kenai.jffi.Foreign.invokeN3O1(Native Method)
    at com.kenai.jffi.Invoker.invokeN3(Invoker.java:1061)
    at jnr.enxio.channels.Native$LibC$jnr$ffi$5.read(Unknown Source)
    at jnr.enxio.channels.Native.read(Native.java:115)
    at jnr.enxio.channels.NativeDeviceChannel.read(NativeDeviceChannel.java:76)
    at org.jruby.util.io.PosixShim.read(PosixShim.java:157)
    at org.jruby.util.io.OpenFile$2.run(OpenFile.java:1326)
    at org.jruby.util.io.OpenFile$2.run(OpenFile.java:1316)
    at org.jruby.RubyThread.executeTask(RubyThread.java:1585)
    at org.jruby.RubyThread.executeTask(RubyThread.java:1571)
    at org.jruby.util.io.OpenFile.readInternal(OpenFile.java:1389)
    at org.jruby.util.io.OpenFile.ioBufread(OpenFile.java:1734)
    at org.jruby.util.io.OpenFile.bufreadCall(OpenFile.java:1774)
    at org.jruby.util.io.OpenFile.fread(OpenFile.java:1791)
    at org.jruby.RubyIO.doRead(RubyIO.java:3094)
    at org.jruby.RubyIO.read(RubyIO.java:3067)
    at org.jruby.RubyIO.read(RubyIO.java:3049)

Further, adding logging in ioBufread shows that it never seems to be interrupted at all... there's no eventual call to waitReadable and no looping back through to try again. It simply doesn't see the close happen.

The main difference between read and sysread appears to be when they try to select on the IO. sysread does it before the read, and read does it only after a read returns zero bytes or fails. I'm not sure if this was a flawed port of the logic in CRuby, or if perhaps CRuby changed the order of things at some point (perhaps to make read interruptible, even).

So the fix may be as simple as moving the select above the read. We will want to consider performance since the select logic is heavier than just going straight into the read.

headius commented 4 years ago

Looking at the CRuby code in io_bufread, it does indeed go to read before it ever tries to select...so this needs more investigation. I am now thinking that we are not interrupting the read properly, perhaps we should be using the same logic we use to interrupt subprocess reads...

https://github.com/jruby/jruby/blob/8a9e777b9d4e8460b272293a7fca33683d75815d/core/src/main/java/org/jruby/RubyProcess.java#L957-L991

kares commented 4 years ago

there's an update on the LS side, turns out only sysread is used on stdin so the current JRuby behaviour is "sufficient" (the channel ends up interruptible). except there's a difference in Windows where sysread will still hang forever and thus dropping the work-around and just relying on JRuby's STDIN is problematic.

jruby -e "t = Thread.new do begin $stdin.sysread(1); rescue => e; e; end; end; sleep 1; $stdin.close; t.join"

my wild guess would be the channel JRuby ends up with on Windows is not-selectable? the issue of course has a broader (read) perspective but I wanted to narrow LS' use case.

ccutrer commented 2 years ago

I'm pretty sure what I'm seeing in https://github.com/rubocop/rubocop/pull/10804 is the same, or at least very similar. Running jruby 9.3.6.0 (2.6.8) 2022-06-27 7a2cbcd376 OpenJDK 64-Bit Server VM 11.0.15+10-Ubuntu-0ubuntu0.22.04.1 on 11.0.15+10-Ubuntu-0ubuntu0.22.04.1 [x86_64-linux]. It's slightly different in that the read is happening against a pipe shared with a sub-process, and it seems to only apply if the sub-process also spawns another process. I wasn't able to distill it down to a simple repro besides bundle exec rspec ./spec/rubocop/server/rubocop_server_spec.rb in that repo, but without the backticks workaround. The jstack trace is identical, though. I tried all sorts of things to close the pipe and get the read to break, but nothing worked.

headius commented 2 years ago

@ccutrer Thanks for the additional example! The only fix I can think of here would still be to use a native pipe for IO.pipe, if indeed that is the type of IO causing this to be uninterruptible. We can't safely interrupt Java's NIO pipes because such interrupts have the side effect of closing the IO stream.

jruby / jruby