Closed zonkhead closed 4 years ago
Hmm ok seems like a problem with the UNIX socket subsystem... but only on Darwin?
Can you make a simple script or test that shows the problem?
You already have it: your UnixServer class. Invoke it with the script lines above. Maybe both socat and netcat don't work well with unix sockets on Darwin? Probably not, but I can't be certain.
Ah I understand now. Will investigate.
I did not have a socat command on my MacOS machine, so I assume you installed that separately. With socat (from homebrew) it appears to write 1024 bytes and then hang here:
"main" #1 prio=5 os_prio=31 tid=0x00007fa192004000 nid=0x1a03 runnable [0x0000700000b5f000]
java.lang.Thread.State: RUNNABLE
at com.kenai.jffi.Foreign.invokeL6(Native Method)
at com.kenai.jffi.Invoker.invokeL6(Invoker.java:455)
at jnr.enxio.channels.Native$LibC$jnr$ffi$1.kevent(Unknown Source)
at jnr.enxio.channels.KQSelector.poll(KQSelector.java:165)
at jnr.enxio.channels.KQSelector.select(KQSelector.java:145)
at jnr.unixsocket.example.UnixServer.main(UnixServer.java:47)
I don't see why it hangs here, but I did notice that socat sends data in 8192-byte chunks by default. If I change that to 1024-byte blocks, it gets further: 8 blocks successfully transit the server, and then the server exits with a "Broken pipe" error indicating the client has gone away.
For my test, I used a file that's 11423 bytes long, so I would expect to see that much data transit the server.
So two questions out of this:
Suspecting this might be an interaction between jnr-unixsocket and socat, I thought I'd play with the UnixClient we also have in examples.
With only 9 bytes written, it works fine.
If I modify it to send 9000 bytes, with a loop to read everything using the same 1024-byte buffer, it gets stuck after two 1024-byte buffers have been filled.
At that point, the server is in the same place it is for socat, with the client stuck here:
"main" #1 prio=5 os_prio=31 tid=0x00007ff309805000 nid=0x2303 runnable [0x00007000011c7000]
java.lang.Thread.State: RUNNABLE
at com.kenai.jffi.Foreign.invokeN3O1(Native Method)
at com.kenai.jffi.Invoker.invokeN3(Invoker.java:1061)
at jnr.enxio.channels.Native$LibC$jnr$ffi$1.read(Unknown Source)
at jnr.enxio.channels.Native.read(Native.java:115)
at jnr.unixsocket.impl.Common.read(Common.java:51)
at jnr.unixsocket.impl.AbstractNativeSocketChannel.read(AbstractNativeSocketChannel.java:72)
at jnr.unixsocket.UnixSocketChannel.read(UnixSocketChannel.java:253)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59)
- locked <0x000000076d9d4c78> (a java.lang.Object)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
- locked <0x000000076daa2bd0> (a sun.nio.ch.ChannelInputStream)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0x000000076daa2b40> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.Reader.read(Reader.java:100)
at jnr.unixsocket.example.UnixClient.main(UnixClient.java:55)
The client appears to be stuck reading from the server, while the server is stuck waiting for more data... even though we've dumped 9000 bytes on the wire!
Here's the patch for the client:
diff --git a/src/test/java/jnr/unixsocket/example/UnixClient.java b/src/test/java/jnr/unixsocket/example/UnixClient.java
index 3bdfc6c..aaafafc 100644
--- a/src/test/java/jnr/unixsocket/example/UnixClient.java
+++ b/src/test/java/jnr/unixsocket/example/UnixClient.java
@@ -42,6 +42,7 @@ public class UnixClient {
}
}
String data = "blah blah";
+ for (int i = 0; i < 1000; i++) data += "blah blah";
UnixSocketAddress address = new UnixSocketAddress(path);
UnixSocketChannel channel = UnixSocketChannel.open(address);
System.out.println("connected to " + channel.getRemoteSocketAddress());
@@ -51,17 +52,19 @@ public class UnixClient {
InputStreamReader r = new InputStreamReader(Channels.newInputStream(channel));
CharBuffer result = CharBuffer.allocate(1024);
- r.read(result);
- result.flip();
- System.out.println("read from server: " + result.toString());
- final int status;
- if (!result.toString().equals(data)) {
- System.out.println("ERROR: data mismatch");
- status = -1;
- } else {
- System.out.println("SUCCESS");
- status = 0;
+ while (r.read(result) > 0) {
+ result.flip();
+ System.out.println("read from server: " + result.toString());
+ result.clear();
}
- System.exit(status);
+// final int status;
+// if (!result.toString().equals(data)) {
+// System.out.println("ERROR: data mismatch");
+// status = -1;
+// } else {
+// System.out.println("SUCCESS");
+// status = 0;
+// }
+// System.exit(status);
}
}
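The read loop the patch introduces can be shown as a standalone program. This is a minimal sketch of the same drain-until-EOF pattern; it uses java.nio.channels.Pipe instead of a unix socket so it runs without jnr-unixsocket, and the class and method names are my own:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;

public class DrainExample {
    // Keep reading 1024-byte chunks until the channel reports EOF (-1),
    // instead of assuming a single read() returns everything.
    static int drain(ReadableByteChannel ch) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        int total = 0;
        int n;
        while ((n = ch.read(buf)) >= 0) {
            total += n;
            buf.clear(); // reset the buffer so the next read starts fresh
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        Pipe pipe = Pipe.open();
        byte[] payload = new byte[9000]; // same 9000 bytes as the modified client

        // Write from another thread so a full pipe buffer can't deadlock us.
        Thread writer = new Thread(() -> {
            try {
                pipe.sink().write(ByteBuffer.wrap(payload));
                pipe.sink().close(); // close so the reader sees EOF
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        int total = drain(pipe.source());
        writer.join();
        System.out.println("drained " + total + " bytes");
    }
}
```

Even though the writer hands over 9000 bytes in one call, the reader still sees them in several chunks; the loop is what guarantees none are left behind.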
Ok I think I have some answers. I'm not sure it's a bug, but it's an explanation of what we're seeing here.
Because the UnixClient seemed to also hang in a read, I suspected that the server was only seeing a partial view of the content. I modified the ServerActor to not just read 1024 bytes, but to read as many bytes as it can before getting a "0" return value.
The result is that the server successfully reads and writes all 9000 bytes from my modified client.
Here's the patch:
diff --git a/src/test/java/jnr/unixsocket/example/UnixServer.java b/src/test/java/jnr/unixsocket/example/UnixServer.java
index a70a924..787f4f6 100644
--- a/src/test/java/jnr/unixsocket/example/UnixServer.java
+++ b/src/test/java/jnr/unixsocket/example/UnixServer.java
@@ -104,16 +104,20 @@ public class UnixServer {
public final boolean rxready() {
try {
ByteBuffer buf = ByteBuffer.allocate(1024);
- int n = channel.read(buf);
- UnixSocketAddress remote = channel.getRemoteSocketAddress();
- System.out.printf("Read in %d bytes from %s%n", n, remote);
+ int n;
- if (n > 0) {
- buf.flip();
- channel.write(buf);
- return true;
- } else if (n < 0) {
- return false;
+ while ((n = channel.read(buf)) > 0) {
+ UnixSocketAddress remote = channel.getRemoteSocketAddress();
+ System.out.printf("Read in %d bytes from %s%n", n, remote);
+
+ if (n > 0) {
+ buf.flip();
+ channel.write(buf);
+ buf.clear();
+// return true;
+ } else if (n < 0) {
+ return false;
+ }
}
} catch (IOException ex) {
This change also fixes the socat example; the file I pipe to it now completely transits the server. And just for completeness, I confirmed that your nc example also completes successfully.
I think what we're seeing here is a bad interaction between IO buffers (at either the JVM or kernel level) and the poll call used for IO select here. On the server side, it seems the poll for read is not seeing data left "on the wire" after a subsequent read event has fired. As a result, we eventually end up with some number of bytes "in limbo" and no poll events left to trigger the server to read those bytes. I don't think this constitutes a bug in jnr-unixsocket, since select, read, and write all just bottom out in the system's poll, read, and write native calls.
It's possible that we're not configuring the buffering for the unix domain socket file descriptor properly, but we would need to research that. We're not doing anything unusual when setting it up, so I would expect the basic unix socket to work properly with poll.
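For what it's worth, the kernel buffer configuration is easy to inspect through standard socket options. This sketch uses a plain NIO TCP channel (stock java.nio of this era has no unix-socket support, and jnr-unixsocket's option API may differ), just to show what the platform defaults look like:

```java
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

public class BufferSizes {
    public static void main(String[] args) throws Exception {
        // An unconnected channel is enough to query the kernel's default
        // send/receive buffer sizes for a freshly created socket.
        try (SocketChannel ch = SocketChannel.open()) {
            int rcv = ch.getOption(StandardSocketOptions.SO_RCVBUF);
            int snd = ch.getOption(StandardSocketOptions.SO_SNDBUF);
            System.out.println("SO_RCVBUF=" + rcv + " SO_SNDBUF=" + snd);
        }
    }
}
```

The actual numbers vary by platform, which is consistent with the behavior differing between Darwin and Linux.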
I will commit this change to UnixServer for you to test. I am not entirely satisfied with this as a "solution" so perhaps you can help me figure out why we're seeing this buffering behavior?
With the UnixServer working properly now on Darwin, I'm going to close this issue.
From discussions and articles online, it appears this may be just one of the "quirks" of using poll across platforms. It does not appear that additional POLL_IN events get triggered for unread data that happens to be lying around in a kernel buffer, so code that responds to a READ select should attempt to read as much data as is available before doing another select.
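The "read as much as is available before doing another select" rule above can be sketched with the stock NIO Selector. This is an illustrative stand-in, not the jnr-unixsocket code path: it uses a loopback TCP socket so it runs anywhere, and the class name is my own. The key part is the inner drain loop that runs until read() returns 0:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class DrainOnSelect {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // Client: write 9000 bytes (like the modified UnixClient), then close.
        Thread client = new Thread(() -> {
            try (SocketChannel c =
                     SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                c.write(ByteBuffer.wrap(new byte[9000]));
                c.shutdownOutput();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        client.start();

        SocketChannel conn = server.accept();
        conn.configureBlocking(false);
        Selector selector = Selector.open();
        conn.register(selector, SelectionKey.OP_READ);

        ByteBuffer buf = ByteBuffer.allocate(1024);
        int total = 0;
        boolean eof = false;
        while (!eof) {
            selector.select();
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isReadable()) {
                    int n;
                    // Drain until read() returns 0 (nothing left right now)
                    // or -1 (peer closed). A single read per readiness event
                    // can strand bytes in the kernel buffer if no further
                    // POLL_IN fires for them on some platforms.
                    while ((n = conn.read(buf)) > 0) {
                        total += n;
                        buf.clear();
                    }
                    if (n < 0) {
                        eof = true;
                    }
                }
            }
            selector.selectedKeys().clear();
        }
        client.join();
        conn.close();
        server.close();
        System.out.println("server read " + total + " bytes");
    }
}
```

If the inner while were replaced by a single read per select, the program would still usually work on Linux, but the drain loop is the portable form.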
Releasing today in 0.29.
I'm trying to make a server using the UnixServer example as a starting point. On my Mac, when I send more than two buffers worth to it, it just hangs and no longer reads more data. Here are two example commands that should work (and do work on Linux).
➜ ~ cat .emacs | socat UNIX-CONNECT:/tmp/fubar.sock -
➜ ~ cat .emacs | nc -U /tmp/fubar.sock
All I'm doing is running the UnixServer class straight out of the box. I'm using jnr-unixsocket version 0.28.
The funny thing is that if I make the ByteBuffer smaller than 1024 (like 512), it hangs after just 1 buffer read. All buffer sizes work fine on Linux.