proposal: net/v2: make Pipe asynchronous

dsnet commented 6 years ago

Currently, net.Pipe is documented as such:

Pipe creates a synchronous, in-memory, full duplex network connection; both ends implement the Conn interface. Reads on one end are matched with writes on the other, copying data directly between the two; there is no internal buffering.

This behavior occurs because net.Pipe was just a thin wrapper over io.Pipe, which is implemented as being synchronous. However, most network connections are not synchronous as they involve buffering at multiple layers (in the OS, on the wire, etc). Thus, we should switch net.Pipe to act in an asynchronous fashion to better emulate true network connections.

hnsl commented 6 years ago

I will directly add to this bug that the current implementation is so surprising that it violates the io.Reader documentation: "Implementations of Read are discouraged from returning a zero byte count with a nil error" and I don't see a way of fixing that without stop matching writes and reads 1:1. For n > 0 zero writes, either you need to not send zero writes (n:0) or you need to read writes until you get a non-zero write (n+1:1).

slrz commented 6 years ago

Could you provide some additional rationale for this proposal? Why do you want net.Pipe to better emulate "real" network connections?

It might actually be beneficial to have net.Conn implementations that differ from the most commonly used ones while still satisfying the net.Conn contract, to avoid "all the world is a TCP socket" assumptions from creeping in. Not sure whether that has to be net.Pipe, though.

dsnet commented 6 years ago

Why do you want net.Pipe to better emulate "real" network connections?

The most common use-case of net.Pipe is for testing. A test that is non-representative of the real-world is a poor test. It's not even an issue of "all the world is a TCP socket". Even Unix domain sockets present some form of buffering.

Suppose we were adding net.Pipe for the first time, I would still be asking for justification why this should be synchronous.

hnsl commented 6 years ago

Note: This issue IMO is not that it's buffered or "synchronous". I think it makes sense to have a non-buffered synchronous net.Pipe for simplicity, but have no strong opinion about it. If you want buffering you can always glue some buffering onto the write interface.

The issue is that the current implementation is synchronous in a way that is very unhelpful and confusing. What does a pipe synchronize? It synchronizes data - bytes. Somehow the implementation are currently synchronizing something completely different: read and write operations. Common sense dictates that a pipe should block a read if there is no data to read and block a write if the write is not done yet. Essentially: we block when there is something remaining to do, otherwise we return to the caller.

But surely synchronizing read and writes is the same as synchronizing their data? Incorrect, when you write a zero length buffer (which is completely valid and should be a no-op) you now have to synchronize that operation with a read operation and return a zero length string - and block the write if there's no read operation to synchronize with. This breaks this critical assumption about how a reader and writer should behave.

For the same reason ("h" + "" + "ello" + "wo" + "" + "rld" + "") is the same as "hello world", write("h"); write(""); write("ello"); write("wo"); write(""); write("rld"); write("") should have the same effect as write("hello world") (on the writer side). Anything else is just really surprising. For me this caused a deadlock that took significant time to debug.

The simplest way to fix this in the current net.Pipe is to simply ignore zero length writes instead of blocking, a one liner change. The argument against that is that the documentation somehow forbids this and this is therefore a 2.0 feature. I disagree with the following rationale:

io.TeeReader has equivalent documentation and does discards zero length buffers without matching those reads to "corresponding writes".
There is no "1:1 mapping of reads to writes". One write operation maps to some N number of reads already. Why can't N be zero? The documentation doesn't forbid it as far as I can tell.

slrz commented 6 years ago

@hnsl To be clear, the net.Pipe behaviour does not violate the io.Reader interface contract. Returning (0, nil) might be discouraged, but is still perfectly legal.

hnsl commented 6 years ago

Agree, and I make no claim on the contrary.

takeyourhatoff commented 6 years ago

A while ago I was implementing a symmetric network protocol, and used net.Pipe to test it. The lack of buffering caused a deadlock. If net.Pipe did buffering, I probably would not have noticed that my implementation depended on the underlying connection buffering at least n bytes.

bcmills commented 6 years ago

This issue IMO is not that it's buffered or "synchronous". … The issue is that the current implementation is synchronous in a way that is very unhelpful and confusing.

Agreed 100%.

The simplest way to fix this in the current net.Pipe is to simply ignore zero length writes instead of blocking, a one liner change. The argument against that is that the documentation somehow forbids this and this is therefore a 2.0 feature.

Part of the problem is that the semantics of net.Pipe should pretty clearly match the semantics of io.Pipe, which does document this behavior clearly: “Reads and Writes on the pipe are matched one to one except when multiple Reads are needed to consume a single Write. That is, each Write to the PipeWriter blocks until it has satisfied one or more Reads from the PipeReader that fully consume the written data.”

I would argue that we should change both io.Pipe and net.Pipe to drop 0-length writes in Go 2, and keep them both synchronous. That would remove the discouraged (and confusing) 0-length Read behavior, but keep net consistent with io.

bcmills commented 6 years ago

most network connections are not synchronous as they involve buffering at multiple layers

Not every network connection is asynchronous. For example, I believe that implementations of UNIX domain sockets typically do not induce any additional buffering: writes can be copied immediately into the read buffer.

Moreover, “true” network connections often involve other effects (such as MTUs) that can affect read chunking. Fundamentally, if you want to make sure your program works with a particular kind of socket, you need to test it with that particular kind of socket: a fake that is more realistic in one dimension may only hide unrealism in another.

dfawley commented 6 years ago

FWIW, I wrote bufconn for grpc's tests and benchmarks, which you can use in accordance with our license: https://godoc.org/google.golang.org/grpc/test/bufconn

And also a latency package for realistically simulating bandwidth, latency, and MTU effects: https://godoc.org/google.golang.org/grpc/benchmark/latency

cbeuw commented 4 years ago

FYI I had to use use something like this to write integration tests for a TCP multiplexer and a proxy tool, so I rolled one up: https://github.com/cbeuw/connutil

It's also got some other utilities, like Dialer and net.Listener that Dial/Accept ends of pipes

golang / go

proposal: net/v2: make Pipe asynchronous #24205