Cannot use "file descriptor" abstraction for sockets under Windows

chacha21 commented 2 months ago

Google protobuf offers the "file descriptor" abstraction to read/write through a FileInputStream.

Unfortunately, while under Unix those file descriptor are perfectly compatible with TCP sockets, it is not the case under Windows.

Winsock is based on a SOCKET type which is not a file descriptor (https://learn.microsoft.com/en-us/windows/win32/winsock/socket-data-type-2)

Trying to force cast the windows SOCKET to an int leads to a crash because the ::_read and ::_write functions won't work.

Could it be possible to add a "native socket" abstraction for Google protobufs that would be able to use dedicated winsock send/recv instead of _read/_write ?

I understand that I can do that myself by creating such a derived class, on my "server" side. But it would be nice to get an "official" support, so that under Windows, a "client user" would not need any custom code, only the proto file, to create his own client software.

esrauchg commented 2 months ago

I suspect it would be more straightforward to use IstreamInputStream instead of FileInputStream, then you'd just need to get a std::istream for the socket:

https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/io/zero_copy_stream_impl.cc#L228

Is it difficult for you to construct an std::istream rather than creating a derived class of ZeroCopyInputStream?

chacha21 commented 2 months ago

The problem is not for me, it is for the client software : without custom code it won't be able to read from the socket as well, and I want to minimize the friction.

esrauchg commented 2 months ago

Hm... perhaps you've gone deep on the details here but I'm unsure if ZeroCopyInputStream is itself suited to the design that seems to suggest (= multiple readers coming off the same stream).

For the normal usecase of reading a bunch of protobuf messages who have been concatenated, it will potentially read past the end of one of the messages and then call BackUp() to realign it (any backup guaranteed to be <= the last Read() size which allows for 'zerocopyinputstream' to really be a 'small buffer size copy stream' here, holding some number of bytes for the next Read() without the filesystem being aware of this read-past-then-back-up).

This permits you to parse a bunch of messages by holding the same FileInputStream and parsing, but I believe the fact that it may Read() past the actually-consumed content means you could not naively share the file_descriptor with both a protobuf FileInputStream and additionally with another reader and expect it to work (unless e.g. our impl actually does some rewind on the fd itself that I wasn't aware of).

chacha21 commented 2 months ago

I am not sure to understand, why are you talking about multiple readers ? I if set up a TCP server, each connected client would have its own socket. The problem is that under windows, it is not a file descriptor.

esrauchg commented 2 months ago

Apologies, I interpreted the "as well" part to mean you were thinking of one binary that had multiple readers on the same socket when reading:

The problem is not for me, it is for the client software : without custom code it won't be able to read from the socket as well, and I want to minimize the friction.

Stepping back, I think I wouldn't expect very many people to be using the FileInputStream class with a socket on linux regardless; though it obviously does work under the 'everything is a file' model.

I think even on linux it would be more typical to expect more complex networking layer on top of Protobuf (basically, gRPC or a replacement for it) rather not simply piping a fd directly into protobuf fileinputstream. So the benefit for x-platform SocketInputStream is less obvious from that starting point.

If you did want to listen 'directly' onto a socket I think the design I would expect is for the client code to use something else like a IstreamInputStream; it sounds like like you expect to manage a server, and for other developers to write client code to listen on a socket to receive protobuf messages you are sending, then you must publish either a small C++ library for them to use or sample code / boilerplate that you expect them to follow.

It would probably help clarify/advocate for the benefit if you could spell out a bit more:

what does that "small library or sample code" look like for you today (apparently with FileInputStream, but not compatible with Windows?)
what does it look like to adjust that library-or-sample so it constructs a std::istream for the socket and then a IstreamInputStream (which could be naturally compatible on any platform)
what would be better under a dream design of "proto::SocketInputStream"

chacha21 commented 2 months ago

Ok, let me explain in more details.

I maintain a software (let's say" the server") that computes data (e.g. 2D scalar fields) on a live image stream
I client might be interested by live-accessing those scalar fields for his own purpose
those fields are not compatible with any "standard" image format, it is not possible to encode them in any standard video stream that could be broadcasted with rtsp
my solution is to emit those fields from the server through a named pipe. The protobufs on top of that are excellent, because I only need to publish the proto file, and anyone can build a custom code, connect to the pipe, and get the data.
Thanks to the protobuf::FileOutputStream using a file descriptor, both the server and the client need 0 glue code to serialize/deserialize the data. It works directly.
Now, clients might want to access the data through the network rather than the named pipe on the host machine
Under Unix, a TCP socket is a file descriptor, so both the server and the client would need 0 additional glue code to exchange data exactly like on the named pipe. The only difference is how to get the file descriptor, everything else is the same usage of protobuf::FileOutputStream

But under Windows, a socket is not a file descriptor, so the client has to insert additional code to read from the socket and submit to the protobuf API. Even if I provide a minimal sample code, it is far less attractive because extra work is needed, and must be maintained.

And this is just because protobuf use read() and write() and has no "socket" specialization using "recv()" and "send()". It makes sense under Unix, it is a pity under Windows. And it's relatively easy to implement as a mere specialization for the core read/write of a protobuf::FileOutputStream, everything else is exactly the same logic.

As you see, the problem is not to write the code needed for the sockets to work with protobufs, there are plenty of solutions for that. My purpose is to be attractive with a solution requiring 0 additional code for the client as long as he gets the proto file.

esrauchg commented 2 months ago

I'm not hard ruling out that we would accept a contribution of a ZeroCopyInputStream subclass that uses recv() and send(), but I think I'd be a little wary that the usecase where we'd recommend that this be used is extremely narrow.

Although it inherently works due to the Linux system design of treating everything as a file, basically I think considering only Linux the scenario where we'd say a protobuf::FileOutputStream on a fd backed by a socket on Linux is a good design is only in very narrow situations, and not something we'd expect many people to do. That nuance somewhat makes the pitch for "this is already great for Linux, but needs a different thing so we can match it on Windows" a less clear argument.

Basically once you have network communication involved between clients and server which are under different people's control, you're at the point where we'd strongly recommend you bring in gRPC (or any other 'real' network service library, like Twirp, Buf Connect, Thrift, etc) which will actually have services declared and handle all of the the network communication nuances that go along with it (proxies, timeouts, streaming vs non-streaming apis, proper async support instead of just single thread blocking); generally when someone wants to do something as bare-metal as raw reading bytes from a socket with no library around, they tend to have other special needs or constraints and I'd expect them to be unafraid of the 100 lines of boilerplate.

Unless I'm missing something, even if we had the class you are imagining the client code would still involve 100+ lines of setup / error handling / looping on the new stream parsing (the winsock example code for connect() is already 64 lines long without doing any read/write). It seems that you can't really have zero if we had it, you'd probably want to publish that as a .h or sample code anyway and so its not a zero/nonzero situation but more like "the sample code boilerplate that we will need to give our potential clients is 130 lines instead of 100 lines" situation?

chacha21 commented 2 months ago

You might be right about "the sample code boilerplate that we will need to give our potential clients is 130 lines instead of 100 lines". I can understand that somebody will have to write some network code anyway, at least to handle timeouts.

Alternatively, there's also the idea of some kind of protobuf::CustomInputStream that would accept a set of callback functions dedicated to read/write. Rather than creating a subclass for the data provider (it's not funny to subclass std::istream), rather than subclassing the data parser (it's not funny to subclass protobuf::FileInputStream), a developer could focus on the very few core operations needed by the [de]serializer, minimizing the maintainance burden when the "superclass" evolves in some way.

This is something I found very handy and powerful in sqlite (https://sqlite.org/c3ref/vfs.html)

protocolbuffers / protobuf

Cannot use "file descriptor" abstraction for sockets under Windows #17219