capnproto / capnproto-java

Cap'n Proto in pure Java
Other
391 stars 86 forks source link

Serialize#read is too eager to throw on EOF #123

Closed jrsala-auguration closed 1 year ago

jrsala-auguration commented 2 years ago

I would like to generically read messages from some unspecified ReadableByteChannel, because that's what Serialize#read and SerializePacked#read accept so I write

static void processMessages(ReadableByteChannel channel) {
  // Loop until EOF
  while (/* ??? */) {
    processSingleMessage(Serialize.read(channel));
  }
}

My problem is that I think due to the way Serialize#read is implemented, there is no way to write the loop condition above to detect the end of the stream, meaning an exception is unavoidable. If Serialize#read is called one too many times, it will throw from Serialize.java:58 (throw new IOException("premature EOF");). I can't write while (channel.isOpen()) because the channel is not going to close() itself magically at EOF.

From what I understand ReadableByteChannel tells you EOF is reached once read returns -1. I can't call read myself because that would consume a byte that I can't put back. Here Serialize#read is driving the ReadableByteChannel so it should be its responsibility to handle EOF correctly. I can understand that it throws if EOF is encountered in the middle of a message but unfortunately it also throws on EOF at message boundaries when it expects the start of a message.

There is a stopgap solution which is to create a ReadableByteChannel that wraps the original one and buffers a byte between reads in order to detect EOF, but I'd rather not do that.

I think Serialize#read and SerializePacked#read should not throw when their first call to ReadableByteChannel#read returns -1. Instead they should return null, or their return type should be changed to be more informative.

I'm not very experienced with Java so please do enlighten me if there's an easy solution I've missed.

dwrensha commented 2 years ago

I agree. We should add something like capnproto-rust's try_read_message(). https://github.com/capnproto/capnproto-rust/blob/267b8b8939fb63b1765741617b98d1029466c917/capnp/src/serialize.rs#L258-L268

paxel commented 2 years ago

while thinking about the Serialize class: about three years ago you introduced the Allocator to allow more flexibility in where the memory comes from, and while I could make some workarounds to trim the netty ByteBufs to the Bytebuffer sizes after getSegmentsForOutput, it was not possible to use the allocator for the framing header.

for me the Serialize is not a good solution. it is useful for the streaming in / out. but if you want to have the data of a frame without copy, you have to write your own frame header code.

for the eof exception at the beginning of the message I wrapped a MeteredChannel (that counted the bytes read so far) and reseted it after each capnp message. when an exception was caused and read bytes was 0 it was ignored and returned null.

I think a complete redesign of the Serialization of the capnp Java, and with that depreciating Serialize, is necessary and maybe we should collect issues ppl have and try to find a solution that gets rid of them all.

Ryker out :)

David Renshaw @.***> schrieb am Mo., 21. Feb. 2022, 22:04:

I agree. We should add something like capnproto-rust's try_read_message().

https://github.com/capnproto/capnproto-rust/blob/267b8b8939fb63b1765741617b98d1029466c917/capnp/src/serialize.rs#L258-L268

— Reply to this email directly, view it on GitHub https://github.com/capnproto/capnproto-java/issues/123#issuecomment-1047226620, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFLHGHY375G6AR5VZDC3X3U4KSENANCNFSM5O7MSIMA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>