capnproto / capnproto-rust

Cap'n Proto for Rust
MIT License
2.06k stars 222 forks source link

Using capnproto with non-blocking input/output streams #38

Closed danburkert closed 7 years ago

danburkert commented 9 years ago

It's prohibitively difficult to use capnproto-rust with non-blocking input/output streams (Read/Write instances that can return WouldBlock. I've been looking at the codebase the last few days to see what it would take to add support for resumable serialization/deserialization. I believe by adding a WouldBlock variant to the capnproto Error type which holds intermediate state it would be possible. So for instance:

enum WouldBlockState {
    ReadPacked(..),
    Read(..),
    WritePacked(..),
    Write(..),
}

enum Error {
    Decode { .. },
    WouldBlock(WouldBlockState),
    Io(io::Error),
}

with additional read/write methods to resume reading a message after a WouldBlock error:


pub fn resume_read_message(WouldBlockState, &mut InputStream, ReaderOptions) -> Result<..>;
pub fn resume_read_message_packed(WouldBlockState, &mut BufferedInputStream, ReaderOptions) -> Result<..>;
pub fn resume_write_message(WouldBlockState, &mut OutputStream, ReaderOptions) -> Result<..>;
pub fn resume_write_message_packed(WouldBlockState, &mut BufferedOutputStream, ReaderOptions) -> Result<..>;

I want to get your thoughts on this before diving too far into the implementation.

danburkert commented 9 years ago

I've been rethinking the proposed API a bit. It would be a simpler implementation and interface if the type which is reading / writing the message can internally save intermediate state when interrupted by a WouldBlock event. Something along these lines:

struct MessageReader<R> where R: Read { .. }

impl MessageReader<R> where R: Read {

  /// Creates a new message reader from an input.
  pub fn new(read: R) -> MessageReader<R> { .. }

  /// Read the next message from the input.
  pub fn next_message() -> Result<OwnedSpaceMessageReader> { .. }
}

The key here is that when the call to the inner Read instance fails with WouldBlock, the MessageReader type automatically saves any intermediate data read, and then return the error. On the next call to next_message, the saved state is available to continue reading the message.

Packing can be accomplished by creating wrapper types that impl Read and Write which internally buffer in the case of WouldBlock.

danburkert commented 9 years ago

That scheme probably won't work for writing, though.

dwrensha commented 9 years ago

I'm excited that you are investigating this stuff!

I guess that a user who wants to read from a non-blocking stream with this interface would need to figure out on their own when it makes sense to retry? It feels to me like MessageReader is at the wrong level of abstraction to be concerned about such things. The user probably just wants to say "here's a stream, now notify me once you've parsed a message from it." Of course, to express such an interface, we're going to need something like promises.

danburkert commented 9 years ago

I looked into how the c++ implementation works, and it does use a promise based byte stream implementation, and there are two message reader/writer implementations based on whether it's async or not. I think in any case the message reader/writer will have to be aware of async.

My hope was to have the message reader/writer be able to handle WouldBlock interruptions without needing to use higher level abstractions like futures or streams, since these abstractions largely don't exist or are very new in Rust. The closest library I know of is carllerche/eventual_io, but I don't think the abstraction provided is appropriate, since it is a Stream<Bytes> instead of a Stream<u8>, although I may be misunderstanding how to use it.

Right now I'm finishing up an experimental change that gets rid of InputStream and OutputStream and replaces it with the stdlib Read and Write traits. I'll have a PR before long so you can see what that looks like. Once that is solid I'm going to try to add support for intelligent handling of WouldBlock to that.

Hoverbear commented 9 years ago

@danburkert Having Read and Write would be awesome! I'd be interested in helping with this as well!

danburkert commented 9 years ago

@Hoverbear I implemented the Read/Write change in #41, I'd definitely appreciate any feedback you have on that.

My thoughts on this issue right now are that mio/eventual_io based async support may be relatively straightforward and performant, but I'm still looking into it.

Hoverbear commented 9 years ago

I think we might be hitting this problem now in Raft. See this commit and this test output. Note even though the refresh text is simply "test" it still fails.

danburkert commented 9 years ago

OK after a long, long time thinking about this issue, I've made another pass. The result is a MessageStream that takes ownership of a stream (implements Read, Write, or both), and allows reading or writing Cap'n Proto messages. The underlying stream can return WouldBlock at any point, and reading or writing can be resumed whenever the stream is ready to go next. It does this by keeping internal state whenever the stream returns WouldBlock in the middle of reading or writing a message.

On the read side, it uses a few tricks to reduce the number of allocations necessary. Essentially, it uses a ref-counted segmented bump allocator to allocate a big chunk of bytes up front, and then loans the bytes out to individual messages as necessary. I haven't benchmarked it, but I think it's going to be very fast.

The goal of this is to make it very easy for applications that are going down the 'state machine' route of interacting with MIO to use Cap'n Proto as the message format, as the Raft library has done. The abstraction doesn't require promises/futures/coroutines, but still manages to stay fairly high-level.

One issue I'm still hitting is that, as far as I can tell, it's not possible to have a collection that can hold capnp::message::Builder instances with arbitrary Allocator types, even if the builders are boxed. It would be helpful to have this for the queue of outbound messages in MessageStream. Right now it's hard-coded to Builder<HeapAllocator>, but it would be nice to allow any allocator type that doesn't have lifetime restrictions.

MessageStream is still pretty raw; I just finished the writing bits of it, and only the reading parts are tested. My plan is to clean it up, write some examples of using it with mio, and hopefully integrate it into Raft.

dwrensha commented 9 years ago

Interesting!

One issue I'm still hitting is that, as far as I can tell, it's not possible to have a collection that can hold capnp::message::Builder instances with arbitrary Allocator types, even if the builders are boxed.

Would impl Allocator for Box<Allocator> help?

danburkert commented 9 years ago

Would impl Allocator for Box help?

I think it would fix my immediate issue, but I think a better solution would be to revisit how builders are handled. I've been reading through the Builder, Arena, Allocator, etc. code, and I think it's due for a pretty big cleanup. The internals of message reading and building are extremely, and needlessly, unsafe. The ownership chains are helplessly confusing. Looking at the message builder side of things, there is a Builder who owns a boxed Allocator and BuilderArena, with the BuilderArena having a reference back to the same Allocator. Besides being extremely confusing, this is undefined behavior. Another big red flag is that the reading code is all based on pointers and memory locations instead of relative indexes into slices, again this is needlessly unsafe. There are dozens or hundreds of calls to transmute in layout.rs, most of which could be replaced with something less powerful (most cases are pointer casts). Builder has an unsafe impl of Send, which for a while I though was totally unsafe since there are Rc and other non-threadsafe constructs internally, but I think it relies on the fact that the Builder is the only way to get at these details?

I'm pretty confident that untangling the ownership chains inside of Builder could make cap'n proto more performant, as well as safer. Right now there are a bunch of needless allocations created for every Builder. Just spitballing- what if Builder were a trait with Allocator being an associated type, and there were a concrete type per allocator type, such as HeapBuilder, ScratchBuilder, etc. This would make it easier to abstract over builders.

dwrensha commented 8 years ago

I agree that capnproto-rust has more unsafe code than it needs to. We should work on cleaning things up.

One thing we could do is to adjust the fields of private::layout::StructBuilder<'a>. Instead of a *mut SegmentBuilder field, there could maybe be a &'a mut [Word], a &'a mut Allocator, and a &'a mut BuilderArena field. I'm a bit worried about increasing the size of StructBuilder, because it's intended to be cheaply pass-by-move, and I've actually measured performance degradations when I've attempted to increase its size. However, it just occurred to me that switching the data and pointers fields from being raw pointers to being u32 indices into a slice would regain some of the space, so maybe it's not so bad. It's also my undstanding that "passing around largish structs by move" is going to be a common pattern in rust, so maybe in the future there will be better compiler optimizations that apply here.

Besides being extremely confusing, this is undefined behavior.

I'd like to understand how the current scheme could lead to undefined behavior. Can you point me to documentation about that? Note that any writes to the the allocator go through the reference held by the BuilderArena.

Builder has an unsafe impl of Send, which for a while I though was totally unsafe since there are Rc and other non-threadsafe constructs internally, but I think it relies on the fact that the Builder is the only way to get at these details?

A message::Builder<A> should be Send if A is Send. The scary non-Send stuff only happens during a mutable borrow of the message::Builder, as through init_root().

dwrensha commented 8 years ago

Note that any writes to the the allocator go through the reference held by the BuilderArena.

Oops, I was wrong about this. https://github.com/dwrensha/capnproto-rust/blob/e9c33dc3ac1d443a6622196da2d8e7da0251f3e7/src/message.rs#L24

However, I'm still interested in learning more about how this could cause undefined behavior.

danburkert commented 8 years ago

I agree that capnproto-rust has more unsafe code than it needs to. We should work on cleaning things up.

Excellent. I'm going to open a PR in a minute with a bunch of surface level changes, but it does include getting rid of most of the calls to transmute in layout.rs.

I'm a bit worried about increasing the size of StructBuilder, because it's intended to be cheaply pass-by-move, and I've actually measure performance degradations when I've attempted to increase its sense.

Could you explain a little bit about what StructBuilder does? In looking over the internals I think there are a few places where struct sizes could be reduced (for example Builder could be collapsed from (Box<BuilderArena>, Box<Allocator>) to Box<(BuilderArena, Allocator)>).

I'd like to understand how the current scheme could lead to undefined behavior.

So if I follow the ownership model correctly, each message Builder owns a boxed Allocator instance, as well as a boxed BuilderArena. The BuilderArena has a const reference to the allocator. The Rust compiler wouldn't normally allow this, but transmute is used to construct it. Rust has stricter requirements around references than c++. In particular, it is undefined behavior to hold a reference to memory which is mutable aliased. In this case, the mutable alias comes through the Builder owning it.

As I see it, all of these little issues are a falling out of the bigger problem of the unclear ownership model between the internal components of the Reader and Builder types. A lot of boxes, a lot of pointers, etc. Perhaps there is a good reason why the ownership can't be more straightforward, but I haven't been able to figure it out from reading through.

A message::Builder should be Send if A is Send. The scary non-Send stuff only happens during a mutable borrow of the message::Builder, as through init_root().

What originally threw me off is that there is an unsafe impl of Send on a type (Builder) which transitively includes an Rc. Normally this would be UB, but I think in this case it's OK because all of the refcounted references get moved into the new thread at the same time. This all circles back to the confusing ownership chains going on internally.

dwrensha commented 7 years ago

Closing because capnp-futures-rs now exists.