capnproto / capnproto-rust

Cap'n Proto for Rust
MIT License

Allowing construction of NoAllocSliceSegments with buffer and NoAllocBufferSegmentType #488

Closed. hs-vc closed this issue 7 months ago

hs-vc commented 7 months ago

It would be useful to be able to construct NoAllocSliceSegments directly from a buffer and a NoAllocBufferSegmentType, particularly when the message header has already been parsed. This capability was present until version 0.17 but is no longer available, so the segment table has to be parsed a second time when NoAllocSliceSegments is constructed.

The need arises when reading messages in the Cap'n Proto format asynchronously from sockets. In that scenario the client must already parse the header (the segment table) to determine the message's length, so parsing it again inside NoAllocSliceSegments is redundant.

Is there any alternative approach available that can overcome this issue?

dwrensha commented 7 months ago

What if we add public visibility to ReadSegmentTableResult, read_segment_table(), and NoAllocBufferSegments::from_segment_table()? (These things are currently at private or crate-only visibility.)

Then I think you would be able to get the message length from ReadSegmentTableResult before constructing the BufferSegments.

hs-vc commented 7 months ago

> What if we add public visibility to ReadSegmentTableResult, read_segment_table(), and NoAllocBufferSegments::from_segment_table()? (These things are currently at private or crate-only visibility.)
>
> Then I think you would be able to get the message length from ReadSegmentTableResult before constructing the BufferSegments.

In my case, I don't need read_segment_table(), which requires a complete slice. I read bytes from the socket in pieces and parse the header directly as they arrive, so that I can determine the total message length and know when a complete message has been received. Making NoAllocBufferSegments::from_segment_table() public would therefore be sufficient.
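For reference, here is a minimal sketch of the kind of incremental header parsing described above. It assumes the standard Cap'n Proto stream framing: a little-endian u32 holding the segment count minus one, then one little-endian u32 per segment giving its size in 8-byte words, with the table padded to an 8-byte boundary. The function name `total_message_len` and its error strings are hypothetical and are not part of the capnp crate.

```rust
/// Size of one Cap'n Proto word in bytes.
const BYTES_PER_WORD: u64 = 8;

/// Hypothetical helper (not a capnp crate API).
///
/// Returns `Ok(Some(len))` with the total framed message length in bytes
/// (segment table plus all segments) once `buf` holds the whole segment table,
/// `Ok(None)` if more bytes must be read first, and `Err` on overflow.
fn total_message_len(buf: &[u8]) -> Result<Option<usize>, &'static str> {
    // The first four bytes hold (segment count - 1).
    if buf.len() < 4 {
        return Ok(None);
    }
    let segment_count = u64::from(u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]])) + 1;

    // Table = 4-byte count + 4 bytes per segment, rounded up to a word boundary.
    let table_len = (4 + 4 * segment_count + 7) / BYTES_PER_WORD * BYTES_PER_WORD;
    let table_len = usize::try_from(table_len).map_err(|_| "segment table too large")?;
    if buf.len() < table_len {
        return Ok(None);
    }

    // Sum the per-segment sizes, which are expressed in words.
    let sizes_end = 4 + 4 * segment_count as usize;
    let mut total_words: u64 = 0;
    for chunk in buf[4..sizes_end].chunks_exact(4) {
        let words = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        total_words = total_words
            .checked_add(u64::from(words))
            .ok_or("message size overflow")?;
    }

    let total = total_words
        .checked_mul(BYTES_PER_WORD)
        .and_then(|body| body.checked_add(table_len as u64))
        .ok_or("message size overflow")?;
    usize::try_from(total)
        .map(Some)
        .map_err(|_| "message too large for this platform")
}
```

A reader loop would call this after each socket read and keep reading until the buffer holds at least the returned number of bytes. A production version would probably also cap the segment count and total size before allocating.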

BTW, is there a reason why NoAllocBufferSegmentType is defined as:

enum NoAllocBufferSegmentType {
    SingleSegment(usize, usize),
    MultipleSegments,
}

Instead of:

enum NoAllocBufferSegmentType {
    SingleSegment,
    MultipleSegments,
}

In the case of SingleSegment, segment_table_length_bytes must always be 8, and the message length can also be inferred from the buffer. In fact, in version 0.17, NoAllocSliceSegments requires only the buffer for both SingleSegment and MultipleSegments.

dwrensha commented 7 months ago

> In the case of SingleSegment, segment_table_length_bytes must always be 8, and the message length is also inferred from the buffer.

Hm... I agree that it looks like the first parameter is not needed. The second parameter however might point to before the end of the buffer if from_buffer() is called on something that's longer than the message: https://github.com/capnproto/capnproto-rust/blob/bcaac56e960c1cf3dd45618af8fca0d09d86340a/capnp/src/serialize/no_alloc_buffer_segments.rs#L160-L163
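To illustrate the point about the second parameter, here is a rough sketch of how the end of a single-segment message can fall before the end of the buffer that contains it. It assumes the standard 8-byte segment table for a single segment; the helper `single_segment_range` is hypothetical and does not mirror the crate's internals.

```rust
use std::ops::Range;

/// Hypothetical helper (not a capnp crate API).
///
/// For a single-segment message at the start of `buf`, returns the byte range
/// occupied by the segment itself. Returns `None` if the header is incomplete,
/// describes more than one segment, or the segment does not fit in `buf`.
fn single_segment_range(buf: &[u8]) -> Option<Range<usize>> {
    if buf.len() < 8 {
        return None;
    }
    // (segment count - 1) must be zero for a single-segment message.
    if u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) != 0 {
        return None;
    }
    // Segment size in 8-byte words.
    let words = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]) as usize;
    let end = words.checked_mul(8)?.checked_add(8)?;
    if end > buf.len() {
        return None;
    }
    // `end` may be smaller than `buf.len()` when the buffer has trailing bytes,
    // so it cannot always be recovered from the buffer length alone.
    Some(8..end)
}
```

With a 32-byte buffer that holds a two-word message followed by 8 unrelated bytes, the segment spans bytes 8..24, not 8..32.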

dwrensha commented 7 months ago

I removed the first parameter in 9738fed7624eddff67d2e72d75d474eae74787f6.

dwrensha commented 7 months ago

I opened a PR for adding public visibility to the relevant items: #489.

hs-vc commented 7 months ago

@dwrensha Thanks for your prompt help. I really appreciate it!

dwrensha commented 7 months ago

Closed by #489.