Hi @rukai,

Yeah, this is something I noticed too. In fact, the `decode_mut` parser function you're using in shotover (and the same one I use in fred) doesn't use `Bytes` at all with the nom interface, for a few reasons, this being one of them.
The "main" decode_mut
parser function, which is the one I'm interested in using and supporting down the road (with BytesMut
as the input) does the following:
as_bytes()
on the BytesMut
to get a slice view into the buffer.(&[u8], usize)
where the second value tracks the parsing offset into the slice.usize
offsets (start, end)
into the buffer when they detect a frame. However, when it parses primitive values that can go entirely on the stack (i64
, etc), then it just stores that in the "fake frame". BytesMut
is known, perform one split
and freeze
on the BytesMut
.Bytes
to convert to the "real" frames. At this point we're operating on one Bytes
view that holds the entire set of nested frames in the buffer, so it's a zero-copy operation to slice
it. That slice
operation is still perhaps slower than slicing an actual slice, but it avoids moving or copying the data which is my real concern.This approach was a lot more work to implement (I had to write the parser twice essentially, but I probably could have abstracted it better with the nom traits), but I think it strikes a good balance. It allows for zero-copy parsing and doesn't require using Bytes
with nom
, therefore avoiding the added overhead of the Bytes
clone like you pointed out.
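To make that concrete, here's a minimal sketch of the conversion step. The `RangeFrame`/`OwnedFrame` names and the `into_owned` helper are illustrative, not the actual redis-protocol types; the real `decode_mut` implementation differs in the details:

```rust
use bytes::{Bytes, BytesMut};

// Hypothetical "fake" frame: stores buffer offsets instead of owned data.
// Primitives that fit entirely on the stack are stored directly.
enum RangeFrame {
    BulkString(usize, usize),
    Integer(i64),
}

// Hypothetical "real" frame: holds zero-copy `Bytes` views.
enum OwnedFrame {
    BulkString(Bytes),
    Integer(i64),
}

// Converts "fake" frames into "real" frames with one split + freeze,
// then a zero-copy `slice` per frame.
fn into_owned(frames: Vec<RangeFrame>, buf: &mut BytesMut, frame_len: usize) -> Vec<OwnedFrame> {
    // One `split_to` + `freeze` covering the entire set of nested frames.
    let data: Bytes = buf.split_to(frame_len).freeze();

    frames
        .into_iter()
        .map(|frame| match frame {
            // `Bytes::slice` bumps a refcount and adjusts a pointer; the
            // underlying buffer is never moved or copied.
            RangeFrame::BulkString(start, end) => OwnedFrame::BulkString(data.slice(start..end)),
            RangeFrame::Integer(i) => OwnedFrame::Integer(i),
        })
        .collect()
}
```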
When I originally wrote 4.x I did the `NomBytes` interface first, then implemented the `decode_mut` feature interfaces separately from the ground up. I've been thinking more about this, and I might just get rid of the non-`decode_mut` interfaces, both for the performance reasons you mentioned and because I don't think anybody really needs them. From what I've seen it's pretty ubiquitous to have `BytesMut` as your input to the parsing logic, but I could be wrong there. At a minimum I'll probably swap out the underlying implementation in the `Bytes` parsing logic to use `&[u8]` under the hood (see the sketch below). Either way I'm not sure if anybody uses that (you and I both use the `decode_mut` interface, at least), so we'll see.
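A rough sketch of what that swap could look like, with a hypothetical `decode_offsets` helper standing in for the slice-based nom logic (none of these names are the actual redis-protocol API):

```rust
use bytes::Bytes;

// Illustrative frame type, not the actual redis-protocol Frame.
pub enum Frame {
    Integer(i64),
    BulkString(Bytes),
}

// Slice-based implementation: run the nom parsers over `&[u8]` and
// report the payload as (start, end) offsets plus total bytes consumed.
fn decode_offsets(input: &[u8]) -> Option<(usize, usize, usize)> {
    // ... nom parsing over `&[u8]` would go here ...
    None
}

// Public `Bytes` interface: delegate to the slice implementation, then
// do a single zero-copy `slice` to build the returned frame. No `Bytes`
// clones ever happen inside the nom combinators.
pub fn decode(buf: &Bytes) -> Option<(Frame, usize)> {
    let (start, end, consumed) = decode_offsets(buf.as_ref())?;
    Some((Frame::BulkString(buf.slice(start..end)), consumed))
}
```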
At the moment, though, I really need to focus on fred, so this is all stuff I likely won't get to for a few months.
Thanks for the in-depth response!

It sounds like such an approach would be even trickier for parsing CQL (a text language instead of a binary protocol), so I'm going to stick with allocating `String`s for now, but I'll definitely investigate your approach or something similar if parsing shows up when profiling.
This isn't really an issue for redis-protocol, but raising an issue here seemed like the easiest way to start a discussion on this topic.

I was experimenting with writing a parser for CQL (Cassandra Query Language) using nom and https://github.com/aembke/redis-protocol.rs/blob/main/src/nom_bytes.rs. The experiment is available here: https://github.com/rukai/cqlparser
I did the following:

- Wrote a parser that takes `&[u8]` as input and outputs an AST that allocates `String`s everywhere.
- Swapped the input type to `NomBytes` so the AST could hold `Bytes` slices instead of `String`s.
- Benchmarked both, and found the `Bytes`-based parser was slower.

I can kind of see why this would occur: a `Bytes` would be more expensive to clone around than an `&[u8]`, and nom does a lot of cloning going through all the combinators. Also, maybe the logic in the `Bytes` `Clone` impl is preventing optimizations that normally keep nom performant? Did you benchmark a `Bytes` and non-`Bytes` equivalent parser before settling on a `Bytes` parser? It might be worth investigating that if you haven't already.
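To illustrate what I mean by clone cost (just a toy example showing the type-level difference, not a benchmark):

```rust
use bytes::Bytes;

fn main() {
    let buf = Bytes::from(b"SELECT * FROM users;".to_vec());
    let slice: &[u8] = &buf;

    // Copying a `&[u8]` duplicates a pointer and a length: two machine
    // words, no side effects, trivial for the optimizer to see through.
    let _a: &[u8] = slice;

    // Cloning a `Bytes` backed by a shared buffer involves atomic
    // reference counting, a real memory operation the optimizer can't
    // remove, and nom's combinators clone their input frequently.
    let _b: Bytes = buf.clone();
}
```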
If you did observe performance issues when combining nom and `Bytes` and found workarounds, I would be interested in hearing them. I noticed that redis-protocol tends to manually implement some stuff that nom does for you; is that related to this problem?