wyfo opened 1 year ago
@ultrabug
I believe we already have a very similar issue: https://github.com/scylladb/scylla-rust-driver/issues/462.
I agree that the eager allocations are a problem. Both your approach and the one described in #462 address the issue by pointing to the memory of the original, unserialized frame. However, the main difference is with respect to lifetimes: you suggest using reference counting, and I suggest using explicit lifetimes. I think there is value in both - reference counting is easier to use, but explicit lifetimes get rid of the (AFAIK atomic) reference counting and make it possible to use standard library types such as `&str` and `&[u8]` instead of `string::String<Bytes>` and `Bytes`.
We could unify both approaches if we used the interface from #462. Instead of `deserialize` taking `data: &'a [u8]`, we could pass `&'a Bytes` so that it is both possible to borrow the unserialized frame as well as refer to it via shared pointer semantics.
There is one potential argument against reference counting that I can see. Let's say that you fetch a large page of results, but you only decide to keep one `string::String<Bytes>` from it. The `Bytes` needs to refer to the unserialized response frame and will keep it alive, leading to a space leak. I'm not sure how frequently this problem could happen in practice, but it definitely makes memory usage harder to reason about. This problem does not happen with explicit lifetimes, as the user of the types that borrow from the frame needs to manually keep the frame alive or convert the types to their owned variants.
WDYT @wyfo @cvybhu?
P.S: I have a work-in-progress branch for #462, I worked on it from time to time but it's quite untidy and I didn't even manage to make it compile yet: https://github.com/piodul/scylla-rust-driver/tree/462-more-efficient-deserialization
I don't know how I missed #462 ... -_-'
> We could unify both approaches if we used the interface from https://github.com/scylladb/scylla-rust-driver/issues/462. Instead of `deserialize` taking `data: &'a [u8]`, we could pass `&'a Bytes` so that it is both possible to borrow the unserialized frame as well as refer to it via shared pointer semantics.
Agree 100% with it! It's indeed way better this way. I went with `Bytes` because of my use case, spawning tasks with a `'static` lifetime, but I have a lot of smaller queries where `&'a str` would be better suited. However, I think we will need to pass both `&mut &'a [u8]` and `&'a Bytes`, the first one being used to deserialize, and the second one to use `Bytes::slice_ref` when needed.
> There is one potential argument against reference counting [...] The `Bytes` need to refer to the unserialized response frame and will keep it alive, leading to a space leak.
I agree too, but as stated above, giving the choice to the user solves this issue IMO.
> P.S: I have a work-in-progress branch for https://github.com/scylladb/scylla-rust-driver/issues/462, I worked on it from time to time but it's quite untidy and didn't even manage to make it compile yet: https://github.com/piodul/scylla-rust-driver/tree/462-more-efficient-deserialization
Actually, our works are pretty similar... too similar, in fact; my inspiration could become questionable :sweat_smile: At least, it seems to be the right direction.
Before continuing, we need to answer some questions:

* should we modify `CqlValue` to either use `&[u8]` or `Bytes`? I think we could try to make it generic: `CqlValue<Blob=Vec<u8>>` with the ability to choose between `Vec<u8>`, `Bytes` and `&[u8]` (I don't know if this is feasible, but why not try).
* should we try to keep the current API `rows_typed`/`single_row`/etc. and maybe deprecate it?
* a little parenthesis about error types: `ParseError` could be cleaned up a bit, as `ParseError::CqlTypeError` no longer seems to be used, and I think the only `io::Error` of `ParseError::IoError` should be the `UnexpectedEOF` raised by `byteorder`, but it should then be `ParseError::BadIncomingData`.
* should we close this issue as a duplicate? or keep this one because the discussion seems to have happened here? or keep both?
* what about the strings contained in metadata? should we use `string::String<Bytes>`, or hack it by storing only offsets into the frame bytes, retrieving the strings with a method? So, no more shallow clone, but a less intuitive API. As it seems to me that metadata is rarely used directly (it mainly serves deserialization, and only the `ColumnType`, not the strings), I think it would be OK to optimize it this way.

Maybe I can open a draft PR to start a more precise discussion about implementation/naming/etc.
> what about strings contained by metadata? should we use `string::String<Bytes>`, or hack it storing only offsets on the frame bytes, retrieving the strings with a method? so, no more shallow clone but a less intuitive API. As it seems to me that metadata are rarely used directly (it mainly serves deserialization, and only the `ColumnType`, not the strings), I think it will be OK to optimize it this way.
Actually, I've just realized that it's not really necessary to deserialize metadata. In fact, there could be a column type parsing iterator, the same way as there will be a row parsing iterator. Strings would just be ignored. Of course, fully deserialized metadata could still be available using a dedicated method.
> Agree 100% with it! It's indeed way better this way. I went with `Bytes` because of my use case, spawning tasks with `'static` lifetime, but I've a lot of smaller queries where `&'a str` would be more suited. However, I think we will need to pass both `&mut &'a [u8]` and `&'a Bytes`, the first one being used to deserialize, and the second one to use `Bytes::slice_ref` when needed.
Using `&'a mut Bytes` should be sufficient (forgot about the `mut` before). `Bytes` implements `Deref<Target=[u8]>`, so you can borrow directly from it; no need to use `Bytes::slice_ref`.
> Before continuing, we need to answer some questions:
>
> * should we modify `CqlValue` to either use `&[u8]` or `Bytes`? I think we could try to make it generic: `CqlValue<Blob=Vec<u8>>` with the ability to choose between `Vec<u8>`, `Bytes` and `&[u8]` (I don't know if this is feasible, but why not try).
I can see how `CqlValue` could be changed to a generic; however, I can see some challenges, for example around `BigInt` and `BigDecimal`. I'm OK with postponing solving those issues for later, as the API suggested in #462 would allow deserializing query results directly to the types requested by the users. `CqlValue` would still be a type which owns its data.
> * should we try to keep the current API `rows_typed`/`single_row`/etc. and maybe deprecate it?
Why would you want to deprecate/remove them?
> * a little parenthesis about error type: `ParseError` could be cleaned a little bit, as `ParseError::CqlTypeError` seems to no more be used, and I think the only `io::Error` of `ParseError::IoError` should be `UnexpectedEOF` raised by `byteorder`, but it should then be `ParseError::BadIncomingData`.
I agree, the error hierarchy needs some rethinking and simplification...
> * Should we close this issue as duplicate? or keep this one because discussion seems to happened here? or keep both?
Let's keep both issues open and close them later in one go. Both of them contain valuable information IMO.
> Maybe I can open a draft PR to start more precise discussion about implementation/naming/etc.
Sure, sounds like a good idea. We can continue the discussion on the PR.
> Using `&'a mut Bytes` should be sufficient (forgot about the `mut` before). `Bytes` implement `Deref<Target=[u8]>` so you can borrow directly from it, no need to use `Bytes::slice_ref`.
In fact, you can't borrow a slice and call `Bytes::advance` on the same `Bytes`. So you need to have an immutable `Bytes` buffer, to borrow immutable slices from (or shallow clone a new `Bytes`), and a mutable slice/offset to keep track of parsing advancement; a slice and an offset are roughly the same information, but a slice is a fat pointer (2 `usize`s) vs 1 `usize` for the offset, so maybe the second could be a little more efficient to pass around. I need to bench that.
> I'm OK with postponing solving those issues for later [...] The `CqlValue` would still be a type which owns its data.
I'm fine with that.
> Why would you want to deprecate/remove them?
The error type will change, as it will have to include parsing errors, but that's a minor concern. However, they also take `QueryResult` by value; this is not an issue for the untyped methods, as `CqlValue` owns its data, but it prevents the typed methods from deserializing to `&[u8]`/`&str`, defeating one of the purposes of the refactoring. Maybe we can cheat and modify the interface to pass `self` by reference instead, so it should not break most of the code.
There is also a discussion to have about `FromRow`, because, as written earlier, `FromRow` could implement the row deserializer trait, but that would imply a temporary deserialization to `Vec<Option<CqlValue>>`, with a big overhead. That's why I think the trait should be deprecated.
> Using `&'a mut Bytes` should be sufficient (forgot about the `mut` before). `Bytes` implement `Deref` so you can borrow directly from it, no need to use `Bytes::slice_ref`. In fact, you can't borrow a slice and calling `Bytes::advance` on the same `Bytes`. So you need to have an immutable `Bytes` buffer, to borrow immutable slice (or shallow clone a new `Bytes`), and a mutable slice/offset to keep parsing advancement; slice and offset are roughly the same information, but slice is a fat pointer (2 usize) vs 1 `usize` for the offset, so maybe the second could be a little bit more efficient to pass around. I need to bench that.
OK, I see the problem now... I'm not sure what would be the best way to deal with that. It sounds like we would like to have something that behaves as a slice, but allows taking ownership of it via `Bytes::slice_ref`. Maybe something like this already exists?
> Why would you want to deprecate/remove them?
> The error type will change, as it will have to include parsing error, but that's a minor concern. However, they also take `QueryResult` by value; this is not an issue for untyped methods, as `CqlValue` owns its data, but it prevents typed methods to deserialize `&[u8]`/`&str`, defeating one of the purpose of the refactoring. Maybe we can cheat and modify the interface to pass `self` by reference instead, so it should not break most of the code.
I remember that I ran into this problem while working on my work-in-progress implementation, and I had to introduce an `as_typed` method so that borrowing is possible. The `into_typed` method should still be usable with types that own their data. AFAIK, if we restrict the types that the function can return to types with a `'static` lifetime, then it should work.
> There is also a discussion about `FromRow`, because as written earlier, `FromRow` could implement the row deserializer trait, but that would imply having a temporary deserialization to `Vec<Option<CqlValue>>`, with a big overhead. That's why I think the trait should be deprecated.
I agree about `FromRow`, it will probably become obsolete.
Note: this should be fixed by https://github.com/scylladb/scylla-rust-driver/pull/665 when it is merged. It follows some ideas that were discussed here.
#665 was closed, not merged. Unsure where we stand now with this.
I'm currently working on a deserialization refactor, based on @piodul's closed PR.
Currently, `QueryResult` deserialization deserializes all rows eagerly, and does a lot of allocations:

* a vector is allocated for all the rows;
* a vector is allocated for the columns of each row;
* each string/blob column value is allocated separately.

The last point, which can be quite important in workflows with a lot of strings/heavy blobs, is pretty easy to address. Indeed, the crate already uses `bytes::Bytes`, so the query result's bytes could be kept along the deserialization, and blob columns be deserialized as `Bytes`, while string columns could use `string::String<Bytes>`. I ignore the reason why raw `&[u8]` are used; maybe it's because of the API of `byteorder`, but the `bytes` crate also provides an endian-aware API, so `byteorder` could be dropped in favor of `bytes`.

Allocating a vector for all rows is also a relative overhead for queries returning only one row. Instead of deserialized rows, raw `Bytes` could be stored into `QueryResult`. It could then have a method returning an iterator of rows deserialized at each iteration. It would still be possible to obtain a vector of rows just by collecting the iterator.

Column deserialization could also avoid using a vector, using a trait system similar to `FromRow` to deserialize rows into tuples. By the way, compatibility of the tuple could be checked only once before iterating the rows (row deserialization would still return a `Result` because it must still check there are enough bytes). The old API could still be accessible by making `Vec<Option<CqlValue>>` implement the row deserialization trait; actually, there could also be an `Iterator<Item=Option<CqlValue>>` implementing the trait.

To illustrate these points, I've implemented a quick and dirty POC in a branch in order to run some quick benchmarks; they show a (very) significant improvement in terms of memory consumption and performance. I can open a draft PR to make the POC simpler to visualize.
Of course, some of these changes would be breaking:

* `Response` has to be modified to use `Bytes` instead of `&[u8]`;
* `QueryResult.rows` is public, but it would have to be replaced, for example by `bytes: Bytes, row_count: usize`;
* `CqlValue::Blob`/`CqlValue::Text`/`CqlValue::Ascii` should use `Bytes`/`String<Bytes>`; actually, this change should not be required, as raw `CqlValue` may not be used so much anymore.

On the other hand, the typed API `rows_typed`/`single_row_typed`/etc. (and even `rows()`) could stay relatively untouched, as `FromRow` could implement the deserialization trait too, and the breaking changes above seem quite minor to me.

P.S. There are also some strings in `ColumnSpec` and `ColumnType` which could also be modified to use `string::String<Bytes>`; it would save the few remaining allocations, while being a minor breaking change.