khonsulabs / bonsaidb

A developer-friendly document database that grows with you, written in Rust
https://bonsaidb.io/
Apache License 2.0
1.01k stars 37 forks source link

Change Key::from_ord_bytes to accept Cow #254

Closed ecton closed 1 year ago

ecton commented 2 years ago

From a conversation with @asonix on Discord, a suggestion was made to pass Cow<'k, [u8]> instead of &'k [u8]. After some thought, I realized this was a wonderful idea even though it initially sounds like it complicates things.

One of the ugly parts of the to-be-released CompositeKeyEncoder is that it doesn't support values that contain null bytes. The reason is that borrowed data couldn't be supported properly in those situations due needing to escape the data, which necessarily requires extra allocations that cannot be borrowed from. By passing the Cow instead of always giving a slice, it gives the implementor of Key the option of implementing either version or returning an error if it can't fulfill the deserialization given the lifetime requirements.

ecton commented 1 year ago

Unfortunately this isn't possible to support while still supporting composite key decoding. The reason is quite complicated, but it boils down to this:

With from_ord_bytes(bytes: Cow<'a, [u8>), any implementor that attempts to borrow from bytes must borrow from a local variable, since the only way to access bytes as a &[u8] is to either use the lifetime of bytes or needs to only match Cow::Borrowed.

This causes all sorts of issues when trying to apply these changes to composite key decoding, since it requires using only part of the input bytes for each field. Additionally, it has performance implications with Option/Result encoding due to those types encoding a single extra byte.

If anyone wants to try their hands at fixing the issues on this branch: https://github.com/khonsulabs/bonsaidb/compare/a9c6af0dc161daaa41ced39b5822f3d63f219039...ecton:cow_key?expand=1, it should be fairly straightfoward to reproduce the errors. There needs to be an update to arc-bytes for one type -- if that hasn't happened by the time someone plays with this, replacing that From implementation with a todo!() should allow the lifetime errors to be the only remaining issues.

ecton commented 1 year ago

After thinking about the approach @asonix proposed in #277, I'm reopening this for another attempt.