Closed: pkaminski closed this 2 months ago
Ping: just wondering if this can be fixed, or if I should change my database configuration? I'm trying to create a database with a compact representation of keys that are conceptually variable-length arrays of 32-bit uints, and with only primitive values (JS string, number, boolean, null). (The values don't need to be ordered, but ordered-binary looked like the most efficient encoding for primitives.)
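For readers following along, here is a minimal sketch of one way such keys could be packed into bytes for use with keyEncoding: 'binary'. The helper names encodeUint32Key and decodeUint32Key are hypothetical and not part of lmdb-js; the sketch assumes big-endian layout so that byte-wise comparison of keys matches element-wise numeric comparison.

```javascript
// Hypothetical helpers (not part of lmdb-js): pack a variable-length array of
// 32-bit unsigned integers into a compact byte key, 4 bytes per element.
function encodeUint32Key(parts) {
  const bytes = new Uint8Array(parts.length * 4);
  const view = new DataView(bytes.buffer);
  // Big-endian (littleEndian = false) so byte order follows numeric order.
  parts.forEach((n, i) => view.setUint32(i * 4, n, false));
  return bytes;
}

// Inverse of encodeUint32Key: read the bytes back as an array of uint32 values.
function decodeUint32Key(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const parts = [];
  for (let offset = 0; offset < bytes.byteLength; offset += 4) {
    parts.push(view.getUint32(offset, false));
  }
  return parts;
}
```

A key of n elements takes exactly 4n bytes with this layout, with no per-element tags or delimiters.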
What is the reason for using ordered-binary instead of the default (msgpack) encoding if they aren't intended to be ordered? I think msgpack should generally provide efficient/fast encoding.
I didn't profile, but my assumption was that ordered-binary would be a more efficient encoder if the values were all primitives. Sounds like I'm wrong, though?
By "efficient" do you mean size or speed? I would say msgpack would usually be faster (ordering imposes extra constraints), but that really can vary. By themselves, strings may be one byte more compact with ordered-binary, but there is extra decoding cost in finding the delimiters. Numbers can be more compact in either representation depending on the value, but decoding is a little more complicated with ordered-binary. I would think multi-valued arrays (or primitives) are generally going to be faster and more compact with msgpack (because it uses a length encoding rather than delimiter-based encoding).
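To illustrate the difference between length encoding and delimiter-based encoding, here is a simplified sketch. It is not the actual msgpack or ordered-binary wire format; the function names, the one-byte length prefix, and the 0x00 delimiter are all assumptions made for the example. The point is only that a length-prefixed decoder can jump straight to the next item, while a delimiter-based decoder has to scan every byte.

```javascript
const textEncoder = new TextEncoder();

// Length-prefixed sketch: one length byte, then the UTF-8 bytes of the string.
function encodeLengthPrefixed(strings) {
  const out = [];
  for (const s of strings) {
    const bytes = textEncoder.encode(s);
    if (bytes.length > 255) throw new RangeError('sketch only supports strings up to 255 bytes');
    out.push(bytes.length, ...bytes);
  }
  return Uint8Array.from(out);
}

// The decoder reads the length and skips ahead without inspecting the payload.
function decodeLengthPrefixed(bytes) {
  const decoder = new TextDecoder();
  const result = [];
  let pos = 0;
  while (pos < bytes.length) {
    const len = bytes[pos++];
    result.push(decoder.decode(bytes.subarray(pos, pos + len)));
    pos += len;
  }
  return result;
}

// Delimiter-based sketch: each string is followed by a 0x00 terminator.
function encodeDelimited(strings) {
  const out = [];
  for (const s of strings) {
    const bytes = textEncoder.encode(s);
    if (bytes.includes(0)) throw new RangeError('sketch does not escape embedded 0x00 bytes');
    out.push(...bytes, 0);
  }
  return Uint8Array.from(out);
}

// The decoder must examine every byte to find where each string ends.
function decodeDelimited(bytes) {
  const decoder = new TextDecoder();
  const result = [];
  let start = 0;
  for (let pos = 0; pos < bytes.length; pos++) {
    if (bytes[pos] === 0) {
      result.push(decoder.decode(bytes.subarray(start, pos)));
      start = pos + 1;
    }
  }
  return result;
}
```

In this toy version both forms cost one extra byte per string, so the sizes are equal; the real formats differ in their tags and escaping rules, which is where the size differences mentioned above come from.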
I think it would be possible to extend the size limits of ordered-binary in lmdb, although there are actually some intentional optimizations that are done based on the assumption of limited size, since the intended purpose of ordered-binary as an encoding is to support references to keys, which also have the same size limits.
I expected both size and speed benefits, but it sounds like I'll get neither, so I'll switch back to msgpack. Thanks!
The size limitation on ordered-binary makes sense, but it might be good to mention it in the docs for value encoding schemes.
I switched to the msgpack encoding. The only slight hitch was that I also needed to be able to serialize a special placeholder symbol, which msgpack doesn't support, but a trivial addExtension did the trick there.
When I set up the database with keyEncoding: 'binary' and encoding: 'ordered-binary', trying to write a large value against a small key fails with a "key was too large" error.
Repro:
Output: