lilnasy / es-codec

An efficient, compact, extensible, zero-dependency binary serialization library for browser-server communication.
The Unlicense

Any plans for Integers #40

Closed nhrones closed 1 year ago

nhrones commented 1 year ago

My FDB-Tuple-codec handles Safe-Integers.
Any plan to handle integers in es-codec? Big space savings!

const testArray = ["users", 3]

FDB-Tuple encoding - number is SafeInteger

<Buffer 02 75 73 65 72 73 00 15 03> 

es-codec encoding

ArrayBuffer {
  [Uint8Contents]: <0c 02 09 05 75 73 65 72 73 06 40 08 00 00 00 00 00 00>,
  byteLength: 18
}
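
For reference, a minimal sketch of how the 9-byte FDB-Tuple result above comes about. It assumes the type codes from the FoundationDB tuple layout (`0x02` for strings, `0x14` for integer zero) and only covers strings and non-negative safe integers. `encodeTuple` and `hex` are hypothetical names for illustration; this is not the FDB-Tuple-codec source.

```typescript
// Type codes as listed in the FoundationDB tuple layout.
const STRING = 0x02
const INT_ZERO = 0x14

// Encode a flat tuple of strings and non-negative safe integers.
function encodeTuple(items: Array<string | number>): Uint8Array {
  const out: number[] = []
  const utf8 = new TextEncoder()
  for (const item of items) {
    if (typeof item === "string") {
      out.push(STRING)
      const bytes = utf8.encode(item)
      for (let i = 0; i < bytes.length; i++) {
        out.push(bytes[i])
        // A literal 0x00 inside the string is escaped as 0x00 0xFF.
        if (bytes[i] === 0x00) out.push(0xff)
      }
      out.push(0x00) // terminator
    } else if (Number.isSafeInteger(item) && item >= 0 && !Object.is(item, -0)) {
      // Collect the minimal big-endian byte representation.
      const bytes: number[] = []
      for (let rest = item; rest > 0; rest = Math.floor(rest / 256)) {
        bytes.unshift(rest % 256)
      }
      // The type code itself carries the byte length.
      out.push(INT_ZERO + bytes.length)
      for (const b of bytes) out.push(b)
    } else {
      throw new RangeError("sketch handles only strings and non-negative safe integers")
    }
  }
  return Uint8Array.from(out)
}

// Format bytes as space-separated hex pairs.
const hex = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => ("0" + b.toString(16)).slice(-2)).join(" ")

console.log(hex(encodeTuple(["users", 3]))) // 02 75 73 65 72 73 00 15 03
```

The integer costs two bytes here (type code plus one payload byte) instead of the nine bytes a type code plus a 64-bit double takes.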
lilnasy commented 1 year ago

I do actually: #17.

I'm waiting on @MierenManz's post about fast varint decoding. The current implementation only handles 32-bit integers.
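
As an illustration of getting past 32 bits, here is a sketch of an unsigned LEB128-style varint that uses arithmetic instead of shift operators, since JavaScript's shifts operate on 32-bit integers. The varint format es-codec actually uses is not shown in this thread, so the LEB128 layout and the names `encodeVarint` and `decodeVarint` are assumptions. It is also not the "fast" decoder mentioned above, only a plain one.

```typescript
// Encode a non-negative safe integer, 7 bits per byte, least significant group first.
function encodeVarint(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("expected a non-negative safe integer")
  }
  const out: number[] = []
  while (value >= 0x80) {
    out.push((value % 0x80) | 0x80) // low 7 bits plus continuation bit
    value = Math.floor(value / 0x80) // division instead of >>> keeps bits above 32
  }
  out.push(value)
  return Uint8Array.from(out)
}

// Decode starting at `offset`; returns the value and the index of the next unread byte.
function decodeVarint(bytes: Uint8Array, offset = 0): { value: number; next: number } {
  let value = 0
  let scale = 1
  for (let i = offset; i < bytes.length; i++) {
    const byte = bytes[i]
    value += (byte & 0x7f) * scale // multiplication instead of << for the same reason
    if (byte < 0x80) {
      if (!Number.isSafeInteger(value)) throw new RangeError("varint exceeds safe integer range")
      return { value, next: i + 1 }
    }
    scale *= 0x80
  }
  throw new RangeError("truncated varint")
}
```

With this layout, `encodeVarint(3)` is a single byte, and values up to `Number.MAX_SAFE_INTEGER` fit in eight.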

nhrones commented 1 year ago

I set and use a flag to ignore integers for DenoKv encoding:

   } else if (typeof item === 'number') {

      // HACK: the KvDB flag emulates the DenoKv codec, which ignores integers
      if (!KvDB && Number.isSafeInteger(item) && !Object.is(item, -0)) {
         const absItem = Math.abs(item)
         let byteLen = numByteLen(absItem)
         into.need(1 + byteLen)
         console.log('Not DenoKv - number is SafeInteger')
         into.appendByte(Code.IntZero + (item < 0 ? -byteLen : byteLen))

         let lowBits = (absItem & 0xffffffff) >>> 0
         let highBits = ((absItem - lowBits) / 0x100000000) >>> 0
         if (item < 0) {
            lowBits = (~lowBits) >>> 0
            highBits = (~highBits) >>> 0
         }

         for (; byteLen > 4; --byteLen) into.appendByte(highBits >>> (8 * (byteLen - 5)))
         for (; byteLen > 0; --byteLen) into.appendByte(lowBits >>> (8 * (byteLen - 1)))

      } else {
         // Encode as a double precision float.
         const msg = (KvDB) ? "DenoKv 'forced' a double" : "Number is double"
         into.appendByte(Code.Double)
         console.log(msg)
         // We need to look at the representation bytes 
         // - which needs a temporary buffer.
         const bytes = Buffer.allocUnsafe(8)
         bytes.writeDoubleBE(item, 0)
         adjustFloat(bytes, true)
         into.appendBuffer(bytes)
      }
   }

   // Number of bytes needed to hold a non-negative integer (0 for zero).
   export const numByteLen = (num: number) => {
      let max = 1
      for (let i = 0; i <= 8; i++) {
         if (num < max) return i
         max *= 256
      }
      throw Error('Number too big for encoding')
   }
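
The double branch above depends on Node's `Buffer` and on an `adjustFloat` helper that is not shown. Below is a sketch of the same step using `DataView`, assuming the helper applies the sign transform described in the FoundationDB tuple layout: flip every bit for negative values, flip only the sign bit otherwise. `encodeDouble` is a hypothetical name.

```typescript
const DOUBLE = 0x21 // Double type code from the FDB tuple layout

// Encode a number as type code + 8 order-preserving bytes, without Node's Buffer.
function encodeDouble(value: number): Uint8Array {
  const out = new Uint8Array(9)
  out[0] = DOUBLE
  new DataView(out.buffer).setFloat64(1, value, false) // false = big-endian
  if (out[1] & 0x80) {
    // Sign bit set: flip every bit so larger magnitudes sort lower.
    for (let i = 1; i < 9; i++) out[i] ^= 0xff
  } else {
    // Sign bit clear: flip only the sign bit so these sort above negatives.
    out[1] ^= 0x80
  }
  return out
}
```

After the transform, comparing the encoded bytes lexicographically matches the numeric order of the values, which is what makes the encoding usable for keys.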
nhrones commented 1 year ago

I'd like to use es-codec for FDB(Kv) use, but the type-codes don't align well.
Was there an existing standard you used for the codes?

es-codec codes

const STRING    = 0b00001001 // 9
const NUMBER    = 0b00000110 // 6

FDB-Tuple-codec type codes

   Null = 0,
   Bytes = 1,
   String = 2,
   //Nested = 0x5,
   IntZero = 0x14,
   PosIntEnd = 0x1d,
   NegIntStart = 0x0b,
   Float = 0x20,
   Double = 0x21,
   False = 0x26,
   True = 0x27,
   UUID = 0x30,

https://github.com/apple/foundationdb/blob/main/design/tuple.md

lilnasy commented 1 year ago

It doesn't follow a spec. I went with what worked best for JavaScript, and there will be a spec soon. Maybe you can make a fork of es-codec work for FDB, but I don't think byte-compatibility with it (or any other format, for that matter) makes sense for es-codec.

Side note: at one point, zero meant null, same as FDB, but I found that to be error-prone because absence of data is also represented with zeros.

nhrones commented 1 year ago

Yes; they use Null (0) to terminate a null-terminated string, while ASCII 2 (STX, Start of Text) marks the string's start. FoundationDB/Apple put a lot of work into this encoding. It wouldn't be bad to follow. Deno-Kv multipart-key is a direct subset of Apple's FDB-Tuple-codec.
It would be nice if 'es-codec' could align with both! I could toss mine and use yours! It would make it very universal.

I linked the spec above. It has a lot of good arguments for using it.
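
To make the STX/Null framing concrete, here is a sketch of reading one string element back out. It assumes the escape rule from the FoundationDB tuple layout, where a 0x00 byte inside the string is written as 0x00 0xFF. `decodeString` is a hypothetical name, not a function from either codec.

```typescript
// Decode one tuple string starting at `offset`.
// Returns the string and the index of the next unread byte.
function decodeString(bytes: Uint8Array, offset = 0): { value: string; next: number } {
  if (bytes[offset] !== 0x02) throw new TypeError("expected string type code 0x02")
  const content: number[] = []
  let i = offset + 1
  while (i < bytes.length) {
    if (bytes[i] === 0x00) {
      if (bytes[i + 1] === 0xff) {
        // Escaped null: part of the string, not the terminator.
        content.push(0x00)
        i += 2
        continue
      }
      // Unescaped null: end of the string.
      return { value: new TextDecoder().decode(Uint8Array.from(content)), next: i + 1 }
    }
    content.push(bytes[i])
    i++
  }
  throw new RangeError("unterminated string")
}

const encoded = Uint8Array.from([0x02, 0x75, 0x73, 0x65, 0x72, 0x73, 0x00, 0x15, 0x03])
console.log(decodeString(encoded)) // value "users", next 7
```

The returned `next` index points at `0x15`, the type code of the integer that follows in the `["users", 3]` example.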

nhrones commented 1 year ago

FDB-Tuple - used by:

GO - https://pkg.go.dev/github.com/apple/foundationdb/bindings/go/src/fdb/tuple
JAVA - Package com.apple.foundationdb.tuple
RUST - https://docs.rs/fdb/latest/fdb/tuple/index.html
SWIFT - https://github.com/kirilltitov/FDBSwift/tree/master/Sources/FDB/Tuple

Tuple Definition:

(1) In a relational database, a tuple is one record (one row).

(2) A set of values passed from one programming language to another application program or to a system program such as the operating system. Typically separated by commas, the values may be parameters for a function call or a set of data values for a database.

lilnasy commented 1 year ago

To make sure I understand, es-codec should become an implementation of FDB-Tuple encoding?

nhrones commented 1 year ago

They are already equivalent with the exception of the type-code-numbers used! Remember that this is more encompassing than just multipart-keys.

Deno.Kv is just a subset of the semantics of FDB. It was designed to allow CLI-SQLite to be used to persist FDB-Tuples (records).

nhrones commented 1 year ago

FDB tuples are the recommended way to encode keys and values in FoundationDB because tuples carry some important benefits compared to using JSON (or any other encoding method):

lilnasy commented 1 year ago

I would like to enable your use case but I see big and small reasons not to emulate FDB Tuple encoding.

nhrones commented 1 year ago

Thanks for the consideration. Any problem with me using some of your es-codec code base? I need to get rid of the node-buffer stuff in mine.

lilnasy commented 1 year ago

I encourage it, it's public domain.

nhrones commented 1 year ago

thanks again. I'll give credit.