mtth / avsc

Avro for JavaScript :zap:
MIT License
1.27k stars 147 forks source link

long encoding/decoding is not reversible for some large but safe js ints #455

Closed NickKelly1 closed 5 months ago

NickKelly1 commented 6 months ago

Large integers within the safe JS integer range can get pushed above the safe JS integer range during zigzag encoding/decoding causing a loss of precision (the zigzag encoded number has a float64 collision so can't be decoded back into the original number).

unsafe zigzag encoding calculation in Tap.ptototype.writeLong

Since these source numbers are in the safe JS integer range they pass the bounds check for long types

isSafeLong checks range [Number.MIN_SAFE_INTEGER + 1, Number.MAX_SAFE_INTEGER - 1]

LongType.prototype._check bounds check using isSafeLong

LongType.prototype._write bounds check using isSafeLong

Here are some examples:

import avsc from 'avsc'

const long = avsc.Type.forSchema({ type: 'long' })
console.log(`-4503599627370497: ${long.fromBuffer(long.toBuffer(-4503599627370497))}`)
console.log(`-4503599627370499: ${long.fromBuffer(long.toBuffer(-4503599627370499))}`)
console.log(`-4503599627370501: ${long.fromBuffer(long.toBuffer(-4503599627370501))}`)
console.log(`-4503599627370503: ${long.fromBuffer(long.toBuffer(-4503599627370503))}`)
console.log(`-4503599627370505: ${long.fromBuffer(long.toBuffer(-4503599627370505))}`)
console.log(`-4503599627370507: ${long.fromBuffer(long.toBuffer(-4503599627370507))}`)
console.log(`-4503599627370509: ${long.fromBuffer(long.toBuffer(-4503599627370509))}`)
console.log(`-4503599627370511: ${long.fromBuffer(long.toBuffer(-4503599627370511))}`)
// -4503599627370497: 4503599627370496
// -4503599627370499: 4503599627370498
// -4503599627370501: 4503599627370500
// -4503599627370503: 4503599627370502
// -4503599627370505: 4503599627370504
// -4503599627370507: 4503599627370506
// -4503599627370509: 4503599627370508
// -4503599627370511: 4503599627370510
mtth commented 5 months ago

Thanks for reporting @NickKelly1 - great find. The bounds should be tighter.