openactive / data-model-validator

The OpenActive data model validator library
MIT License
1 stars 1 forks source link

Invalidate any integers greater than 2^53 #439

Open lukehesluke opened 1 year ago

lukehesluke commented 1 year ago

Context

JSON technically has no limits on size or precision of numerics. However, practically, the programming language / library used to interpret the JSON will make assumptions.

I think it would be reasonable to put at least a size limit of 2^63 - 1 on (positive) JSON integers so that they can easily be processed by languages with something like the C# or Java (signed) "long" type, the default PHP integer type on 64-bit systems, etc.

However, I would suggest going further and reducing this size limit to 2^53 - 1, which is the largest "safe" integer that can be represented in JavaScript's primitive number type. JavaScript represents a large part of the OpenActive community and also an absolutely necessary one due at least to web applications

Proposal

Checklist

RE BigInt

JavaScript has a BigInt type, which can represent larger integers, but this is awkward to use with JavaScript's standard libraries and OpenActive's data as stands. MDN suggests using JSON.parse's reviver arg to parse large numbers into BigInts in JSON (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt#use_within_json) but this can only work if the number was serialized into a string, and integers are not serialized into strings for OpenActive data.

The only other alternatives that allow a JavaScript user to therefore parse OpenActive data that has large integers is to either create a custom JSON parser (!) or use a library like https://github.com/sidorares/json-bigint https://github.com/josdejong/lossless-json, where there unfortunately don't seem to be many options

lukehesluke commented 1 year ago

One thing I will add:

I am currently getting around this on a Node.JS app that consumes RPDE feeds by using BigInts to work with those feeds whose modifieds exceed 2^53.

It is quite a chore compared to how it would be if numbers had size limits imposed. This is because at every point at which object serialization or deserialization occurs (e.g. logs, database queries, queues, HTTP, etc) or where certain libraries are used, special handling has to be introduced, otherwise errors accumulate or there is a significant loss in accuracy (or there are sorting issues in cases where BigInts are automatically converted to strings)

lukehesluke commented 11 months ago

Further difficulties with JavaScript: I've tried using https://github.com/sidorares/json-bigint in some of my own code to handle OpenActive data and can unfortunately say that it is not production-ready. There is a known issue in which it will attempt to convert floating point numbers to BigInts (if there are enough digits), which will cause an error. This means that the following OpenActive data will raise an error if processed with json-bigint:

{
  "@type": "GeoCoordinates",
  latitude: 51.52665709999999,
  longitude: -0.0882679
}

Luckily, there's another library, https://github.com/josdejong/lossless-json, which does a better job. You can replicate something like the expected behaviour of json-bigint with the following:

const { parse, isInteger } = require('lossless-json');

/**
 * @param {string} value
 * @returns {number | bigint}
 */
function convertNumberToBigIntIfLargeEnoughInt(value) {
  if (isInteger(value)) {
    const asInt = parseInt(value, 10);
    /* Note we consider equality to either of the bounds to be "too large" just
    to be extra cautious against the effects of precision loss */
    if (asInt >= Number.MAX_SAFE_INTEGER || asInt <= Number.MIN_SAFE_INTEGER) {
      return BigInt(value);
    }
    return asInt;
  }
  return parseFloat(value);
}

function jsonParseAllowingBigInts(text) {
  return parse(text, null, convertNumberToBigIntIfLargeEnoughInt);
}

> var s =
  '{"modified": 1689163234269127120371280937102987,"data": {"geo": {"@type": "GeoCoordinates","latitude": 51.52665709999999,"longitude": -0.0882679}, "remainingSpaces": 1}}';
> jsonParseAllowingBigInts(s)
{
  modified: 1689163234269127120371280937102987n,
  data: {
    geo: {
      '@type': 'GeoCoordinates',
      latitude: 51.52665709999999,
      longitude: -0.0882679
    },
    remainingSpaces: 1
  }
}