gajus opened 1 year ago
I would think it should be {foo: BigInt(9_007_199_254_740_999n)}, but I am not certain if that is even possible, nor would that be valid JSON.
The above representation is a spec violation according to IETF RFC 8259: The JavaScript Object Notation (JSON) Data Interchange Format.
I think you should handle it similarly to how Date or RegExp is commonly serialized in JSON. The most common cases I've come across are for BigInt to be serialized into either a string or a plain number.
Option 1:
// POJO
{
// Serialize into a string with "n" at the end
commentId: 9_007_199_254_740_999n,
// Serialize into ISO 8601 format
createdAt: new Date("2023-10-05T12:34:56"),
// Serialize into a string that can be passed back into the RegExp constructor
regEx: /abc[dD]/
}
// JSON
{
"commentId": "9007199254740999n",
"createdAt": "2023-10-05T12:34:56",
"regEx": "abc[dD]"
}
BigInt(json.commentId.slice(0, -1)) // BigInt is not a constructor; strip the trailing "n" before converting
new Date(json.createdAt)
new RegExp(json.regEx)
Option 2:
// POJO
{
// Serialize into an integer
commentId: 9_007_199_254_740_999n,
// Serialize into UNIX epoch in seconds
createdAt: new Date("2023-10-05T12:34:56")
}
// JSON
{
"commentId": 9007199254740999,
"createdAt": 1696509296,
}
RFC 8259 notes:

"This specification allows implementations to set limits on the range and precision of numbers accepted. [...] A JSON number such as 1E400 or 3.141592653589793238462643383279 may indicate potential interoperability problems, since it suggests that the software that created it expects receiving software to have greater capabilities for numeric magnitude and precision than is widely available. Note that when such software is used, numbers that are integers and are in the range [-(2^53)+1, (2^53)-1] are interoperable in the sense that implementations will agree exactly on their numeric values."

In other words, the caveat with Option 2 (serializing into a plain number) is Number.MAX_SAFE_INTEGER:

- JSON parsers such as libjson are generally implemented more for interoperability and will not parse integers larger than 53 bits of precision (see: Stack Overflow – Does libjson support 64 bit int types?)
- Some libraries parse integers above Number.MAX_SAFE_INTEGER into BigInt automatically (e.g. json-with-bigint)
- Custom handling is also possible via the reviver/replacer arguments for the JSON.parse/JSON.stringify functions, as sketched below.
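A minimal sketch of that reviver/replacer approach, using the string-with-n convention from Option 1 (plain JSON.stringify/JSON.parse, nothing Slonik-specific):

// Replacer: encode BigInt values as strings with a trailing "n".
const json = JSON.stringify(
  { commentId: 9_007_199_254_740_999n },
  (key, value) => (typeof value === 'bigint' ? `${value}n` : value),
);
// => '{"commentId":"9007199254740999n"}'

// Reviver: turn "<digits>n" strings back into BigInt values.
const pojo = JSON.parse(json, (key, value) =>
  typeof value === 'string' && /^-?\d+n$/.test(value)
    ? BigInt(value.slice(0, -1))
    : value,
);
// => { commentId: 9007199254740999n }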
Thank you.
What do you think about just throwing an error if an attempt is made to serialize BigInt into JSON? I.e., require that the user cast the value to X before serialization, e.g.
await pool.oneFirst(sql.unsafe`
SELECT json_object_agg('foo', ${BigInt(9_007_199_254_740_999n)}::bigint::text)
`);
The above is perfectly fine.
My thinking is that this way the handling of the value does not depend on Slonik and it is less likely to produce unexpected results.
The error message could include suggestions of how to cast the value.
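Something along these lines, purely as a sketch of the idea (not Slonik's actual code; the function name and message are made up):

// Hypothetical guard run against bound values before they are serialized into JSON.
const assertJsonSerializable = (value: unknown): void => {
  if (typeof value === 'bigint') {
    throw new TypeError(
      'BigInt values cannot be serialized into JSON. Cast the value explicitly, e.g. ::bigint::text or ::numeric.',
    );
  }
};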
I think throwing an error may be a valid behavior, because BigInt introduces a footgun that will take many devs by surprise if they're not keenly aware of the implementation details and differences of JavaScript's BigInt vs PostgreSQL's bigint. That being said, I think it should be well documented, and Slonik should provide an escape hatch for advanced developers who know what they're getting themselves into (i.e. it should be opt-in functionality) instead of blocking them from doing their work.
Just to highlight how error-prone handling BigInt can be, I need to remind myself that the BigInt type in JavaScript is very different from the bigint data type in PostgreSQL, even though they have pretty much identical type names.
In PostgreSQL, bigint is just a type alias for int8, i.e. a signed 8-byte/64-bit integer, which has an upper bound and isn't arbitrarily large. In this case, it's 2^63 - 1 = 9,223,372,036,854,775,807.
However, BigInt in JavaScript theoretically has no upper bound (practically, Chrome defines a limit of a billion bits, so we can infer that Node.js' V8 engine has the same limitation). You can make a number arbitrarily large, well past the upper bound of a 64-bit signed integer. BigInt does not prevent you from creating a number that's many orders of magnitude larger, such as 170,141,183,460,469,231,731,687,303,715,884,105,727 (2^127 - 1), which is an integer with 39 digits.
Reading a 64-bit integer out of PostgreSQL as BigInt is not a big deal, but you can very easily overflow PostgreSQL's bigint data type by binding a BigInt value that's larger than 2^63 - 1.
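A tiny illustration of that asymmetry in plain JavaScript:

// PostgreSQL bigint/int8 tops out at 2^63 - 1; JavaScript BigInt does not.
const PG_BIGINT_MAX = 2n ** 63n - 1n; // 9_223_372_036_854_775_807n
const huge = 2n ** 127n - 1n;         // perfectly legal BigInt in JavaScript
console.log(huge > PG_BIGINT_MAX);    // true – binding this to a bigint column overflows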
Therefore, the most analogous data type in PostgreSQL is not actually bigint, but rather the numeric type. numeric is one of the "user-defined precision" types and lets you specify up to 131,072 digits before the decimal point and up to 16,383 digits after the decimal point. I'm pretty sure that for most intents and purposes this is an "unlimited upper bound". (For comparison, the number of atoms in the observable universe is about 10^82, i.e. 83 digits before the decimal point.)
The basic rule of thumb should be:
Use BigInt when selecting existing data out of a PostgreSQL table with a bigint column and there's no value binding anywhere:
const schema = z.object({ id: z.bigint(), name: z.string() });
const foo = await pool.many(sql.type(schema)`SELECT id, name FROM foo`);
Binding BigInt values should be done with caution:
- If binding to a bigint/int8 column, some input validation is needed to make sure the value will always be less than or equal to 2^63 - 1 (e.g. z.bigint().lte(9_223_372_036_854_775_807n)); see the sketch after this list.
- Otherwise, type cast to ::numeric so that you don't run into integer overflows that are pretty much only detectable at runtime AFTER you've deployed your code to production.
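A minimal sketch of that validation, assuming Zod (as in the snippets above):

import { z } from 'zod';

// PostgreSQL bigint ranges from -2^63 to 2^63 - 1.
const pgBigint = z
  .bigint()
  .gte(-9_223_372_036_854_775_808n)
  .lte(9_223_372_036_854_775_807n);

pgBigint.parse(9_007_199_254_740_999n);     // ok
pgBigint.parse(9_223_372_036_854_775_808n); // throws ZodError – larger than 2^63 - 1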
For example, given this query:
const biggerThan64bit = 9223372036854775808n;
const schema = z.object({ biggerThan64bit: z.bigint() });
const num = await pool.oneFirst(sql.type(schema)`SELECT ${biggerThan64bit}::bigint AS "biggerThan64bit"`);
You'll get a query error like this:
Query 1 ERROR: ERROR: bigint out of range
This query is perfectly valid:
const biggerThan64bit = 9223372036854775808n;
const schema = z.object({ biggerThan64bit: z.bigint() });
const num = await pool.oneFirst(sql.type(schema)`SELECT ${biggerThan64bit}::numeric AS "biggerThan64bit"`);
In the raw psql client, PostgreSQL seems to take Option 2 and serialize into a plain number for JSON:
SELECT json_build_object('biggerThan64bit', '9223372036854775809'::numeric) AS big_json;
-- Returns {"biggerThan64bit": 9223372036854775809}
So at least the PL/pgSQL dialect seems to treat that as valid JSON, FWIW, regardless of whether this causes issues for the node-postgres driver.
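If the resulting JSON payload is later run through JSON.parse (by the driver or anything downstream), the out-of-range integer silently loses precision, which is presumably where problems would show up:

JSON.parse('{"biggerThan64bit": 9223372036854775809}').biggerThan64bit;
// => 9223372036854775808 – rounded to the nearest double, not the value PostgreSQL sent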
Executing in TablePlus: [screenshot]
Very interesting. Learned a lot! Thank you
You're welcome!
- numeric/decimal trades off performance for precision quite drastically, so it should be used very sparingly, for business use cases that require absolute precision (e.g. banking, financial transactions, etc.)
- For most use cases, 53 bits of integer precision should be enough (see the quick check below). Even if you use them for primary-key id fields, that gives you over 9 quadrillion unique rows before you have to migrate to something else like GUIDs (in which case, either your business is a unicorn or your engineering team is doing something very wrong). For storing datetimes as a UNIX epoch, even at millisecond granularity, a 53-bit integer can describe dates 232,000 years into the future (I highly doubt humanity, let alone software written in JavaScript, will survive that long).
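A quick sanity check of that 53-bit headroom in plain JavaScript:

Number.MAX_SAFE_INTEGER; // 9_007_199_254_740_991 (2^53 - 1), i.e. just over 9 quadrillion
Date.now();              // ≈ 1.7e12 ms since the Unix epoch today – nowhere near 2^53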
All this is to say, engineers should think before they start introducing BigInt everywhere in the codebase, because it should be a deliberate, educated decision made after learning about the trade-offs.
given this test:
What is the expected value of the result?
I would think it should be {foo: BigInt(9_007_199_254_740_999n)}, but I am not certain if that is even possible, nor would that be valid JSON.
I am leaning towards throwing an error.
Probably related to https://github.com/gajus/slonik/issues/513