observablehq / framework

A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front-end for interactive graphics with any language on the back-end for data analysis.
https://observablehq.com/framework/
ISC License
2.23k stars 91 forks source link

Cannot convert a BigInt value to a number #1119

Closed iconara closed 4 months ago

iconara commented 4 months ago

When I use the SQL feature, in some circumstances I get "TypeError: Cannot convert a BigInt value to a number" when trying to plot the results of a query.

It seems the Arrow table that is returned from queries can contain numeric types that Plot doesn't support. All I do is queries like this:

'''sql id=myTimeseries
SELECT date,
       SUM(my_integer_column) AS my_integer_column
  FROM my_table
 GROUP BY 1
 ORDER BY 1
'''

and pass that into Plot.

I also get the same error when I try to pass a value into a D3 formatter, for example:

'''js
const integerFormat = d3.format(',d')
'''

${integerFormat(myTimeseries.get(0).my_integer_column)}
Fil commented 4 months ago

Can you give more specifics of how you pass it to Plot? We've fixed a number of cases where it wouldn't work, but we might have missed some (https://github.com/observablehq/plot/issues/1278).

A way to fix this is to cast the values to "normal" numbers in DuckDB:

SELECT date,
       SUM(my_integer_column)::INT AS my_integer_column

Doing it in JavaScript is also possible but requires more work:

${integerFormat(Number(myTimeseries.get(0).my_integer_column))}

iconara commented 4 months ago

I tried narrowing down the problem to something reproducible:

'''sql id=rows
WITH data (n) AS (VALUES (1234567890), (9876543210))
SELECT SUM(n) AS n
FROM data
'''

Expected: ${1234567890 + 9876543210}<br>
Raw: ${rows.get(0).n}<br>
Number(): ${Number(rows.get(0).n)}<br>
parseInt(): ${parseInt(rows.get(0).n)}<br>
d3.format: ${d3.format(',d')(rows.get(0).n)}
typeof: ${typeof rows.get(0).n}<br>

${Inputs.table(rows)}

This will output:

Expected: 11111111100
Raw: 2521176508200
Number(): 1.2438821514721054e+31
parseInt(): 11111111100
d3.format:
TypeError: Cannot convert a BigInt value to a number
d3.sum: ${d3.sum(rows, (d) => d.n)}
TypeError: Cannot convert a BigInt value to a number
typeof: object

I also tried reducing the issue with Plot, and as far as I can tell the error comes from my code using Plot.groupX({y: 'sum'}, …), which I assume uses something like d3.sum under the hood. When I remove that it seems like it uses Number(…) to coerce the values. This doesn't cause an error, but ends up using the wrong values, like above.

As you can see there seems to be multiple issues. Number(…) does not correctly coerce the BigInt into a number. The expected number is large, but not larger than Number.MAX_SAFE_INTEGER. parseInt(…) has no problem coercing it into the correct number.

I can't fix the issue by casting to INT in my SQL, since the values are larger than 2^31.

iconara commented 4 months ago

It gets even weirder. If I add another row where I produce the same number in another way, Number(…) returns a different output for each row, but the raw value, and the value returned from parseInt(…) are the same:

'''sql id=rows
WITH data (n) AS (VALUES (1234567890), (9876543210))
SELECT SUM(n) AS n
FROM data
UNION ALL
SELECT 1234567890 + 9876543210 AS n
'''

Raw: ${rows.get(0).n}<br>
Raw: ${rows.get(1).n}<br>
Number(): ${Number(rows.get(0).n)}<br>
Number(): ${Number(rows.get(1).n)}<br>
parseInt(): ${parseInt(rows.get(0).n)}<br>
parseInt(): ${parseInt(rows.get(1).n)}<br>

Outputs:

Raw: 2521176508200
Raw: 2521176508200
Number(): 1.2438821514721054e+31
Number(): 4.460189873590169e+44
parseInt(): 11111111100
parseInt(): 11111111100
mbostock commented 4 months ago

You can cast to ::float in SQL.

mbostock commented 4 months ago

I’ve reduced this as:

SELECT 11111111100::INT128 AS n

In Apache Arrow, this type is represented as a DecimalBigNum instance, which is internally represented as an array. It appears to contain the following parts:

rows.get(0).n // [2521176508, 2, 0, 0]

It seems that Arrow’s bigNumToNumber is implemented incorrectly, as this returned the wrong value:

Number(rows.get(0).n) // 4.460189873590169e+44

However, bigNumberToString does appear to be correct:

String(rows.get(0).n) // "11111111100"

Which means you can get the correct number by forcing string coercion first, and then number coercion:

+String(rows.get(0).n) // 11111111100

But in any case, this is a bug in Apache Arrow, so we should file a bug there. As a workaround, I suggest that you avoid the INT128 type and instead cast to DOUBLE.

mbostock commented 4 months ago

This appears to be covered by https://github.com/apache/arrow/issues/37920.