What to do about `B` and `-2147483648` in mappings

nicolo-ribaudo commented 3 months ago

Given that until https://github.com/tc39/source-map/pull/119 we were very hand-wavy about how out base64 VLQs work, there is an ambiguity when it comes to decoding B (whose bit pattern is 00001).

My reading of our spec was that:

the continuation bit (the left-most one is 0) -> this VLQ is complete
the value bits (the middle 4) are 0000 -> the absolute value of the number is 0
the sign bit (the right-most one is 1) -> the value needs to be negated

and thus B decodes to "-0", which is simply 0 because we are working with integers.

@jridgewell's reading is that (@jridgewell is this correct? I tried applying this same "algorithm" to F and it does not work):

the continuation bit -> this VLQ is complete
the value bits are 0000 -> given that we are working with i32, the 31 right-most bits of the resulting value are all zeroes
the sign bit is 1 -> the i32 representation of the decoded value is 1 followed by 31 zeroes

and this B decodes to 1000_0000_0000_0000_0000_0000_0000_0000, which when interpreted as an i32 is -2147483648

I did some testing in various implementations:

Test 1

0;throw new Error("This is an error");
// { "version": 3, "sources": ["original.js"], "mappings": "AA0gggggCA,CABA,CA0gggggCA" }
//# sourceMappingURL=data:application/json;base64,eyAidmVyc2lvbiI6IDMsICJzb3VyY2VzIjogWyJvcmlnaW5hbC5qcyJdLCAibWFwcGluZ3MiOiAiQUEwZ2dnZ2dDQSxDQUJBLENBMGdnZ2dnQ0EiIH0=

0gggggC is 2^30+10, so in implementations that decode B as -2^31 the error will be reported as being thrown at line 2^30+10 - 2^31 + 2^30+10 = 20 (or 21 if 1-based)

In implementations that decode B as 0 this overflows, so the result is unspecified and test does not apply.

Test 2

0;throw new Error("This is an error");
// { "version": 3, "sources": ["original.js"], "mappings": "AACA,EABA" }
//# sourceMappingURL=data:application/json;base64,eyAidmVyc2lvbiI6IDMsICJzb3VyY2VzIjogWyJvcmlnaW5hbC5qcyJdLCAibWFwcGluZ3MiOiAiQUFDQSxFQUJBIiB9

C is 1, so in implementations that decode B as 0 the error will be reported as being thrown at line 1 + 0 = 1 (or 2 1-based)

In implementations that decode B as -2^31 this yields a negative number (-2147483647), so the test does not apply.

These are the results:

	Test1	Test2	Conclusion: B is decoded as
Node.js using `--enable-source-maps`	21 :heavy_check_mark:	-2147483646 :x:	-2^31
Node.js using `npm:source-map-support`	-2147483627 :x:	2 :heavy_check_mark:	0
Deno	2147483669 :x:	2 :heavy_check_mark:	0
Chrome	-2147483627 :x:	2 :heavy_check_mark:	0
Firefox	-2147483627 :x:	2 :heavy_check_mark:	0
Safari	-2147483627 :x:	2 :heavy_check_mark:	0
@jridgewell/sourcemap-codec	20 :heavy_check_mark:	-2147483647 :x:	-2^31

Bun

I also tried Bun using [this trick](https://x.com/jarredsumner/status/1828369775472636289). You need to prefix the file with `@bun` line, prefix the `"mappings"` string with `;`, and re-encode the source map to Base 64. However, in both test case it errors with "Could not decode sourcemap".

My proposal

We should clarify that B is decoded to 0 and not to -2^31, ratifying what Deno/Chrome/Firefox/Safari do.

I propose that we go even one step further, and say that encoders must encode 0 as A and not as B. This carves out some space in the mappings field in case we'll ever need it in the future: we can one day give a special meaning to ,B,, and tools that do not recognize this meaning would simply fallback to considering it as a no-op :)

jridgewell commented 3 months ago

(@jridgewell is this correct? I tried applying this same "algorithm" to F and it does not work):

Decoding is if (negative) return 0x8000_0000|-v. The -v ensures the bit pattern is 2s-complement for the OR, and the 0x8000_0000 ensure the sign bit is always set (it only matters for -0, who's 2s-complement is the same as regular 0).

These are the results:

There are also bugs with round-tripping values >= to 1,073,741,824.

Engine	1073741823	1073741824	2147483648
Chrome	✅ 1073741824	❌ -1073741823 (`>>` used)	✅ 1
Safari	✅ 1073741824	❌ -1073741823 (`>>` used)	✅ 1
Firefox	✅ 1073741824	✅ 1073741825 (i64 used internally)	❌ -2147483647 (and it takes like 10 minutes!)
Node native	✅ 1073741824	✅ 1073741825 (based on my old Chrome code)	✅ 1
Node source-map-support	✅ 1073741824	❌ -1073741823 (`>>` used)	✅ 1
Deno	✅ 1073741824	✅ 1073741825 (I can't find their impl)	❌ 2147483649
Bun	✅ 1073741824	✅ 1073741825 (u32 internally)	🟨 0 (supposed to be 1-based)
source-map-codec	✅ 1073741823	✅ 1073741824 (`>>>` used)	✅ 0

Values this large have the 31st bit set, and when we encode we perform (v << 1), shifting that 31st bit to the 32nd bit (the sign bit of the result!). If they don't use >>> internally, then the sign bit will not shift downwards when decoding, and the result will remain negative.

Rather than standardize broken behavior, I think we should fix the impls, including the proper decoding of B.

jridgewell commented 3 months ago

To help with this discussion, I built https://justin.ridgewell.name/integer-toy/

tc39 / source-map

What to do about `B` and `-2147483648` in mappings #123

My proposal