Closed kdy1 closed 8 months ago
Hi, thank you for the contribution!
I think there is a fundamental misunderstanding here: it seems to me that the both decode_rmi
function and the loop in decode_regular
in which it is used assume that there is at most one range mapping per line. This is the only way I can see
if rmi != 0 && (line_index as u32) == rmi - 1 {
range_tokens.push(tokens.len() as u32)
}
making sense—if the current index within the line is the number of the range token, we push the current token index to the list of range tokens.
But I think the proposal is meant to work differently: there can be many range mappings per line, and the i
th bit in the rangeMappings
for that line (modulo bit/byte endianness, I actually find this slightly confusing) tells you whether the i
th mapping is one of them.
Consider this example: currently, decode_rmi("AAB") == 13
. This is correct insofar as "AAB"
makes the 13th mapping a range mapping, but the function shouldn't return 13, it should return the number with only the 13th bit set ¹. The current implementation clearly doesn't generalize to more than one range mapping per line.
Correctness aside, I'm really excited about this. These range mappings would've saved me a lot of pain in the past.
¹ Or better yet, as @Swatinem says, a bit vector.
I believe converting between 0- and 1-based indices causes some unnecessary confusion here.
range_tokens
.SourceMap::get_token
you binary search range_tokens
for the token's index, which is 0-based.test_decode_rmi
, you convert to 1-based indices, I assume for purposes of clarity?I would say remove the 1-based indexing in (4) and (5) (and arguably (3) as well) and just assume 0-based indices everywhere.
Just a heads-up—I haven't forgotten about this, I just haven't had the time to look at it.
@Swatinem I added some tests for lookup_token
The test now makes it a bit clearer what this proposal is trying to achieve.
So if I understand it correctly, new bitvec mapping adds this is_range
flag.
And the is_range
flag determines if the mapping is using the "offset" from the initial binary search lookup to then adjust the output line/column numbers.
With a more visual example:
/* foo */ if (some_other_code) { return; }
// ^ - original token
// v - output token
if (some_other_code) { return; }
// ^ - lookup position
So the offset between lookup position
and output token
is added to original token
, so that you end up exactly at the same position in relative code if it is merely shifted a bit by the sourcemap?
Yeap, exactly. This will be useful for the minifier, which does not have full input source locations.
This is the case because no tools emit a perfect source map while transpiling, and the mappings in inputSourceMap
is not enough. With this RFC, it will have much more information about the input.
See https://github.com/tc39/source-map-rfc/blob/main/proposals/range-mappings.md