The API is probably a bit confusing, so just to clarify: please note that in a "real world" application, the length and the endIndex properties are not used. In that example, the endIndex is calculated from the next token's startIndex, or from the line length if the next token is null.
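To make that rule concrete, here is a minimal sketch in Python (not the library's own code). The function name `end_indexes` and the plain list of start indexes are assumptions made for illustration only; the idea is just "end of a token = start of the next one, or the line length for the last token".

```python
from typing import List, Tuple


def end_indexes(start_indexes: List[int], line_length: int) -> List[Tuple[int, int]]:
    """Pair each start index with an exclusive end index.

    Hypothetical helper: the end of a token is the start of the next
    token, or the line length when there is no next token.
    """
    spans = []
    for i, start in enumerate(start_indexes):
        has_next = i + 1 < len(start_indexes)
        end = start_indexes[i + 1] if has_next else line_length
        spans.append((start, end))
    return spans


line = "// comment"
print(end_indexes([0, 2], len(line)))  # [(0, 2), (2, 10)]
```

With this approach only the start indexes are needed as input, which is why the other two properties can be ignored in practice.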
Right now it works like this. Imagine the following line:
// comment
These are the indexes:
0 1 2 3 4 5 6 7 8 9
/ / ␣ c o m m e n t   (␣ marks the space at index 2)
Right now, TextMate is returning the following tokens:
tokens[0]: startIndex=0; endIndex=2; length=2;
tokens[1]: startIndex=2; endIndex=10; length=8;
The startIndex is inclusive and the endIndex is exclusive, so each token works like the half-open interval [startIndex, endIndex). The length is a calculated property.
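As a rough illustration of the half-open interval, here is a small Python sketch. The `Token` class and the `covers_line` helper are hypothetical stand-ins written for this example, not the library's types; the only things they assume are an inclusive start, an exclusive end, and length = endIndex - startIndex.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a token, not the library's type."""
    start_index: int  # inclusive
    end_index: int    # exclusive

    @property
    def length(self) -> int:
        # Calculated from the two indexes, never stored separately.
        return self.end_index - self.start_index

    def text(self, line: str) -> str:
        # Python slices are half-open too, so the indexes map directly.
        return line[self.start_index:self.end_index]


def covers_line(tokens: List[Token], line: str) -> bool:
    """True when the tokens tile the line with no gaps or overlaps."""
    position = 0
    for token in tokens:
        if token.start_index != position or token.end_index < token.start_index:
            return False
        position = token.end_index
    return position == len(line)


line = "// comment"
tokens = [Token(0, 2), Token(2, 10)]
for token in tokens:
    print(repr(token.text(line)), token.length)
print(covers_line(tokens, line))
```

With exclusive end indexes, each token's end equals the next token's start and the lengths add up to the line length, so a check like `covers_line` is one possible way to flag a token that runs past the end of the line.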
You can see the implementation details here.
I agree that the API would probably be more intuitive with the following values:
tokens[0]: startIndex=0; endIndex=1; length=2;
tokens[1]: startIndex=2; endIndex=9; length=8;
This implementation is a port of tm4e, so I just copied the behavior from that repo. I didn't want to change it, so that it keeps matching the implementation in the upstream repository.
Thank you very much for the quick response! Now it is clear to me.
Hello,
I am confused about the StartIndex, EndIndex and Length of IToken. I was writing a demo to parse proto (ProtoBuf) messages. While parsing a comment line, the string had 26 chars, but the EndIndex and Length values both came out as 27.
This is the snippet of the textmate json:
To me, this looks like a bug; however, the example in the readme also contains code that fixes the length values.
Is this a nasty fix, or is it intended behavior? Why do I need this behavior? I would like to use the length values to detect possible parsing errors, but without knowing the behavior, this is not possible.