Pass position data to Rule.onMatch?

YellowAfterlife commented 6 years ago

Hello!

Would it be possible, or reasonable, to add support for getting the row (and perhaps the column) inside Rule.onMatch? If not, what is the usual way of handling this?

Currently, BackgroundTokenizer:$tokenizeRow already passes the row to Tokenizer:getLineTokens, and getLineTokens knows the column. However, getLineTokens does not store the row argument and does not pass either value to rule.onMatch.

Considerations: Existing modes occasionally call getLineTokens without the row argument, most commonly in getNextLineIndent to check whether a line ends with a comment. The documentation would need a note that the row can be missing in such calls.
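To make the request concrete, here is a minimal sketch of what passing position data could look like. It is a simplified stand-in written for this thread, not Ace's actual Tokenizer: the function body, the rule shape, and the extra `row` and `column` arguments to `onMatch` are all assumptions for illustration. It also shows the case from the note above, where a caller omits the row and the handler receives `undefined`.

```javascript
// Simplified stand-in for a line tokenizer (NOT Ace's real implementation).
// Hypothetical onMatch signature: (value, state, row, column).
function getLineTokens(rules, line, startState, row) {
    const tokens = [];
    let column = 0;
    while (column < line.length) {
        let matched = false;
        for (const rule of rules) {
            // Sticky flag: the rule must match exactly at the current column.
            const re = new RegExp(rule.regex.source, "y");
            re.lastIndex = column;
            const m = re.exec(line);
            if (!m || m[0].length === 0) continue;
            const type = rule.onMatch
                ? rule.onMatch(m[0], startState, row, column)
                : rule.token;
            tokens.push({ type, value: m[0] });
            column += m[0].length;
            matched = true;
            break;
        }
        if (!matched) {
            // No rule applies: emit a single character as plain text.
            tokens.push({ type: "text", value: line[column] });
            column += 1;
        }
    }
    return { tokens, state: startState };
}

const seen = [];
const rules = [
    { regex: /\s+/, token: "text" },
    {
        regex: /\w+/,
        onMatch(value, state, row, column) {
            seen.push({ value, row, column });
            return "identifier";
        },
    },
];

const result = getLineTokens(rules, "foo = bar", "start", 3);
// seen is now:
// [{ value: "foo", row: 3, column: 0 }, { value: "bar", row: 3, column: 6 }]

// A caller that omits the row (as some modes do) yields row === undefined.
getLineTokens(rules, "x", "start");
```

A handler written against such a signature would have to treat an undefined row as "position unknown" and fall back to a default token type.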

Context: The target language (GameMaker Language) uses scoping logic somewhat similar to that of ECMAScript: if you don't specify where a variable should be read from or written to, and it was not declared as a local or global, it is assumed to be on the executing instance. At the same time, the language allows defining multiple functions in a single file, conveniently delimited by a marker line (such as ^#define (\w+); no nesting). With position data available in onMatch, it becomes possible to tell which function in the file an identifier token belongs to, which helps the reader by color-coding each kind of scope access differently.

nightwing commented 6 years ago

How are you using the row in the onMatch function? One problem with passing it is that inserting a line would potentially make all the tokens after that line invalid, even if the state doesn't change.

YellowAfterlife commented 6 years ago

In this case (and most similar uses, I suppose), to find which "scope" a token belongs to, the program traverses upwards from the token's line until it finds a declaration line (such as #event draw shown in the screenshot). The name of the latter is then used to get a name->token type lookup map and see whether the identifier is a local variable or not.

That information (line number->scope) is actively cached, so if a scope is known for line 4, the lookup for a token on line 5 will check line 5 for a declaration and, if there isn't one, assume the scope to be that of line 4. The cache is invalidated whenever the total line count changes.

This works remarkably well for this case: inserting a line shifts all subsequent lines downwards, but also shifts their declarations, so the identifiers still belong to the same scopes and the highlighting remains valid even though only 1-2 lines were re-tokenized.

One edge case is splitting a function in two (which will not automatically re-tokenize most of the new second half), but it is rather rare to do that while still having shared variables between the two halves. Even then, editor.session.bgTokenizer.start(0) is called shortly after a file is saved (after re-indexing the actual variable declarations per scope and other "heavier" logic), so things don't stay broken for long.
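The cached upward lookup described in this comment could be sketched roughly as follows. This is an illustration only, not the actual code: the `ScopeCache` class and its `getLine`/`getLength` callbacks are hypothetical names, and the #define pattern is borrowed from the first post rather than the #event form.

```javascript
// Hypothetical helper: maps a row to the name of its enclosing scope.
const DECLARATION = /^#define (\w+)/; // delimiter pattern from the first post

class ScopeCache {
    constructor(getLine, getLength) {
        this.getLine = getLine;     // (row) => string
        this.getLength = getLength; // () => total number of lines
        this.lineCount = -1;
        this.cache = new Map();     // row -> scope name (or null)
    }

    scopeAt(row) {
        // Invalidate everything whenever the total line count changes.
        const length = this.getLength();
        if (length !== this.lineCount) {
            this.cache.clear();
            this.lineCount = length;
        }
        const pending = [];
        let scope = null; // rows above the first declaration have no scope
        for (let r = row; r >= 0; r--) {
            if (this.cache.has(r)) {
                scope = this.cache.get(r);
                break;
            }
            pending.push(r);
            const m = DECLARATION.exec(this.getLine(r));
            if (m) {
                scope = m[1];
                break;
            }
        }
        // Every row visited on the way up shares the scope that was found.
        for (const r of pending) this.cache.set(r, scope);
        return scope;
    }
}

const lines = ["#define draw", "x = 1", "y = 2", "#define step", "x += 1"];
const scopes = new ScopeCache((row) => lines[row], () => lines.length);

const before = [scopes.scopeAt(2), scopes.scopeAt(4)]; // ["draw", "step"]

// Inserting a line shifts rows but leaves scope membership intact.
lines.splice(1, 0, "z = 0");
const after = [scopes.scopeAt(3), scopes.scopeAt(5)];  // ["draw", "step"]
```

Like the approach it illustrates, this sketch only invalidates on a line-count change, so renaming a declaration in place would leave stale entries until the next full re-tokenization.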

github-actions[bot] commented 2 years ago

This issue has not received any attention in 1 year. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

YellowAfterlife commented 2 years ago

I'm still doing this and now also passing the line-string itself to onMatch.

I have also experimented with supporting "complex" states (e.g. { state: "start", bracketDepth: 1, func: "draw" } vs just "start") as a way to implement semantic syntax highlighting, but doing so more or less requires implementing a linter inside a syntax highlighter, which is more complex.
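One practical wrinkle with object states, sketched below, is that two structurally identical objects are never `===` equal in JavaScript, and coercing them to strings collapses them all to "[object Object]". Any code that compares end-of-line states either way would misjudge whether a state changed. Encoding the state as a string key is one possible workaround; the `encodeState` and `decodeState` helpers are hypothetical, and whether a given tokenizer compares states this way is an assumption rather than something stated in this thread.

```javascript
const stateA = { state: "start", bracketDepth: 1, func: "draw" };
const stateB = { state: "start", bracketDepth: 1, func: "draw" };

const sameByIdentity = stateA === stateB;             // false: distinct objects
const sameByCoercion =
    String(stateA) === String({ state: "other" });    // true: both "[object Object]"

// Hypothetical helpers: flatten the state into a comparable string key.
// Assumes state and func never contain the "|" separator.
function encodeState(s) {
    return `${s.state}|${s.bracketDepth}|${s.func}`;
}

function decodeState(key) {
    const [state, depth, func] = key.split("|");
    return { state, bracketDepth: Number(depth), func };
}

const keyA = encodeState(stateA); // "start|1|draw"
const keyB = encodeState(stateB); // "start|1|draw"
const keyC = encodeState({ state: "start", bracketDepth: 1, func: "step" });

const equalKeys = keyA === keyB;      // true: same content, same key
const differentKeys = keyA === keyC;  // false: func differs
const roundTrip = decodeState(keyA);  // { state: "start", bracketDepth: 1, func: "draw" }
```

The trade-off is that every distinct combination of fields becomes its own state name, which is part of why this approach grows towards a linter-sized problem.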

YellowAfterlife commented 8 months ago

Not much has changed, though it did occur to me that there are more problems with complex states than I originally envisioned.

ajaxorg / ace

Pass position data to Rule.onMatch? #3500