trishume / syntect

Rust library for syntax highlighting using Sublime Text syntax definitions.
https://docs.rs/syntect
MIT License

sublime-syntax branch_point support #271

Open keith-hall opened 4 years ago

keith-hall commented 4 years ago

Sublime Text build 4050 introduced a new feature to sublime-syntax grammars called branch points. The docs haven't been published yet, but the packages which ship with Sublime Text have started to use this new functionality, e.g. https://github.com/sublimehq/Packages/commit/c08f85346a04d2e12990c3008e7666306b59a05b

This uses backtracking to allow looking ahead across multiple lines (up to 128), which enables non-deterministic parsing as described at https://github.com/SublimeTextIssues/Core/issues/2241
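
As a rough illustration of the rewind idea, here is a toy sketch in Rust. It is not how Sublime Text or syntect implements anything: the `Step` and `Alternative` types, the `resolve_branch` function, and the arrow-function heuristics are all hypothetical, and committing to an alternative once the 128-line window runs out without a failure is an assumption.

```rust
/// What a (hypothetical) alternative reports after looking at one line.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Step {
    Continue, // still ambiguous, keep reading
    Accept,   // this alternative is confirmed
    Fail,     // comparable to hitting `fail:`; rewind to the branch point
}

/// Two made-up alternatives for an ambiguous `(` in a JavaScript-like language.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Alternative {
    ArrowFunction,
    ParenthesizedExpression,
}

impl Alternative {
    /// Deliberately naive checks, for illustration only.
    fn check(self, line: &str) -> Step {
        match self {
            Alternative::ArrowFunction if line.contains("=>") => Step::Accept,
            Alternative::ArrowFunction if line.contains(';') => Step::Fail,
            Alternative::ArrowFunction => Step::Continue,
            Alternative::ParenthesizedExpression => Step::Accept,
        }
    }
}

/// Rewind window in lines, taken from the "up to 128" figure above.
const MAX_REWIND_LINES: usize = 128;

/// Try each alternative in order, restarting at `start` after every failure.
fn resolve_branch(
    lines: &[&str],
    start: usize,
    alternatives: &[Alternative],
) -> Option<Alternative> {
    for &alternative in alternatives {
        let mut failed = false;
        // Every alternative begins again at the same line: that is the rewind.
        for &line in lines.iter().skip(start).take(MAX_REWIND_LINES) {
            match alternative.check(line) {
                Step::Continue => {}
                Step::Accept => return Some(alternative),
                Step::Fail => {
                    failed = true;
                    break;
                }
            }
        }
        if !failed {
            // Assumption: no failure inside the window means we commit.
            return Some(alternative);
        }
    }
    None // every alternative failed (a real engine needs a fallback here)
}
```

With the input lines `"(a,"`, `" b)"`, `" => a + b;"` the first alternative is accepted on the third line; with `"(a,"`, `" b);"` it fails on the second line and the sketch rewinds to try the parenthesized expression.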

My understanding is that official documentation should be coming soon, but I wanted to log this here now in case we want to get a head start, e.g. by experimenting with it in Sublime Text to see how it works and thinking about how to implement it in syntect. This also serves as a note that updating the sublimehq/Packages submodule will likely cause syntax definitions that rely on this feature to not work as expected until syntect adds support for it.

jrappen commented 4 years ago

For other possible breaking changes, comparing the st3 branch against master might also help:

jrappen commented 4 years ago

It might make sense to point the Packages submodule at the st3 branch until this is fixed.

jrappen commented 4 years ago

Changed the git submodule target for Packages to the st3 branch for now in #279. This should be reverted to targeting the master branch once all Sublime syntax features are supported by syntect.

Keats commented 3 years ago

Was some documentation added on that? Is it a big feature?

keith-hall commented 3 years ago

It's a big feature. The current line-by-line API isn't suitable, because parsing the next line could change the tokens for a previous line if a branch point failed and the context stack/parse state was rewound. There are some docs, but sublimehq have asked not to link to them publicly during the closed beta; you can find a link on the Sublime Text Discord server.
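
To make the problem concrete, here is a minimal sketch of the kind of buffering a caller would need. Everything in it is hypothetical: `PendingOutput` and its methods are not part of syntect, and the scope labels are placeholders. The point is only that tokens produced after an unresolved branch point cannot be handed out yet, because a later line may invalidate them.

```rust
/// One highlighted line: (line number, scope label). The scope labels used
/// with this type are illustrative, not taken from a real grammar.
type LineTokens = (usize, String);

/// Hypothetical output buffer; not part of syntect.
#[derive(Debug, Default)]
struct PendingOutput {
    /// Lines whose tokens can no longer change.
    committed: Vec<LineTokens>,
    /// Lines parsed since an unresolved branch point.
    tentative: Vec<LineTokens>,
    in_branch: bool,
}

impl PendingOutput {
    /// A branch point was reached: later output is provisional.
    fn enter_branch(&mut self) {
        self.in_branch = true;
    }

    /// Record the tokens for one parsed line.
    fn push(&mut self, line_no: usize, scope: &str) {
        let target = if self.in_branch {
            &mut self.tentative
        } else {
            &mut self.committed
        };
        target.push((line_no, scope.to_string()));
    }

    /// The current alternative was confirmed: provisional output becomes final.
    fn accept(&mut self) {
        self.committed.append(&mut self.tentative);
        self.in_branch = false;
    }

    /// The current alternative failed: discard the provisional output and
    /// return the line numbers that must be parsed again.
    fn fail(&mut self) -> Vec<usize> {
        self.tentative
            .drain(..)
            .map(|(line_no, _)| line_no)
            .collect()
    }
}
```

A caller that streams lines to a terminal would only be able to print `committed`; anything in `tentative` might still be replaced after a `fail`.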

jrappen commented 3 years ago

New docs can be found at:

michaelblyons commented 2 years ago

Now that BNF generators for sublime-syntax (primary (Rust), alternative (Python)) have been around for a little while, I suspect the number of branch_point-requiring syntaxes is going to jump.

varungandhi-src commented 2 years ago

I did a quick search for grammars that use a branch: or fail: construct. The results on the master branch of SublimeHQ/Packages (0cee68b3d87) are:

Batch File/Batch File.sublime-syntax
C#/C#.sublime-syntax
Git Formats/Git Commit.sublime-syntax
Java/Java.sublime-syntax
JavaScript/JSX.sublime-syntax
JavaScript/JavaScript.sublime-syntax
JavaScript/TSX.sublime-syntax
JavaScript/TypeScript.sublime-syntax
Lua/Lua.sublime-syntax
Markdown/Markdown.sublime-syntax
PHP/PHP Source.sublime-syntax
Python/Python.sublime-syntax
SQL/SQL.sublime-syntax
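
The exact search used for this list isn't stated. One possible way to reproduce something similar, sketched with only the Rust standard library, is shown below. The function names are made up for illustration, and the check is a plain text match on keys rather than a real YAML parse, so it may over- or under-report.

```rust
use std::{
    fs, io,
    path::{Path, PathBuf},
};

/// Text-level check: does any line start with a `branch:` or `fail:` key?
/// `branch_point:` on its own does not count.
fn uses_branching(source: &str) -> bool {
    source.lines().any(|line| {
        // Drop indentation and a leading YAML list marker, if present.
        let key = line.trim_start().trim_start_matches("- ");
        key.starts_with("branch:") || key.starts_with("fail:")
    })
}

/// Walk `root` recursively and collect matching .sublime-syntax files.
fn find_branching_syntaxes(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                stack.push(path);
            } else if path.extension().map_or(false, |ext| ext == "sublime-syntax")
                && uses_branching(&fs::read_to_string(&path)?)
            {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}
```

Calling `find_branching_syntaxes` on a checkout of the Packages repository would return the matching paths in sorted order.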

jrappen commented 2 years ago

... grammars which are using branch: ...

also add these pending re-writes from PRs: