Closed lildude closed 1 year ago
cc @DaelonSuzuka
PS: I didn't know this repository was used as a basis for Linguist. This is good to know, as we'll need to mention this in the contributing guidelines since it has a few implications.
We've been using it for GDScript since 2020 thanks to https://github.com/github/linguist/pull/5000 😁
Thanks for reporting this. I'll try to look into these issues this weekend.
@lildude Is there documentation somewhere for how I can run your grammar compiler myself, so I can try and catch issues sooner?
@lildude Is there documentation somewhere for how I can run your grammar compiler myself, so I can try and catch issues sooner?
We don't have any docs other than the CONTRIBUTING.md for adding/replacing grammars and the steps I take when putting together a release.
The compiler doesn't have functionality to check just a specific grammar, but you can effectively do this by attempting to add it. This will attempt to compile it and report any problems. To do this, clone the repo and run script/bootstrap
to download and all the grammars. You can then check an individual grammar using: script/grammar-compiler add vendor/grammars/godot-vscode-plugin/
:
$ script/grammar-compiler add vendor/grammars/godot-vscode-plugin/
latest: Pulling from linguist/grammar-compiler
Digest: sha256:203514bbf04b83d3594245338fc08097301e54139ccec20ab05861302c23ab64
Status: Image is up to date for linguist/grammar-compiler:latest
docker.io/linguist/grammar-compiler:latest
3 errors found in new grammar 'repository `vendor/grammars/godot-vscode-plugin/` (from https://github.com/godotengine/godot-vscode-plugin)':
- Invalid regex in grammar: `source.gdscript` (in `syntaxes/GDScript.tmLanguage.json`) contains a malformed regex (regex "`\b(?<=for\s[\w]*\s)(in)\b`": lookbehind assertion is not fixed length (at offset 18))
- Invalid regex in grammar: `source.gdshader` (in `syntaxes/GDShader.tmLanguage.json`) contains a malformed regex (regex "`(?<=[.]\s*)[a-zA-Z_]\w`": lookbehind assertion is not fixed length (at offset 10))
- Invalid regex in grammar: `source.gdshader` (in `syntaxes/GDShader.tmLanguage.json`) contains a malformed regex (regex "`(?<=[.]\s*)(?:[xyzw]{2,4}|[rgba]`...": lookbehind assertion is not fixed length (at offset 10))
Compilation failed. Aborting
$
The compiler is really simple and can be found in the github/linguist repo here. It basically checks the grammar is using valid PCRE regexes (GitHub uses PCRE instead of oniguruma for performance reasons) and then "compiles" them into JSON or protobuf files.
That's fantastic, thanks for the write-up. When I get these fixes merged, I'll comment in github/linguist#3924 to let you know.
To do this, clone the repo and run script/bootstrap to download and all the grammars. You can then check an individual grammar using: script/grammar-compiler add vendor/grammars/godot-vscode-plugin/:
It might be worth mentioning to install all of linguist's dependencies, including bundler
, before trying to run the bootstrap script. I'm dumb enough that it took me about ten minutes to figure that out.
This issue is resolved by PR #416, but I don't believe everything has been propagated through Linguist to Github itself, yet.
@lildude should I create an issue over at linguist to try and track this?
Godot version
N/A
VS Code version
N/A
Godot Tools VS Code extension version
master
System information
N/A
Issue description
👋 I'm the lead maintainer of the https://github.com/github/linguist library which is used for language detection and providing the syntax highlighting for languages on GitHub.com, and we use this grammar.
Our grammar compiler has found several problems with the grammars which I thought I'd let you know about:
source.gdscript
(insyntaxes/GDScript.tmLanguage.json
) contains a malformed regex (regex "\b(?<=for\s[\w]*\s)(in)\b
": lookbehind assertion is not fixed length (at offset 18))source.gdshader
(insyntaxes/GDShader.tmLanguage.json
) contains a malformed regex (regex "(?<=[.]\s*)(?:[xyzw]{2,4}|[rgba]
...": lookbehind assertion is not fixed length (at offset 10))source.gdshader
(insyntaxes/GDShader.tmLanguage.json
) contains a malformed regex (regex "(?<=[.]\s*)[a-zA-Z_]\w
": lookbehind assertion is not fixed length (at offset 10))Steps to reproduce
View each of the regexes in turn in regex101 and note the errors:
https://github.com/godotengine/godot-vscode-plugin/blob/e7674c12adef55d13d17ee66b057a3f1789a2102/syntaxes/GDScript.tmLanguage.json#L298
https://regex101.com/r/TFZmub/1
https://github.com/godotengine/godot-vscode-plugin/blob/e7674c12adef55d13d17ee66b057a3f1789a2102/syntaxes/GDShader.tmLanguage.json#L345
https://regex101.com/r/JX03EY/1
https://github.com/godotengine/godot-vscode-plugin/blob/e7674c12adef55d13d17ee66b057a3f1789a2102/syntaxes/GDShader.tmLanguage.json#L349
https://regex101.com/r/2Zpu1I/1