godotengine / godot-vscode-plugin

Godot development tools for VSCode
MIT License
1.53k stars 151 forks source link

Grammar contains several invalid regex with lookbehind assertions that are not a fixed length #415

Closed lildude closed 1 year ago

lildude commented 2 years ago

Godot version

N/A

VS Code version

N/A

Godot Tools VS Code extension version

master

System information

N/A

Issue description

👋 I'm the lead maintainer of the https://github.com/github/linguist library which is used for language detection and providing the syntax highlighting for languages on GitHub.com, and we use this grammar.

Our grammar compiler has found several problems with the grammars which I thought I'd let you know about:

Steps to reproduce

View each of the regexes in turn in regex101 and note the errors:

Calinou commented 2 years ago

cc @DaelonSuzuka

PS: I didn't know this repository was used as a basis for Linguist. This is good to know, as we'll need to mention this in the contributing guidelines since it has a few implications.

lildude commented 2 years ago

We've been using it for GDScript since 2020 thanks to https://github.com/github/linguist/pull/5000 😁

DaelonSuzuka commented 2 years ago

Thanks for reporting this. I'll try to look into these issues this weekend.

@lildude Is there documentation somewhere for how I can run your grammar compiler myself, so I can try and catch issues sooner?

lildude commented 2 years ago

@lildude Is there documentation somewhere for how I can run your grammar compiler myself, so I can try and catch issues sooner?

We don't have any docs other than the CONTRIBUTING.md for adding/replacing grammars and the steps I take when putting together a release.

The compiler doesn't have functionality to check just a specific grammar, but you can effectively do this by attempting to add it. This will attempt to compile it and report any problems. To do this, clone the repo and run script/bootstrap to download and all the grammars. You can then check an individual grammar using: script/grammar-compiler add vendor/grammars/godot-vscode-plugin/:

$ script/grammar-compiler add vendor/grammars/godot-vscode-plugin/
latest: Pulling from linguist/grammar-compiler
Digest: sha256:203514bbf04b83d3594245338fc08097301e54139ccec20ab05861302c23ab64
Status: Image is up to date for linguist/grammar-compiler:latest
docker.io/linguist/grammar-compiler:latest
3 errors found in new grammar 'repository `vendor/grammars/godot-vscode-plugin/` (from https://github.com/godotengine/godot-vscode-plugin)':
- Invalid regex in grammar: `source.gdscript` (in `syntaxes/GDScript.tmLanguage.json`) contains a malformed regex (regex "`\b(?<=for\s[\w]*\s)(in)\b`": lookbehind assertion is not fixed length (at offset 18))
- Invalid regex in grammar: `source.gdshader` (in `syntaxes/GDShader.tmLanguage.json`) contains a malformed regex (regex "`(?<=[.]\s*)[a-zA-Z_]\w`": lookbehind assertion is not fixed length (at offset 10))
- Invalid regex in grammar: `source.gdshader` (in `syntaxes/GDShader.tmLanguage.json`) contains a malformed regex (regex "`(?<=[.]\s*)(?:[xyzw]{2,4}|[rgba]`...": lookbehind assertion is not fixed length (at offset 10))

Compilation failed. Aborting

$

The compiler is really simple and can be found in the github/linguist repo here. It basically checks the grammar is using valid PCRE regexes (GitHub uses PCRE instead of oniguruma for performance reasons) and then "compiles" them into JSON or protobuf files.

DaelonSuzuka commented 2 years ago

That's fantastic, thanks for the write-up. When I get these fixes merged, I'll comment in github/linguist#3924 to let you know.

DaelonSuzuka commented 2 years ago

To do this, clone the repo and run script/bootstrap to download and all the grammars. You can then check an individual grammar using: script/grammar-compiler add vendor/grammars/godot-vscode-plugin/:

It might be worth mentioning to install all of linguist's dependencies, including bundler, before trying to run the bootstrap script. I'm dumb enough that it took me about ten minutes to figure that out.

DaelonSuzuka commented 1 year ago

This issue is resolved by PR #416, but I don't believe everything has been propagated through Linguist to Github itself, yet.

@lildude should I create an issue over at linguist to try and track this?

lildude commented 1 year ago

No. Please don't. It'll happen when it happens as per the docs here and here.