tmbundle: Make tmLanguage compatible with PCRE2

lildude commented 2 years ago

👋 I'm the lead maintainer of the https://github.com/github/linguist library which is used for language detection and providing the syntax highlighting for languages on GitHub.com, and we use this grammar.

Our grammar compiler has found several problems with your grammar which I thought I'd let you know about.

These regexes have quite a few problems as you can see in the regex101 link after each:

https://github.com/marko-js/marko-tmbundle/blob/60ded48bea6d6eccea81e858168e20e0b1b78b01/syntaxes/marko.tmLanguage.json#L779

https://regex101.com/r/pSG73T/1

https://github.com/marko-js/marko-tmbundle/blob/60ded48bea6d6eccea81e858168e20e0b1b78b01/syntaxes/marko.tmLanguage.json#L108

... and repeated again at:

https://github.com/marko-js/marko-tmbundle/blob/60ded48bea6d6eccea81e858168e20e0b1b78b01/syntaxes/marko.tmLanguage.json#L130

https://regex101.com/r/NlVs41/1

These are the errors our compiler reported:

Invalid regex in grammar: text.marko (in syntaxes/marko.tmLanguage.json) contains a malformed regex (regex "(?=[,;\](]|/>|(?<=[^=])>|(?<!(?:...": nothing to repeat (at offset 105))
Invalid regex in grammar: text.marko (in syntaxes/marko.tmLanguage.json) contains a malformed regex (regex "(?=[,;\]]|/>|(?<=[^=])>|(?<!(?:^...": nothing to repeat (at offset 104))
Invalid regex in grammar: text.marko (in syntaxes/marko.tmLanguage.json) contains a malformed regex (regex "(?=[,;\]]|/>|(?<=[^=])>|(?<!(?:^...": nothing to repeat (at offset 104))

DylanPiercey commented 2 years ago

Will look shortly. Thanks for the report!

DylanPiercey commented 2 years ago

@lildude from what I can tell the regex functions correctly. The validator you are using is using PCRE2 however my understanding was that the regex's in tmgrammars are intended to be handled by oniguruma. Is this not the case for linguist?

lildude commented 2 years ago

Is this not the case for linguist?

No. GitHub uses PCRE for grammar parsing for performance reasons.

marko-js / marko-tmbundle

tmbundle: Make tmLanguage compatible with PCRE2 #19