Add support for contentRegex

atom / first-mate

TextMate helpers

http://atom.github.io/first-mate

MIT License

91 stars 57 forks source link

Add support for contentRegex #109

Closed Aerijo closed 5 years ago

Aerijo commented 6 years ago

Tree-sitter grammars appear to have a new property contentRegex. Support for this is already in Atom, all that's required for TextMate to benefit from it is to add in when constructing the grammar.

In particular, this will help differentiate similar grammars such as LaTeX & LaTeX Beamer. In this case, both share the same file extension, and both can potentially not match the first line regex. It is difficult / impossible to influence the score in a way that works in all cases.

With the change, the solution is just add contentRegex to Beamer. If it matches, it gets a edge over LaTeX. If it fails, it gets a penalty and LaTeX consistently gets applied.

Alternatively, Atom could be changed to give a penalty when the firstLineMatch fails, but this may have unwanted consequences.

lee-dohm commented 6 years ago

@maxbrunsfeld Would you take a look here since you're the expert on the contextRegex thing?

maxbrunsfeld commented 6 years ago

With Tree-sitter grammars, the contentRegex is just a javascript regex, not an oniguruma regex. I don't have super strong opinions on how it should behave with the TextMate grammars, but as written, this probably won't quite work because oniguruma regexes have a different API than regular regexes (.test vs .testSync).

Aerijo commented 6 years ago

@maxbrunsfeld Would it work if Atom pulled it out directly, and either stored it or added it to the constructed grammar?

lee-dohm commented 6 years ago

I'm not sure that this is going to translate correctly. Perhaps we should hold off on this for now?

lee-dohm commented 6 years ago

@maxbrunsfeld Would you state what changes are necessary here?

maxbrunsfeld commented 6 years ago

Options I see:

Merge this PR as-is. This will require some changes in Atom when we upgrade first-mate in Atom, because Atom expects contentRegex to be a RegExp (not an OnigRegExp). Specifically, Atom would need to have different logic around contentRegex depending on whether the grammar is a TreeSitterGrammar or a FirstMate.Grammar.
Change this code to instantiate a RegExp instead of an OnigRegExp. This would be simpler on Atom's side, but is maybe a bit inconsistent, because all of the other regexes in first-mate are oniguruma regexes?

Aerijo commented 5 years ago

@maxbrunsfeld I submitted a PR to Atom that will handle it as an Oniguruma regex. At first I was happily surprised it had a method named test, but unfortunately it was async so changes were required.

As noted there, it depends on this PR to add the contentRegex property to the grammar, and would therefore need to have the first-mate version bumped.

ashthespy commented 5 years ago

@Aerijo @maxbrunsfeld Any thing preventing this from being merged?

Aerijo commented 5 years ago

@ashthespy Don't think so, besides making sure the changes are coordinated and the tests pass on the other PR.

nathansobo commented 5 years ago

Thanks very much!

nathansobo commented 5 years ago

Published as first-mate@7.2.0.