Open dead-claudia opened 7 years ago
Thought about this for a while and I think I have a possible solution: adding a new key, something like alwaysMatchEndPattern
which is false
by default. When true, it will always match the end pattern, even if it would be otherwise matched by another pattern.
It's ironic that most of the time, we complain about not having enough context when developing grammars, but in this case, we're hit by the problem of knowing too much context.
@50Wliu
Thought about this for a while and I think I have a possible solution: adding a new key, something like allowsMatchEndPattern which is false by default. When true, it will always match the end pattern, even if it would be otherwise matched by another pattern.
SGTM. Is my understanding correct in that this would be executed before the child grammar is run, so the child grammar can be run through just a predetermined slice?
It's ironic that most of the time, we complain about not having enough context when developing grammars, but in this case, we're hit by the problem of knowing too much context.
Yeah...really unusual. :laughing:
Yeah, at least I hope so. I haven't looked at the implementation to see if it can be done reasonably though.
I am curious if this issue isn't better specified as issues with the offending grammars.
This issue can also be reproduced using a match
in the delegated grammar, instead of a begin
and end
, see: https://github.com/burodepeper/language-markdown/issues/77#issuecomment-272770554 (this is similar to the optional parameter issue reported with JavaScript)
The offending grammar in this case is language-yaml
which has the following rule (https://github.com/atom/language-yaml/blob/master/grammars/yaml.cson#L299-L302) which matches, well, unquoted strings. I am not too familiar with YAML, and there is probably a use for it, but it seems like a tricky thing altogether if you ever want to be able to 'escape' the YAML grammar again.
I can understand the replications with comments, and while they serve their purpose to showcase the issue, I think they're valid situations, because they do what they're supposed to do; the broken syntax highlighting shows that there's an error in your code, and above all, something that's easily fixable. Syntax highlighting is supposed to help you with this.
The issue with the optional parameter in JavaScript embedded in Markdown is a lot worse. So is the issue with YAML. It is valid code that appears to have errors. And I might be naive, especially since there are most likely a lot more of these issues, but I think they have to be fixed in their respective grammars.
the broken syntax highlighting shows that there's an error in your code, and above all, something that's easily fixable.
The problem is that it isn't invalid code.
<script>console.log('hi!'); //</script>
is perfectly valid HTML, for example. HTML will always end the script at a closing tag, even if it would otherwise be commented by JavaScript.
And for Markdown...
```js console.log('hi!'); /* ```
is also valid, because GFM always ends at the triple backticks, much like HTML.
@50Wliu You are probably right, and now I understand your alwaysMatchEndPattern
solution. Thanks for clearing that up.
Status?
No progress. Implementing this is most likely way beyond the scope of my knowledge, and the rest of the team is busy working on other initiatives, such as faster startup time and improved editor rendering.
Making progress. It's been slow since I've had to console log my way through every step, but here's where I've gotten to so far. Note that I'll be using the following example (language-gfm + language-javascript):
```js
/*
``` <-- THIS LINE
findNextMatch
starts the process to, well, find the next match: https://github.com/atom/first-mate/blob/e06b550732b30e7d50991c6cab87b238a7282eeb/src/rule.coffee#L55getScanner
uses the base grammar's included patterns: https://github.com/atom/first-mate/blob/e06b550732b30e7d50991c6cab87b238a7282eeb/src/rule.coffee#L36. This is where the problem occurs. language-gfm's only included pattern at this point is the ending */
from language-javascript.findNextMatch
proceeds to try to find a */
match and has no idea that language-gfm is also looking for ```
.So it looks like I might know enough now to try implementing the alwaysMatchEndPattern
option.
(Also, the reason it took me so long to get here was because I initially missed the getScanner
step, since it seems fairly innocuous, and spent way too long console logging the internals of scanner.coffee
before realizing my mistake)
Progress:
Unfortunately as you can probably see we aren't returned to the language-gfm grammar. At least we broke out of the comment though.
Super ugly implementation right now though. I'll try to clean up some of the logic and publish a PR so more experienced people can take a look at it.
Took a lot of persistence after implementing alwaysMatchEndPattern
because of the way the tags are matched to the scopes. Also didn't help that the assert mechanism for reporting inconsistent tags throws on my computer...
So sorry to throw a wrench into this, but a fix for this should probably be extended to also cover the scenario of isolating child grammars, so their $self
doesn't reference the parent grammar. Filed https://github.com/atom/first-mate/issues/112 separately, but it might belong in this bug instead.
Description
There is no way to scope a grammar capture for specific grammar rules, leading to very odd, buggy-looking syntax highlights.
Steps to Reproduce
Set the language to any that has a rule that features both of the following.
language-gfm
andlanguage-javascript
both fulfill this requirement in their grammars.begin
andend
capturepattern
that delegates to another language that also has a rule containing both abegin
andend
capture. (It doesn't have to delegate, though.)Create something that matches that rule. For GFM, a code block tagged with
css
would suffice, and for JS, a tagged template starting withhtml
would work.Within that block, type something that would match the
begin
half of any rule in the other language that includes both abegin
andend
. Do not include the end. For GFM or JS, just start a comment in CSS or HTML, respectively.Observe very odd syntax highlighting. It looks like a bug because it can't be contained, and repros 100% of the time.
If you followed the above repro, the code might look like this pre-syntax highlighting:
GFM -> CSS
JS -> HTML
Here's a few screenshots of other code.
language-gfm
->language-javascript
:language-javascript
->language-html
Versions
OS: Ubuntu GNOME 14.04 (although it's been repro'd in Windows by others) Atom: v1.12.5 (been going on for quite a while)
Additional Information
Initially reported in atom/language-gfm#171, but I filed it here because it's language-agnostic.