Closed dawsonc623 closed 1 year ago
I agree. I believe what's happening here is that the while is eating all the characters (e.g. #3 in the stuff below). (I'm just a random grammar/syntax maker, BTW.) That said, I think it is still possible to nest them.
Here's a copy-paste of my personal documentation on while, which I think partly confirms what you saw in the source code.
The TextMate "while" key has almost no documentation. I'm writing this to explain what little I know about it.
The "while" key is stronger than the "end" pattern: as soon as the while is over, the rule stops and, most importantly, it cuts off any ranges that are still open. This is incredibly important because almost nothing else in TextMate does this, and it is useful for stopping broken syntax.
I believe it was designed to match things like the Python indentation-based block.
However, there are some caveats.
(end)
I have no idea if any of that is intended behavior. That said, I think your issue is that the first while consumes all the text on the line, so there's nothing left for the included patterns to even match against.
I would guess that changing it to match just the whitespace of one indent level, and then including bar as a regular include (rather than a whileCapture), would fix it. I don't think I've ever gotten the while captures to work. I usually just use a lookahead in the while, and then include a pattern that does the actual matching.
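To make the "nothing left on the line" point concrete, here is a minimal sketch using Python's `re` module instead of the regex engine TextMate grammars run on. The two patterns, `parent_indent`, the sample line, and the helper `remainder_after` are all hypothetical and only illustrate how much of the line each style of while would consume.

```python
import re

# Hypothetical values: the parent rule began at two spaces of indentation,
# and the next line is a more deeply indented child.
parent_indent = "  "
line = "    bar"

# A "while" that matches the whole line vs. one that matches only the
# parent's indentation followed by a lookahead for more whitespace.
whole_line_while = re.compile(r"\s+.*")
indent_only_while = re.compile(re.escape(parent_indent) + r"(?=\s)")


def remainder_after(pattern, text):
    """Return the text left over once `pattern` has matched at the start, or None."""
    m = pattern.match(text)
    return None if m is None else text[m.end():]


print(repr(remainder_after(whole_line_while, line)))
print(repr(remainder_after(indent_only_while, line)))
```

The first call leaves an empty string, so an included pattern would have nothing to match. The second leaves `"  bar"`, which a nested begin pattern can still see.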
Thanks for the response. Interestingly, Python does not use `while` for this case (at least not the built-in support). Only a handful of built-ins use `while` at all, and Markdown is the only one that does extensively (apart from `searchResult`, which I assume is, well, related to the search feature).
Anyway, I went with capturing the full line and passing it into the `whileCaptures` because generally how I have seen that work in other constructs (say, `match`/`captures`) is that the `include`s can process the incoming text as they need. That said, some of your thoughts triggered a different line of thinking, and I was able to make some adjustments to the test grammar so that it matches the test input as I would expect. I will attempt to port that over to my real grammar and report back if it works.
So, it worked on my full grammar (well, mostly, but I think where it is acting up is unrelated to this). I extended the test grammar and input to cover more nesting and cyclical cases, too, which worked.
The fix was indeed to step back from the `whileCaptures` approach and instead have `while` match just the captured level of indentation from the `begin` and a look-ahead to see if the next character was another whitespace. Then, putting `patterns` at the top level worked. The extended example:
Grammar:
{
  "name": "Foo",
  "scopeName": "source.foo",
  "patterns": [
    { "include": "#foo" }
  ],
  "repository": {
    "foo": {
      "name": "meta.foo",
      "begin": "(\\s*)foo.*",
      "while": "\\1(?=\\s)",
      "patterns": [{ "include": "#bar" }, { "include": "#hmm" }]
    },
    "bar": {
      "name": "meta.bar",
      "begin": "(\\s*)bar.*",
      "while": "\\1(?=\\s)",
      "patterns": [{ "include": "#baz" }]
    },
    "baz": {
      "name": "meta.baz",
      "begin": "(\\s*)baz.*",
      "while": "\\1(?=\\s)",
      "patterns": [{ "include": "#foo" }, { "include": "#hmm" }]
    },
    "hmm": {
      "name": "meta.hmm",
      "begin": "(\\s*)hmm.*",
      "while": "\\1(?=\\s)"
    }
  }
}
Input:
foo
bar
baz
foo
bar
baz
hmm
baz
baz
hmm
hmm
hmm
bar
baz
baz
biz
foo
hmm
The token inspector confirmed this works as I would expect, so regardless of what is going on with my full grammar, I think the original issue was more a matter of my understanding of `whileCaptures` (or apparently lack thereof) than of the overall nesting concept. Because of that, I am closing this issue under the assumption that the wonkiness in my full grammar is not quite this issue either.
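For readers who want to see why the indentation-plus-lookahead approach nests cleanly, below is a toy Python model of the grammar above. It is a sketch under stated assumptions, not how vscode-textmate is implemented: every pattern is tried only at the current position, no trailing newline is appended to the line, and the sample lines are made up (the indentation of the original test input did not survive in this thread). The names `INCLUDES`, `TOP_LEVEL`, and `tokenize` are hypothetical.

```python
import re

# Hypothetical mirror of the grammar above: rule name -> rules it includes.
INCLUDES = {
    "foo": ["bar", "hmm"],
    "bar": ["baz"],
    "baz": ["foo", "hmm"],
    "hmm": [],
}
TOP_LEVEL = ["foo"]


def tokenize(lines):
    """Return, for each line, the scopes open once that line is processed."""
    stack = []  # entries: (rule name, compiled "while" regex)
    result = []
    for line in lines:
        pos = 0
        # Check "while" conditions from the outermost rule inward; each
        # successful match consumes its own share of the indentation.
        for depth, (_, while_re) in enumerate(stack):
            m = while_re.match(line, pos)
            if m is None:
                # This rule and everything nested inside it are closed.
                del stack[depth:]
                break
            pos = m.end()
        # Try the "begin" patterns included by the innermost open rule.
        candidates = INCLUDES[stack[-1][0]] if stack else TOP_LEVEL
        for name in candidates:
            m = re.compile(r"(\s*)" + name + r".*").match(line, pos)
            if m:
                # Substitute the begin capture for \1 in "\1(?=\s)".
                while_re = re.compile(re.escape(m.group(1)) + r"(?=\s)")
                stack.append((name, while_re))
                break
        result.append(["meta." + name for name, _ in stack])
    return result


# Made-up sample input, not the input from the issue.
sample = ["foo", "  bar", "    baz", "      hmm", "  bar", "biz", "foo", "  hmm"]
for text, scopes in zip(sample, tokenize(sample)):
    print(f"{text!r:14} {scopes}")
```

In this model the second `  bar` line fails the first `bar`'s while, which drops `bar`, `baz`, and `hmm` at once before a new `bar` begins, and the unindented `biz` line ends up with no scopes at all.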
Given the following grammar and input, I am finding an issue where the top-level `begin`/`while` rules apply as expected (the `foo` set, in the example), but the nested `while` ones do not seem to work. The second level's `begin` (`bar` in the example) works, as confirmed by the token inspector, but its `while` matches do not seem to apply, given that none of the applicable lines are given the provided scope (`meta.bar` in the example), nor is the next level (`baz` in the example) ever applied (not even the `begin` applies).
Grammar:
Input:
Note, my end goal is to do processing on a whitespace-significant language where some rules only apply to lines nested "within" certain sections. In the example, `bar`'s rules are only applicable within `foo` sections. A `foo` section is triggered by a line whose first non-whitespace characters amount to `foo`, and only the following lines that contain the amount of whitespace preceding `foo` on the `begin` line plus at least one more are considered part of that `foo` section (ergo, the `biz` line is "nothing" in the given input and grammar). The same nesting relationship exists between `bar` and `baz`, and theoretically more nesting relationships (including cyclical ones; `foo` inside of `baz`, for example) could exist ad infinitum.
Intuitively, I would expect the `while` rules in `bar` to apply, but based on my testing and what I think I found in the source code (admittedly I only spent about an hour actually producing this example and looking at the source code), it seems `foo`'s apply first and seem to "eat" the line without giving it over to `bar` for its own continuation. Indeed, the way the rules stack is built and applied in `src/grammar/tokenizeString.ts` starting on line 343 looks to be designed to "reverse" the stack and apply each level fully before moving on (based both on the code and the comment above the function at line 331).
If supporting nested `begin`/`while` rules is desired, then depending on Microsoft's prioritization of it and however external open source contributions are handled, I am willing to take a look at the issue myself, as I am blocked by it currently.