Open Tronic opened 5 years ago
Monarch was initially contributed by Daan Leijen, and I am just maintaining it at this point, but it looks like our YAML Monarch parser manages to handle the indentation change with this kind of trick:
The multiStringContinued.$1 means that the state is entered and the matched text is passed in as an argument to that state, which is then available as $S2.
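A minimal sketch of the pattern described above, assuming Monarch's `cases` guards and its `$1` / `$S2` substitutions behave as its documentation describes. The state names mirror the comment, but the rules themselves are illustrative and are not copied from the shipped YAML grammar:

```typescript
// Illustrative fragment of a Monarch `tokenizer` section.
// This is an assumption-laden sketch, not the shipped YAML grammar.
const tokenizerFragment = {
  // First line of a block: capture its indentation as $1 and pass it
  // to the next state as part of the state name.
  multiString: [
    [/^( +).+$/, "string", "@multiStringContinued.$1"],
  ],

  // Later lines: $S2 is the indentation that was passed in above.
  multiStringContinued: [
    [
      /^( *).+$/,
      {
        cases: {
          // Same indentation as the first line: still inside the block.
          "$1==$S2": "string",
          // Anything else: leave the block and re-scan this line.
          "@default": { token: "@rematch", next: "@popall" },
        },
      },
    ],
  ],
};
```

Because the guard compares for equality, a line indented more deeply than the first line would also end the block in this sketch.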
The TextMate format recently added something similar that is now actually used for YAML and Nim in VSCode. Still, a solution without such a workaround would be preferred.
@Tronic I also maintain vscode-textmate, the TextMate grammar engine that runs in VS Code, and I'm not 100% sure what you are referring to: what has TextMate recently added that is now used for YAML in VS Code?
VS Code uses the YAML grammar from https://github.com/textmate/yaml.tmbundle and that hasn't had any commits since 2017.
Around the time when I reported this there was a discussion about it that I am no longer able to find.
IIRC, VSCode did not correctly highlight YAML text blocks but there already was a working implementation elsewhere and some time later this year VSCode was updated to handle it properly. Possibly VSCode until then used a module maintained by some other project that has now deleted itself.
In any case, the feature I was referring to was matching indentation with regex groups captured from the initial line, which is not mentioned in the TextMate specs but appears to work in the current implementation.
being able to use a previously captured indentation whitespace string in the end regex or otherwise to determine whether the block needs to be popped
There is a way to do this: you can take the state value that was passed in with the indentation level you want (as @alexdima suggested) and use it in a case comparison on a match for leading whitespace. Something like this:
[ /[ \t\r\n]+/, { cases: { '~$S2 *': { token: 'white' }, '@default': { token: '@rematch', next: '@popall' } } } ]
This still doesn't fix it but provides you with more flexibility to play around with indentation levels than directly comparing the previous indentation, like the current yaml implementation does. I hope this might help!
I am working on regular expressions to identify multi-line strings that are not enclosed in backticks or quotes. I need to identify the multi-line string based on indentation.
Ex: object objectname property1: propertyvalue1 source = **let Source = Sql.Database(S dbo_FactCurrencyR
in
#"Renamedghvhv C
#"Renamed Columns"**
property3: propertyvalue3
I referred to https://github.com/microsoft/monaco-languages/blob/0ed9a6c3e90a24375fab54f7205fb76ce992f117/src/yaml/yaml.ts#L159-L173. There, lines are treated as part of the multi-line string only if they all have the same indentation, but in my example the lines with greater indentation are also part of the multi-line string. How can this be achieved?
@alexdima , @henriquetmm
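One way to think about the requirement is as a prefix test rather than an equality test: a line continues the block as long as it starts with the indentation captured from the first block line. The sketch below shows that logic in plain TypeScript, outside of Monarch. `collectIndentedBlock` is a hypothetical helper, not part of Monarch or vscode-textmate, and it assumes the first block line is indented more deeply than the line that introduces it:

```typescript
// Hypothetical helper, not part of Monarch or vscode-textmate.
// Returns the lines of an indented block whose first content line is lines[start].
function collectIndentedBlock(lines: string[], start: number): string[] {
  const first = lines[start] ?? "";
  // Capture the leading whitespace of the first block line.
  const indent = /^[ \t]*/.exec(first)![0];
  const block: string[] = [];

  for (let i = start; i < lines.length; i++) {
    const line = lines[i];
    // Blank lines stay in the block. Lines with equal or deeper
    // indentation also start with `indent`, so they stay as well.
    if (line.trim() === "" || line.startsWith(indent)) {
      block.push(line);
    } else {
      break; // indentation decreased: the block is over
    }
  }
  return block;
}
```

In Monarch terms this is the same idea as the `'~$S2 *'` guard suggested earlier in the thread, which accepts the stored indentation followed by any number of extra spaces. Whether that guard fits a particular grammar has not been verified here.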
A few languages use indented blocks for structure, and thus don't always have end markers that could be directly matched against. In particular, YAML presents multi-line string literals (scalars in YAML terminology) in this way. The problem is that neither TextMate nor Monarch appears to support matching a change of indent, and consequently YAML's handling is quite broken in most editors. (Well, at least GitHub gets it right.)
This could be solved primarily by
Unless I've overlooked something, the current best solution is generating about 40 duplicate rules with all typically used indentation widths in their begin/end regular expressions.
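To make the duplicated-rules workaround concrete, here is a rough sketch of generating one begin/end pair per indentation width instead of writing them by hand. `BeginEndRule`, `blockScalarRules`, the scope name, and the regular expressions are all assumptions made for illustration. They are not taken from any existing grammar, and they ignore most of YAML's real block scalar syntax:

```typescript
// Hypothetical shape of a TextMate-style begin/end rule.
interface BeginEndRule {
  name: string;
  begin: string;
  end: string;
}

// Generate one rule per indentation width from 0 to maxIndent spaces.
function blockScalarRules(maxIndent: number): BeginEndRule[] {
  const rules: BeginEndRule[] = [];
  for (let n = 0; n <= maxIndent; n++) {
    rules.push({
      name: "string.unquoted.block.example", // placeholder scope name
      // Header line indented by exactly n spaces and ending in | or >.
      begin: `^ {${n}}(?! ).*[|>]\\s*$`,
      // End before the first non-blank line that is not indented deeper than n.
      end: `^(?! {${n + 1}}|\\s*$)`,
    });
  }
  return rules;
}

// 40 rules, covering indentation widths 0 through 39.
const generatedRules = blockScalarRules(39);
```

The width is baked into each pattern, which is exactly why the approach needs so many near-identical rules.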