Closed glebec closed 6 years ago
Examples of patterns I tried which do not seem to work (again, substitute ◆ with backtick):
◆◆◆(re|reason|reasonml)(\\s+[^`~]*)?$
\\G(re|reason|reasonml)(\\s+[^`~]*)?$
(?<=$\\W*)(re|reason|reasonml)(\\s+[^`~]*)?$
…and some other stuff too, but that was flawed a priori; I was just experimenting to try to figure out the scope of the pattern matching.
Ah, I was close with the thought to use a lookbehind! Thank you @mjbvz.
I pushed a fix for this that uses a lookbehind as you note.
It's not perfect since it will still match:
```
`js
bla bla bla
```
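A rough sketch of why a lookbehind on its own cannot rule this case out. The pattern below is an assumption about the general shape of such a fix (it is not copied from the repo), and Python's `re` module stands in for the Oniguruma engine VS Code uses, so treat it as an illustration only:

```python
import re

# Hypothetical lookbehind-based pattern: the language id only has to be
# directly preceded by ONE backtick or tilde. (Assumed for illustration.)
LOOKBEHIND = re.compile(r"(?<=[`~])(js)(\s+[^`~]*)?$")


def lookbehind_matches(line: str) -> bool:
    """True if the hypothetical lookbehind pattern matches the line."""
    return LOOKBEHIND.search(line) is not None


print(lookbehind_matches("```js"))   # a real fence line
print(lookbehind_matches("`js"))     # a single backtick is accepted too
print(lookbehind_matches("bla js"))  # no backtick before the id
```

A fixed-width lookbehind can only inspect the character directly before the identifier, so it cannot tell one backtick from three.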
To avoid that problem, I believe you need to inject the grammar into the top-level Markdown grammar instead of into the fenced code block rule. This is a bit more complicated since you also have to handle tokenizing the fenced code block start and end markers. Here it is for reference:
{
  "fileTypes": [],
  "injectionSelector": "L:text.html.markdown",
  "patterns": [
    {
      "include": "#superjs-code-block"
    }
  ],
  "repository": {
    "superjs-code-block": {
      "begin": "(^|\\G)(\\s*)(\\`{3,}|~{3,})\\s*(?i:(superjs)(\\s+[^`~]*)?$)",
      "name": "markup.fenced_code.block.markdown",
      "end": "(^|\\G)(\\2|\\s{0,3})(\\3)\\s*$",
      "beginCaptures": {
        "3": {
          "name": "punctuation.definition.markdown"
        },
        "4": {
          "name": "fenced_code.block.language"
        },
        "5": {
          "name": "fenced_code.block.language.attributes"
        }
      },
      "endCaptures": {
        "3": {
          "name": "punctuation.definition.markdown"
        }
      },
      "patterns": [
        {
          "begin": "(^|\\G)(\\s*)(.*)",
          "while": "(^|\\G)(?!\\s*([`~]{3,})\\s*$)",
          "contentName": "meta.embedded.block.superjs",
          "patterns": [
            {
              "include": "source.js"
            }
          ]
        }
      ]
    }
  },
  "scopeName": "markdown.superjs.codeblock"
}
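To make the `begin`/`end` pair easier to reason about, here is a minimal Python sketch of how the two patterns behave on individual lines. It is an approximation, not how VS Code tokenizes: Python's `re` has no `\G`, so that alternative is dropped (group 1 is kept as `(^)` so the group numbers line up), and the `\2`/`\3` back-references in `end` are emulated by substituting the text captured by `begin`. The helper names `end_pattern` and `find_block` are hypothetical.

```python
import re

# Begin pattern from the grammar, minus \G (unsupported by Python's re).
# Groups: 2 = indent, 3 = fence marker, 4 = language id, 5 = attributes.
BEGIN = re.compile(r"(^)(\s*)(`{3,}|~{3,})\s*(?i:(superjs)(\s+[^`~]*)?$)")


def end_pattern(begin_match: re.Match) -> re.Pattern:
    """Build the end pattern, substituting the begin captures for \\2 and \\3."""
    indent = re.escape(begin_match.group(2))
    fence = re.escape(begin_match.group(3))
    return re.compile(r"(^)(" + indent + r"|\s{0,3})(" + fence + r")\s*$")


def find_block(lines):
    """Return (open_index, close_index) of the first superjs block, or None."""
    for i, line in enumerate(lines):
        opened = BEGIN.search(line)
        if opened is None:
            continue
        closing = end_pattern(opened)
        for j in range(i + 1, len(lines)):
            if closing.search(lines[j]):
                return (i, j)
        return (i, len(lines))  # unterminated fence: runs to the end
    return None


doc = ["Some text", "```SuperJS title=demo", "let x = 1;", "```", "`superjs"]
print(find_block(doc))
print(BEGIN.search("`superjs"))  # a single backtick does not open a block
```

Because the fence marker is part of the `begin` match here, a line with fewer than three backticks or tildes cannot open a block.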
An example like this (where ◆ = backtick for rendering purposes):
Matches the `re` for e.g. the ReasonML language which uses an identical injection file as this example repo. The desired behavior would only be to match language identifiers appearing after the triple backticks on the same line:

The problem would seem to be that the regex in the `begin` property for this example appears to be too lax.

https://github.com/mjbvz/vscode-fenced-code-block-grammar-injection-example/blob/dd6961fce89362b623ad83637a6050f89bb92f32/syntaxes/codeblock.json#L11
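As a quick illustration of the laxness, here is a small Python sketch. The pattern is an assumption: it is the common tail of the variants quoted elsewhere in this thread, not a copy of the linked file, and Python's `re` is only a stand-in for the engine VS Code uses.

```python
import re

# Assumed shape of the lax pattern: nothing ties the language id to a fence,
# so any line that merely ENDS in "re" (or "reason", "reasonml") matches.
LAX = re.compile(r"(re|reason|reasonml)(\s+[^`~]*)?$")


def lax_matches(line: str) -> bool:
    """True if the assumed lax pattern matches anywhere in the line."""
    return LAX.search(line) is not None


for sample in ["```re", "```reason", "nothing to see here", "let x = 1;"]:
    print(repr(sample), lax_matches(sample))
```

The third sample is ordinary text that happens to end in "re", which is the kind of false positive described above.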
I tried playing with it a bit, but I was stymied as patterns which I thought would apply (matching against triple backticks, for example) failed to match appropriately. Accordingly I have a few questions:
- How are the `begin` and `end` patterns actually applied in VSCode? Are they applied to entire files, or just to the contents of code fences, or something else?
- … `\G` pattern.

Let me know if I am off-base, but since many language extensions are using this repo as a template without carefully analyzing how exactly it works, it seems worthwhile to make the pattern as bulletproof as is reasonably possible. I am seeing a lot of false positives.
(This issue is essentially a superclass of #5.)