microsoft / vscode-markdown-tm-grammar

VS Code built-in markdown extension's Textmate grammar
MIT License

Support scopes for code fence attributes #145

Open michaeltlombardi opened 1 year ago

michaeltlombardi commented 1 year ago

Currently, the grammar parses the attributes after the language ID for a code fence as a single token with the scope `fenced_code.block.language.attributes.markdown`:

```sh {highlight="content:~/$foo"}
cd ~/$foo
```

![Example parsing, showing the textmate scopes includes `fenced_code.block.language.attributes.markdown`](https://github.com/microsoft/vscode-markdown-tm-grammar/assets/14190564/4d8a19b7-b1ab-4f71-b6b7-5015c40d58c2)

It would be more readable/useful to be able to parse this a little further:

- `{` is the punctuation that begins the attributes
- `highlight` is an attribute name
- `=` is an assignment operator
- `"content:~/$foo"` is an attribute value and a string
- `}` is the punctuation that ends the attributes
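
To make the breakdown above concrete, here is a rough sketch in Python rather than in TextMate grammar syntax. The token kinds (`begin`, `name`, `assign`, `string`, `end`) are placeholder labels made up for illustration, not scopes the grammar defines, and the pattern only covers the `name="value"` form from the example.

```python
import re

# Placeholder token kinds, made up for this sketch; the grammar defines none of them today.
TOKEN_RE = re.compile(
    r"""
    (?P<begin>\{)                   # punctuation that begins the attributes
  | (?P<end>\})                     # punctuation that ends the attributes
  | (?P<string>"[^"]*")             # double-quoted attribute value
  | (?P<assign>=)                   # assignment operator
  | (?P<name>[A-Za-z_][\w-]*)       # attribute name
  | (?P<space>\s+)                  # whitespace between tokens
    """,
    re.VERBOSE,
)


def tokenize_attributes(text):
    """Split a code fence attribute string into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "space":  # whitespace is skipped, not reported
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize_attributes('{highlight="content:~/$foo"}'):
    print(kind, text)
```

Each of the five pieces listed above comes back as its own token instead of one opaque attributes token, which is the granularity being asked for here.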

There's some commonly used syntax where `.foo` indicates a CSS class and `#bar` indicates an HTML ID, but otherwise I think the primary scopes are:

- `{` and `}` for beginning and ending punctuation
- attribute names may be either standalone, like `{ disabled }`, or part of a name-value pair, like `{ attrName="attrValue" }`
- `=` is the assignment operator for an attribute name-value pair
- Values must be boolean `true`/`false`, numeric, or a quoted string[^1]
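
As a hedged illustration of the rules listed above, the following Python sketch classifies each attribute inside the braces. The labels it returns (`class`, `id`, `standalone`, `boolean`, `numeric`, `string`, `invalid`) are invented for this example and are not proposed scope names. It assumes no whitespace around `=` and silently skips text it does not recognize.

```python
import re

# One attribute per match: `.class`, `#id`, or `name` with an optional `=value`.
# The kinds reported below are labels invented for this sketch.
ATTRIBUTE_RE = re.compile(
    r"""
    \.(?P<css_class>[\w-]+)                 # .foo shorthand for a CSS class
  | \#(?P<html_id>[\w-]+)                   # #bar shorthand for an HTML ID
  | (?P<name>[A-Za-z_][\w-]*)               # attribute name
    (?:=(?P<value>"[^"]*"|[^\s{}]+))?       # optional value: quoted or bare
    """,
    re.VERBOSE,
)


def classify_value(raw):
    """Label a raw value as boolean, numeric, or string; anything else is invalid."""
    if raw in ("true", "false"):
        return "boolean"
    if re.fullmatch(r"-?\d+(?:\.\d+)?", raw):
        return "numeric"
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return "string"
    return "invalid"


def parse_attributes(text):
    """Return (kind, name, value) tuples for the attributes between `{` and `}`."""
    inner = text.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("attributes must be wrapped in curly braces")
    result = []
    for match in ATTRIBUTE_RE.finditer(inner[1:-1]):
        if match.group("css_class"):
            result.append(("class", match.group("css_class"), None))
        elif match.group("html_id"):
            result.append(("id", match.group("html_id"), None))
        elif match.group("value") is None:
            result.append(("standalone", match.group("name"), None))
        else:
            value = match.group("value")
            result.append((classify_value(value), match.group("name"), value))
    return result


print(parse_attributes('{ .foo #bar disabled attrName="attrValue" count=3 wrap=true }'))
```

An unquoted value that is neither boolean nor numeric, such as `mode=fast`, comes back as `invalid`, which matches the rule that other values must be quoted.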

I don't know enough about scope naming conventions to have good suggestions for the scopes to define; I just know enough to have noticed this limitation in the grammar.

[^1]: There are a few Markdown parsers that support complex data types here, like arrays in `[]` or objects in `{}`, but I think that's probably too complex to get into, at least on a first pass.

reenberg commented 10 months ago

It is worth noting that MkDocs doesn't support that syntax. It is fairly lenient about requiring the curly braces and the period on the language class, but when the language definition is combined with other classes, it's quite strict:

> the curly braces and dot are required for all classes, including the language class if more than one class is defined
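
Read literally, the quoted rule amounts to something like the following Python sketch. The helper name `fence_info_string` and the class name `extra` are hypothetical and exist only for this illustration; this is not MkDocs code, just the rule restated.

```python
def fence_info_string(language, extra_classes=()):
    """Build the text that follows the opening backticks of a fence."""
    if not extra_classes:
        # With only a language class, the lenient bare form is enough.
        return language
    # With more than one class, every class (the language included) needs
    # a leading dot, and the whole list needs curly braces.
    classes = [language, *extra_classes]
    return "{ " + " ".join(f".{name}" for name in classes) + " }"


print(fence_info_string("bash"))
print(fence_info_string("bash", ["extra"]))
```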

An example compliant with MkDocs would be:

``` bash
# Some comment
echo "Hello World"
# Some comment
echo "Hello World"
# Some comment
echo "Hello World"
# Some comment
echo "Hello World"