Open lewis6991 opened 1 year ago
Would love to do this--and spent a lot of time trying to make it work--but I failed. The problem, AFAIR, is that codeblock termination can happen on any line.
In https://github.com/MDeiml/tree-sitter-markdown sections are represented structurally in the AST
tree-sitter-markdown has a custom scanner.c. Thus far tree-sitter-vimdoc has avoided a custom scanner, which helped a lot with development velocity. Of course, the door is open to exploring that now that things are mostly working.
Ideally tree-sitter itself would introduce a feature that makes things easier for grammars, so they don't need a custom scanner. For example, https://github.com/tree-sitter/tree-sitter/issues/160 would provide EOF to the grammar instead of making grammars do insane backflips to deal with it.
Would things change if we tighten the requirements to always have a terminating < for codeblocks?
But it should be noted that tree-sitter-markdown also tried and failed and in the end had to switch to a two-pass strategy where one parser only parses the block structure, and a second parser does inline parsing of each individual block. (This works but has obvious performance implications.)
tighten the requirements to always have a terminating < for codeblocks?
Instead of "always", maybe only if the next block is an h1 or column_heading?
So this would be allowed:
foo >
    code
bar >
    code
<
but this would not be allowed:
foo >
    code
=========
h1
This wouldn't result in a perfect AST but might be good enough.
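To make the tightened rule concrete, here is a rough sketch of a checker for it, written as a plain line scan rather than as grammar rules. It assumes a codeblock opens on a line ending in " >" (or a bare ">"), that code lines are indented, that a line of three or more "=" marks an h1, and that a column_heading is a line ending in "~". The function names are made up for illustration and are not part of tree-sitter-vimdoc.

```python
import re


def is_section_start(line):
    """Assumption: an h1 starts with a line of '=' and a column_heading ends in '~'."""
    stripped = line.rstrip()
    return bool(re.fullmatch(r"={3,}", stripped)) or stripped.endswith("~")


def find_unterminated_codeblocks(text):
    """Return (open_line, heading_line) pairs where a codeblock runs into an
    h1 or column_heading without a terminating '<' in between."""
    violations = []
    opened_at = None  # line number of the currently open codeblock, if any
    for lineno, line in enumerate(text.splitlines(), start=1):
        if opened_at is not None:
            if line.startswith("<"):
                opened_at = None  # explicit terminator
            elif not line.strip() or line[:1].isspace():
                continue  # blank or indented line: still inside the codeblock
            else:
                # An unindented line ends the block implicitly. Under the
                # proposed rule that is only an error before a heading.
                if is_section_start(line):
                    violations.append((opened_at, lineno))
                opened_at = None
        if line == ">" or line.endswith(" >"):
            opened_at = lineno
    return violations


allowed = "foo >\n    code\nbar >\n    code\n<\n"
rejected = "foo >\n    code\n=========\nh1\n"

print(find_unterminated_codeblocks(allowed))   # -> []
print(find_unterminated_codeblocks(rejected))  # -> [(1, 3)]
```

A scan like this can look at whole lines freely, which is exactly what is hard to express in the grammar without a custom scanner, so it only illustrates what the rule would accept and reject.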
In https://github.com/MDeiml/tree-sitter-markdown sections are represented structurally in the AST. This allows things like https://github.com/nvim-treesitter/nvim-treesitter-context to leverage this structure to provide contexts.
Proposal
Make the current column_heading or h1 node the beginning of a block and nest everything under it until the next column_heading or h1.
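As a rough model of the tree shape this proposal is after, the sketch below groups a flat list of top-level nodes into sections, each starting at an h1 or column_heading. It is a post-processing illustration only, not a grammar change. The Node class and the "section" node type are hypothetical names, not something tree-sitter-vimdoc defines.

```python
from dataclasses import dataclass, field

# Node types that begin a new section, per the proposal.
SECTION_STARTS = {"h1", "column_heading"}


@dataclass
class Node:
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def nest_sections(flat_nodes):
    """Wrap each heading and everything up to the next heading in a
    hypothetical 'section' node. Nodes before the first heading stay
    at the top level."""
    result = []
    current = None
    for node in flat_nodes:
        if node.type in SECTION_STARTS:
            current = Node("section", children=[node])
            result.append(current)
        elif current is None:
            result.append(node)
        else:
            current.children.append(node)
    return result


flat = [
    Node("block", "intro"),
    Node("h1", "Title"),
    Node("block", "a"),
    Node("column_heading", "Col"),
    Node("block", "b"),
]
for top in nest_sections(flat):
    print(top.type, [child.type for child in top.children])
```

With the heading as the first child of its section, a consumer such as nvim-treesitter-context could find the enclosing section of any node by walking up its ancestors.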