Inject regex grammar - Githubissues

bennypowers commented 2 years ago

Would be nice if this grammar would inject the regex grammar in instances like:

function CheckRegexp()
  let l:count = 0
  g/^[^$,\"]/let l:count += 1
endfunction

such that ^[^$,\"] would be parsed with the regex grammar.

Currently, it is a single monolithic "pattern" node, with no embedded nodes.

vigoux commented 2 years ago

Hi, there has been discussion about making the parser able to parse Vim-specific regexes (as they differ quite drastically from "regular" regexes). The issue here is that the parsing of these regexes actually depends on multiple factors (such as 'magic').

This needs a bit of discussion, but I am open to doing such things.

In the meantime, injections can be configured in things like nvim-treesitter, so you should open an issue there too.

sisrfeng commented 2 years ago

maybe this can help: https://github.com/statox/VimRegexConverter

sisrfeng commented 2 years ago

I cannot find anything related to "very magic" in nvim-treesitter.

vigoux commented 2 years ago

I think you don't get what we are doing here: parsing means turning a flat sequence of bytes into a tree. The problem is that, in order to turn the bytes into a tree, you first need to do a "lexing" pass, which groups the bytes into meaningful chunks.

In the case of tree-sitter, the lexing is said to be "contextual", that is, the bytes may be interpreted differently depending on where we are in the tree.

This is not really important in our case, but here is the thing: you need to be able to determine what the next chunk of bytes is in order to parse the text correctly.
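To make the "lexing" step concrete, here is a minimal sketch in Python. It is not how tree-sitter or this grammar works internally; the token classes (name, number, op) and the lex function are hypothetical and only illustrate how a flat string gets grouped into chunks before any tree is built.

```python
import re

# Hypothetical token classes, for illustration only; the real grammar
# defines its own (much larger) token set.
TOKEN_RE = re.compile(r"""
    (?P<space>\s+)
  | (?P<number>\d+)
  | (?P<name>[A-Za-z_][A-Za-z0-9_:]*)
  | (?P<op>\+=|=)
""", re.VERBOSE)


def lex(text):
    """Group a flat string into (kind, chunk) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # The lexer cannot decide what the next chunk is: parsing stops.
            raise ValueError(f"cannot lex at offset {pos}: {text[pos:]!r}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(lex("let l:count = 0"))
```

The ValueError branch is the relevant part: if the lexer cannot tell what the next chunk is, nothing after it can be parsed.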

Now take the following regex written in a file:

\k+

Depending on the value of the 'magic' option, this can mean two things:

Any number of \k
Only one \k then +

Because of this, we cannot parse the regex correctly without knowing more about the general state of the editor, which is impossible at our step (as we do this "statically").
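The two readings above can be sketched as a toy lexer in Python. This is only an illustration: lex_pattern and its plus_is_quantifier flag are hypothetical, and the flag simply stands in for whatever external state (such as 'magic') decides how + is read. It does not model Vim's actual regex rules.

```python
def lex_pattern(pattern, plus_is_quantifier):
    """Split a pattern into (kind, text) tokens.

    `plus_is_quantifier` is a hypothetical flag standing in for editor
    state that a static parser cannot see.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash plus the next character (e.g. \k) is one token.
            tokens.append(("escape", pattern[i:i + 2]))
            i += 2
        elif ch == "+" and plus_is_quantifier and tokens:
            # Reading 1: + repeats the previous token.
            tokens.append(("quantifier", ch))
            i += 1
        else:
            # Reading 2: + (or any other character) is taken literally.
            tokens.append(("literal", ch))
            i += 1
    return tokens


print(lex_pattern(r"\k+", plus_is_quantifier=True))
print(lex_pattern(r"\k+", plus_is_quantifier=False))
```

The same three bytes produce two different token streams, and the only thing that changes is a value that lives outside the text being parsed.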

vigoux / tree-sitter-viml

Inject regex grammar #96