Open dvzubarev opened 2 years ago
Hi, there is a make-range! predicate implemented in nvim-treesitter here. It is used for implementing some evil text objects: example. Those queries don't work in the Emacs implementation of evil text objects.
Can you describe what this predicate does? It's not immediately clear from looking at the implementation and the example.
I'd like to try to add it. Any tips where to start?
The currently supported predicates (eq?, not-eq?, match?, not-match?) are implemented at the layer of the Rust crate tree-sitter, by the function satisfies_text_predicates. Additional custom predicates should be implemented at the elisp-tree-sitter layer, in query.rs, by additional processing on top of the results returned by cursor.captures() and cursor.matches().
The instructions for local dev setup are here.
@ubolonton The directive takes two nodes, creates a data structure describing the range from the beginning of the first node to the end of the second node, and stores it as metadata next to the query result. It is currently not used outside of nvim-treesitter-textobjects. The data structure is API-compatible in Lua with regular tree-sitter nodes.
You could allow users to inspect unknown patterns in the query results they are getting, or allow them to register their own predicates or directives to post-process the query results once they are exposed to Emacs Lisp. In Neovim, such user-defined directives or predicates can be registered via Lua functions and are applied directly to the query results (e.g. here: https://github.com/theHamsta/neovim/blob/acacf5151bb9d6d4b8fc0f0ba6a1a6cccaaa0b4f/runtime/lua/vim/treesitter/query.lua#L414-L429).
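To make the two ideas above concrete (a range running from the start of one node to the end of another, and a registry of user-defined directives), here is a minimal Emacs Lisp sketch. Everything prefixed with my/ is hypothetical and does not exist in elisp-tree-sitter or Neovim; nodes are modeled as plain (START . END) position pairs rather than real node objects.

```elisp
;;; Hypothetical sketch: none of the my/ names exist in elisp-tree-sitter.
;;; Nodes are modeled as plain (START . END) position pairs.

(defvar my/directive-handlers nil
  "Alist mapping a directive name (a string) to its handler function.")

(defun my/register-directive (name handler)
  "Register HANDLER, a function of (CAPTURES ARGS), under NAME."
  (setf (alist-get name my/directive-handlers nil nil #'equal) handler))

(defun my/make-range (captures args)
  "Return (RANGE-NAME START . END) spanning two captured nodes.
CAPTURES is an alist of (CAPTURE-NAME . (START . END)).
ARGS is (RANGE-NAME FIRST-CAPTURE SECOND-CAPTURE)."
  (pcase-let* ((`(,range-name ,first ,second) args)
               (`(,start . ,_) (alist-get first captures))
               (`(,_ . ,end) (alist-get second captures)))
    ;; The range starts where the first node starts and ends where the
    ;; second node ends.
    (cons range-name (cons start end))))

(my/register-directive "make-range!" #'my/make-range)

(defun my/apply-directive (name captures args)
  "Run the handler registered under NAME on CAPTURES and ARGS."
  (funcall (alist-get name my/directive-handlers nil nil #'equal)
           captures args))

;; Usage: a comma at 10-11 followed by a parameter at 12-17.
(my/apply-directive "make-range!"
                    '((_start . (10 . 11)) (parameter.inner . (12 . 17)))
                    '("parameter.outer" _start parameter.inner))
;; → ("parameter.outer" 10 . 17)
```

The handler only sees captures and directive arguments, so it stays independent of how the binding would expose them.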
@dvzubarev Really cool project!
I spent some time today on this, but was not really able to fully figure out how I should approach this. I don't think my understanding of the tree-sitter lib is anywhere close to useful yet 🤷🏼♂️.
@ubolonton, just wanted to check if you had some pointers on how I should approach this, or resources I can refer to.
@meain please note that make-range! was created just for the needs of nvim-treesitter-textobjects, since there is no obvious way to select multiple nodes using tree-sitter queries. It was not adopted in upstream Neovim.
Thanks @theHamsta, I did take a look at the implementation in nvim-treesitter-textobjects. Unfortunately, elisp-tree-sitter as of now does not expose anything that we can use to add "directive_handlers", nor does it expose the patterns in the query to elisp, if I understand correctly.
I briefly looked more into nvim-treesitter's make-range!.
It seems to me that, for each pattern, the data extraction logic depends on the capture names, not the structure of the pattern. For example:
(#make-range! "parameter.outer" @_start @parameter.inner)
(#make-range! "parameter.outer" @parameter.inner @_end)
If that's the case, the predicates are boilerplate which can be eliminated by specifying that logic at the level of the text-object library. This is a REPL snippet that illustrates the idea:
(with-current-buffer "example.py"
  (seq-map (lambda (match)
             (pcase-let* ((`(,_ . ,captures) match)
                          (captures (seq-into captures 'list))
                          (start (map-elt captures '_start))
                          (inner (map-elt captures 'parameter.inner)))
               (cons (tsc-node-start-position start)
                     (tsc-node-end-position inner))))
           (tree-sitter-debug-query
            "(parameters
               \",\" @_start .
               [
                 (identifier)
                 (tuple)
                 (typed_parameter)
                 (default_parameter)
                 (typed_default_parameter)
                 (dictionary_splat_pattern)
                 (list_splat_pattern)
               ] @parameter.inner)"
            :matches)))
Note: Making short-lived node objects to retrieve their properties puts a lot of stress on the GC, but that's an orthogonal discussion. tree-sitter-hl uses an internal API which avoids that. Such an approach can be generalized. The results will probably look similar to the new tree-traversal APIs.
(The snippet above uses tree-sitter-debug-query to quickly illustrate the idea; libraries should use tsc APIs.)
Thanks @ubolonton, just had a few queries. I was not able to find out where/if we are exposing general_predicates, which I believe would help with figuring out these "unhandled" items.
This might be a dumb question, but should I be worrying about scoping the make-range queries? For example, in something like the below, only the first (_start + parameter.inner) should be picked up for parameter.outer, if I understand correctly.
((parameters
  "," @_start .
  [
    (identifier)
    (tuple)
  ] @parameter.inner
 )
 (#make-range! "parameter.outer" @_start @parameter.inner))
((something-else ; <-- not parameters
  "," @_start .
  [
    (identifier)
    (tuple)
  ] @parameter.inner
 )
 (#make-range! "parameter.middle" @_start @parameter.inner)) ; <-- this is not parameter.outer
> I was not able to find out where/if we are exposing general_predicates, which I believe would help with figuring out these "unhandled" items.
They are not exposed currently. Can you explain how they would help?
> should I be worrying about scoping the make-range queries? For example, in something like the below, only the first (_start + parameter.inner) should be picked up for parameter.outer, if I understand correctly.
> ((parameters
>   "," @_start .
>   [
>     (identifier)
>     (tuple)
>   ] @parameter.inner
>  )
>  (#make-range! "parameter.outer" @_start @parameter.inner))
> ((something-else ; <-- not parameters
>   "," @_start .
>   [
>     (identifier)
>     (tuple)
>   ] @parameter.inner
>  )
>  (#make-range! "parameter.middle" @_start @parameter.inner)) ; <-- this is not parameter.outer
Do you have a concrete example for this? AFAICT, text objects don't need that generality. They have only inner and outer.
> Do you have a concrete example for this? AFAICT, text objects don't need that generality. They have only inner and outer.
I don't really have a concrete example, was just wondering if this would be something that I would have to handle.
> They are not exposed currently. Can you explain how they would help?
As of now I am just reading the queries file, which is pulled from the nvim-treesitter-textobjects package, directly into tsc-make-query, and was planning to keep this flow as is if possible. I am guessing I would have to parse this info (the make-range arguments) out in elisp otherwise.
> > Do you have a concrete example for this? AFAICT, text objects don't need that generality. They have only inner and outer.
> I don't really have a concrete example, was just wondering if this would be something that I would have to handle.
Looks like no. Please let me know when you encounter an example otherwise.
> > They are not exposed currently. Can you explain how they would help?
> As of now I am just reading the queries file, which is pulled from the nvim-treesitter-textobjects package, directly into tsc-make-query, and was planning to keep this flow as is if possible. I am guessing I would have to parse this info (the make-range arguments) out in elisp otherwise.
Exposing that function isn't difficult. I'm trying to understand how that information would help with the text-object use case.
> Exposing that function isn't difficult. I'm trying to understand how that information would help with the text-object use case.
This might be specific to my use case, but I pretty much load a queries file as is, with all the text objects, in meain/evil-textobj-tree-sitter. Without that function exposed, is there some way I could parse out just the predicates, to compute the start and end for them separately?
I understand that you want to extract the arguments to the make-range! predicate, for each pattern. My question is, why?
From the look of them, the predicates are redundant and don't add information, unless nvim-treesitter-textobjects doesn't have rules to constrain the capture names. (In which case that would be a potential improvement to that project.)
Maybe I am missing something here, but the definition of parameter.outer is purely in make-range.
((parameters
  "," @_start .
  [
    (identifier)
    (tuple)
    (typed_parameter)
    (default_parameter)
    (typed_default_parameter)
    (dictionary_splat_pattern)
    (list_splat_pattern)
  ] @parameter.inner
 )
 (#make-range! "parameter.outer" @_start @parameter.inner))
Or are you saying that I don't really have to look into the make-range, but rather just compute start and end from the first and last item in a match entry? If this is the case, I would still need to get make-range, as I will need to pull out the name of the object from that.
On a related note, I tried grouping sibling nodes, but I'm not sure if I am doing it right. @parameter.outer is just matching the comma. This is the same in the tree-sitter playground (image) as well.
I was thinking that maybe I could rewrite the queries this way, if this is a feasible option.
@theHamsta I remember you mentioning that tree-sitter does not have an obvious way to select multiple items. I just saw this issue on the nvim side and was wondering if grouping sibling nodes is not possible/intended to be used this way.
Grouping sibling nodes only allows you to specify an order of the nodes. It would allow you to set a capture on each of the mentioned subnodes, from which you could create a union once you have iterated over all the subnodes. That is similar to the start/end pattern, as queries can only return nodes. The motivation for using custom directives here was that the plugin itself would not need to handle more and more special ways of describing ranges; instead, you would just register a new post-processing function that works on all queries and for all query users. Also, users can define their own predicates and directives, which would only be used in their config.
make-range! is a bad example, as it's kind of deprecated: in the end, upstream Neovim uses a different way to represent query results. The result of make-range! is a range that can duck-type a normal node and thus be handled by applications without code changes. There's also a bug that currently prevents Neovim from returning multiple nodes with the same capture from (_)* @foo. So make-range! could be avoided by just using the capture you want to refer to multiple times, with the semantics of combining the node ranges. In a refactor of nvim-treesitter-textobjects we might use this to avoid make-range!.
Another comment: the original issue comment links to the implementation of make-range!. This is kind of misleading as to how predicates/directives work today in Neovim. This legacy predicate is still hard-coded in our plugin. Modern predicates/directives can be registered via a Neovim API function and don't require changes to a plugin's code.
> Or are you saying that I don't really have to look into the make-range, but rather just compute start and end from the first and last item in a match entry? If this is the case, I would still need to get make-range, as I will need to pull out the name of the object from that.
Not the first and last captures, but specifically-named captures. A text object x can be inner/outer, and can be located by either a node, or a pair of start/end nodes, so there are a maximum of 6 names:
x.inner, x.inner.start, x.inner.end.
x.outer, x.outer.start, x.outer.end.
If the text-object library imposes this constraint, the make-range! predicate becomes redundant. That's the improvement I was suggesting nvim-treesitter-textobjects can make.
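As a rough illustration of the naming convention above, the lookup could be done entirely on the text-object library side. This is only a sketch under assumptions: my/textobj-range is a hypothetical helper, not part of any library, and captures are modeled as an alist of (CAPTURE-NAME . (START . END)) pairs, which in real code could be built with tsc-node-start-position and tsc-node-end-position.

```elisp
;;; Hypothetical sketch; `my/textobj-range' is not part of any library.
;;; CAPTURES is an alist of (CAPTURE-NAME . (START . END)).

(defun my/textobj-range (captures object kind)
  "Return (START . END) for text object OBJECT of KIND, or nil.
OBJECT and KIND are strings such as \"parameter\" and \"outer\".
A single capture OBJECT.KIND wins; otherwise the range runs from the
start of OBJECT.KIND.start to the end of OBJECT.KIND.end."
  (let* ((base (format "%s.%s" object kind))
         (whole (alist-get (intern base) captures))
         (from (alist-get (intern (concat base ".start")) captures))
         (to (alist-get (intern (concat base ".end")) captures)))
    (cond (whole whole)
          ((and from to) (cons (car from) (cdr to))))))

;; Usage: no single parameter.outer capture, so the start/end pair is used.
(my/textobj-range '((parameter.outer.start . (10 . 11))
                    (parameter.outer.end . (12 . 17))
                    (parameter.inner . (12 . 17)))
                  "parameter" "outer")
;; → (10 . 17)
```

Because the object name and kind are encoded in the capture names, nothing from the predicate arguments is needed to compute the range.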
> tree-sitter does not have an obvious way to select multiple items
IIUC, this is (was?) an issue with the Neovim binding. In this ELisp binding, we don't have that issue, since we have tsc-query-matches.
> in the end upstream Neovim uses a different way to represent query results.
@theHamsta I'm curious, what does it use now?
> If the text-object library imposes this constraint, the make-range! predicate becomes redundant. That's the improvement I was suggesting nvim-treesitter-textobjects can make.
Ahh, this makes sense.
This is the PR that would make it possible to have multiple nodes per capture and match: https://github.com/neovim/neovim/pull/17099.
Right now you can register predicates and directives: https://github.com/theHamsta/neovim/blob/3b0a0c6ca6c5f4e719028be2dba11853ffec6b6b/runtime/lua/vim/treesitter/query.lua#L347-L371
They then work directly on the match objects: https://github.com/theHamsta/neovim/blob/3b0a0c6ca6c5f4e719028be2dba11853ffec6b6b/runtime/lua/vim/treesitter/query.lua#L569
All other predicates and directives are registered in our plugin: https://github.com/nvim-treesitter/nvim-treesitter/blob/f735498a645e1a2aca7a0cfdaa2d7f8cec543846/lua/nvim-treesitter/query_predicates.lua
make-range! works on a different data structure, which is why it has no implementation that works directly on the match: https://github.com/nvim-treesitter/nvim-treesitter/blob/f735498a645e1a2aca7a0cfdaa2d7f8cec543846/lua/nvim-treesitter/query_predicates.lua#L103-L102
There are also directives implemented in the core for language injection, which can trim characters; they mostly just calculate metadata: https://github.com/theHamsta/neovim/blob/3b0a0c6ca6c5f4e719028be2dba11853ffec6b6b/runtime/lua/vim/treesitter/query.lua#L307-L345
So our directives mostly just attach some Lua objects to the match that can be used by downstream applications.
@theHamsta How exactly is @function.outer.start used here? Is it merged with @function.outer on the Lua side? I was not able to find anything specific from a quick scan of the codebase.
@function.outer.start captures are optional decorations for the text object, like docstrings or template declarations. They will be added to the text object range if present.
The original issue of make-range has been taken care of downstream (https://github.com/meain/evil-textobj-tree-sitter/pull/38). I ended up rewriting the queries to have <node>._start and <node>._end, and merging them in elisp. We can close this issue if necessary, or maybe rename it to be a general discussion around how to implement custom predicates, as there is some useful discussion here.
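For readers wondering what merging <node>._start and <node>._end in elisp might look like, here is a minimal sketch. It is not the code from the linked PR: my/merge-start-end is a hypothetical name, and captures are assumed to be an alist of (NAME-STRING . (START . END)) pairs instead of real node objects.

```elisp
;;; Hypothetical sketch, not the implementation from the linked PR.

(defun my/merge-start-end (captures)
  "Merge NAME._start / NAME._end captures in CAPTURES into NAME ranges.
CAPTURES is an alist of (NAME-STRING . (START . END)).  Captures that
share a base name are combined into the smallest range covering them;
names without a suffix are treated as their own base name."
  (let (merged)
    (pcase-dolist (`(,name ,start . ,end) captures)
      (let* ((base (replace-regexp-in-string
                    "\\._\\(?:start\\|end\\)\\'" "" name))
             (range (cdr (assoc base merged))))
        (if range
            ;; Widen the existing range to cover this capture too.
            (progn (setcar range (min start (car range)))
                   (setcdr range (max end (cdr range))))
          (push (cons base (cons start end)) merged))))
    (nreverse merged)))

;; Usage: the comma (10-11) and the parameter (12-17) form parameter.outer.
(my/merge-start-end '(("parameter.outer._start" . (10 . 11))
                      ("parameter.outer._end" . (12 . 17))
                      ("parameter.inner" . (12 . 17))))
;; → (("parameter.outer" 10 . 17) ("parameter.inner" 12 . 17))
```

Taking the covering range instead of relying on capture order keeps the helper indifferent to whether the _start or the _end capture is seen first.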