emacs-tree-sitter / elisp-tree-sitter

Emacs Lisp bindings for tree-sitter
https://emacs-tree-sitter.github.io
MIT License

`tsc-current-field` doesn't work against starting node #273

Closed mathrick closed 3 months ago

mathrick commented 4 months ago

It seems that tsc-current-field always returns nil when run against the cursor's starting node, i.e.:

(tsc-current-field (tsc-make-cursor node)) ; Always returns nil

This is not the case when navigating to other nodes via the cursor. Here's a repro that demonstrates it:

(let* ((parser (tsc-make-parser))
       (parsed (progn
                 (tsc-set-language parser (tree-sitter-require 'python))
                 (tsc-root-node (tsc-parse-string parser "def dupa(foo):"))))
       (parent (tsc-get-nth-child parsed 0))
       (cursor (tsc-make-cursor parent)))
  (tsc-goto-first-child cursor)
  (cl-loop for i from 0 to (1- (tsc-count-children parent))
           for child = (tsc-get-nth-child parent i)
           for child-cur = (tsc-make-cursor child)
           do (progn
                (tsc-reset-cursor child-cur child)
                (message "Traversed node is: %s" (tsc-node-to-sexp (tsc-current-node cursor)))
                (message "Direct node is: %s" (tsc-node-to-sexp (tsc-current-node child-cur)))
                (message "Traversed field is: %s" (tsc-current-field cursor))
                (message "Direct field is: %s" (tsc-current-field child-cur))
                (message "-----"))
           do (tsc-goto-next-sibling cursor)))

This results in the following output in *Messages*:

Traversed node is: ("def")
Direct node is: ("def")
Traversed field is: nil
Direct field is: nil
-----
Traversed node is: (identifier)
Direct node is: (identifier)
Traversed field is: :name
Direct field is: nil
-----
Traversed node is: (parameters (identifier))
Direct node is: (parameters (identifier))
Traversed field is: :parameters
Direct field is: nil
-----
Traversed node is: (":")
Direct node is: (":")
Traversed field is: nil
Direct field is: nil
-----
Traversed node is: ("_newline")
Direct node is: ("_newline")
Traversed field is: :body
Direct field is: nil
-----

Notice that the output is identical, except for the fields.

Version information:

Ubuntu 22.04
GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0) of 2020-09-19
tsc-dyn--version "0.18.0"

mathrick commented 4 months ago

This is apparently an issue for all traversal utils, for example:

;; This will also always return nil
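(require 'cl-lib) ; for cl-block / cl-return

(defun node-field (node)
  ;; Return the field reported for the first node visited, which is NODE itself.
  (cl-block nil
    (tsc-traverse-do ([field] node)
      (cl-return field))))

ubolonton commented 4 months ago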

I think it's the same issue as https://github.com/tree-sitter/tree-sitter/issues/567. Basically the cursor can't walk out of the starting node, so it can't know which field it's on within the parent node. We definitely should document this better.
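
A minimal sketch of one workaround, assuming tsc-get-parent returns nil for a root node and tsc-node-eq compares node identity: start the cursor at the parent instead, then walk its children until the cursor sits on the node in question. my/tsc-node-field is a hypothetical helper name, not part of tsc.

```elisp
(require 'cl-lib)
(require 'tsc)

(defun my/tsc-node-field (node)
  "Return the field NODE occupies within its parent, or nil.
Hypothetical helper, not part of tsc.  Returns nil for a root
node and for children that sit under no named field."
  (let ((parent (tsc-get-parent node)))
    (when parent
      ;; The cursor starts at PARENT, so every child it reaches was
      ;; entered from above and the cursor knows which edge it took.
      (let ((cursor (tsc-make-cursor parent)))
        (when (tsc-goto-first-child cursor)
          (cl-loop
           when (tsc-node-eq (tsc-current-node cursor) node)
           return (tsc-current-field cursor)
           ;; Stop (and return nil) once there are no more siblings.
           while (tsc-goto-next-sibling cursor)))))))
```

This costs a scan over the siblings on every call, so it is meant for occasional lookups rather than for use inside a full-tree traversal.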

mathrick commented 3 months ago

Hmm, is it a shortcoming in the underlying tree-sitter API, or is there some functionality that's present in the base API but not exposed by tree-sitter.el? Because as things are right now, there's no straightforward way to get a node's field without gymnastics involving cursors and weird edge cases (see also #274). Something like tsc-node-field sounds like it should exist.

mathrick commented 3 months ago

Also, I don't know whether that is a thing that even makes sense to want, but this limitation means we can never have a field on the root node of a tree.

mathrick commented 3 months ago

I did some more reading, and it turns out I was confused about the meaning of fields. I was under the impression that fields were the properties of nodes, but in reality, they're properties of the relationship between a parent and its children. So the API really doesn't have a concept of "a node's field", only "given child's field".

sogaiu commented 3 months ago

Yes, I found this confusing as well, but IIUC, this is close to my impression of how things work (my impression is vague (^^; ):

in reality, they're properties of the relationship between a parent and its children. So the API really doesn't have a concept of "a node's field", only "given child's field".

ubolonton commented 3 months ago

Yeah, field is basically the nickname by which the parent calls the child at home.

Semi-formally, it's the (name of the) edge between the parent node and the child node.
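
Seen from the parent's side, that edge can be followed directly with tsc-get-child-by-field. A small sketch, reusing the Python snippet from the repro above and assuming tree-sitter-langs is installed to supply the grammar; my/child-sexp-by-field is a hypothetical name used only for illustration.

```elisp
(require 'tsc)
(require 'tree-sitter-langs) ; assumption: supplies the Python grammar

(defun my/child-sexp-by-field (source field)
  "Parse SOURCE as Python and look up FIELD on the first top-level node.
Return the matching child's S-expression string, or nil if that
node has no child under FIELD.  Hypothetical helper for illustration."
  (let* ((parser (tsc-make-parser))
         (_ (tsc-set-language parser (tree-sitter-require 'python)))
         (root (tsc-root-node (tsc-parse-string parser source)))
         (parent (tsc-get-nth-child root 0))
         ;; The question is put to the parent: "who is your :name child?"
         (child (tsc-get-child-by-field parent field)))
    (when child
      (tsc-node-to-sexp child))))

;; Should name the same (identifier) child that the traversing cursor
;; in the repro reported under :name.
(message "%s" (my/child-sexp-by-field "def dupa(foo):" :name))
```

The lookup needs only the parent and the field name, which matches the picture of a field as a label on the parent-to-child edge rather than a property stored on the child.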