emacs-tree-sitter / elisp-tree-sitter

Emacs Lisp bindings for tree-sitter
https://emacs-tree-sitter.github.io
MIT License

Upcoming changes to tree-sitter query syntax #38

Closed patrickt closed 4 years ago

patrickt commented 4 years ago

Hey there 👋

I’m filing this issue to make y’all aware that there are some upcoming changes to tree-sitter’s query syntax, stemming from https://github.com/tree-sitter/tree-sitter/pull/615. The implementation of these changes should retain backwards compatibility, but going forward we’d recommend using the new syntax, as it’s (in our opinion!) more readable and powerful. We’ll be updating the highlights.scm queries in the various tree-sitter language packages, so if you depend directly on them via submodules or the like, everything should Just Work.

If this is going to cause any grief for this project, please reach out on the tree-sitter issues board and let us know, and we’ll do our best to make the transition easy for you.

Thank you very much for all your work on this project: as an Emacs devotee, I personally am hugely excited about the prospect of using tree-sitter to power my Emacs!

cc @maxbrunsfeld

ubolonton commented 4 years ago

Thanks for the heads-up.

The new syntax indeed looks good.

We’ll be updating the highlights.scm queries in the various tree-sitter language packages, so if you depend directly on them via submodules or the like, everything should Just Work.

If this is going to cause any grief for this project, please reach out on the tree-sitter issues board and let us know, and we’ll do our best to make the transition easy for you.

The current version uses the included queries directly. I'm working on adding quite a lot of modifications though, as I find them a little bit too bare-bone. In any case, it's not a big problem. I'll switch to the new version soon.

Thank you very much for all your work on this project: as an Emacs devotee, I personally am hugely excited about the prospect of using tree-sitter to power my Emacs!

That's great to hear! By the way, syntax highlighting works now. I'm working on the documentation, and touching up the queries.

ubolonton commented 4 years ago

@patrickt There's a small annoyance. It's not deal-breaking, just an inconvenience: # is not a valid read syntax for Lisp symbols, so patterns using predicates must be specified as a string:

(tree-sitter-hl-add-patterns
 "((line_comment) @doc
   (#match? @doc \"^///\"))")

Whereas other patterns can be "embedded" in an Emacs Lisp expression:

(tree-sitter-hl-add-patterns
 [(line_comment) @comment])
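
To make the reader issue concrete, here is a minimal sketch (not from the thread) that probes how the Emacs Lisp reader handles a few candidate prefixes. The helper name my/readable-as-symbol-p is hypothetical; it relies only on the built-in read-from-string and condition-case.

```elisp
;; Hypothetical helper, not part of tree-sitter or any package.
(defun my/readable-as-symbol-p (text)
  "Return non-nil if the Lisp reader parses the start of TEXT as a symbol."
  (condition-case nil
      (symbolp (car (read-from-string text)))
    ;; An unknown "#" dispatch such as "#m" signals `invalid-read-syntax'.
    (error nil)))

(my/readable-as-symbol-p "match?")   ; => t
(my/readable-as-symbol-p "#match?")  ; => nil
(my/readable-as-symbol-p ".match?")  ; => t
(my/readable-as-symbol-p "-match?")  ; => t
```

This is why a pattern containing #match? has to travel as a string, while patterns without predicates can be written as a plain vector.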

maxbrunsfeld commented 4 years ago

Do you imagine that mixing tree queries with emacs lisp (vs having them in their own source file) is a common use case? If so, I'd be open to adding some alternative syntax for the same thing, specifically an alternative "special character" that plays the same role as # for distinguishing predicate names from node names. Is there a character that you'd suggest?

EDIT

Also, is there a character that makes more sense from the perspective of "fitting in" with lisps in general (emacs lisp, but also racket, clojure, common lisp, etc)?

shackra commented 4 years ago

Do you imagine that mixing tree queries with emacs lisp (vs having them in their own source file) is a common use case?

Well, I imagine myself extending my Emacs configuration by copying and pasting whatever query I came up with in the query builder right into it. This implies that I would like to treat any query the way I treat all Lisp code in Emacs...

ubolonton commented 4 years ago

Do you imagine that mixing tree queries with emacs lisp (vs having them in their own source file) is a common use case? If so, I'd be open to adding some alternative syntax for the same thing, specifically an alternative "special character" that plays the same role as # for distinguishing predicate names from node names.

Emacs Lisp packages likely won't need this. It's more about end-users' ad-hoc customizations. I had something similar to this in my config:

;; For the coding convention that encourages single quotes for strings
;; that are used like symbols.
(add-hook 'python-mode-hook
          (lambda ()
            (tree-sitter-hl-add-patterns
             [((string) @constant
               (match? @constant "^'"))])))

Is there a character that you'd suggest?

EDIT

Also, is there a character that makes more sense from the perspective of "fitting in" with lisps in general (emacs lisp, but also racket, clojure, common lisp, etc)?

I'm only familiar with Emacs Lisp, Clojure, and a little bit of Common Lisp. Currently I can only think of - (hyphen). It's a convention for "private" functions/variables.

By the way, in terms of "fitting in", field: instead of :field looks a bit weird. It's a minor point though, I think.

maxbrunsfeld commented 4 years ago

Currently I can only think of - (hyphen). It's a convention for "private" functions/variables.

I think the drawback to using a hyphen is that it is also a valid name character, and so it might lead people to believe that the leading hyphen is part of the name. But the current behavior with # is different than that - the # is a piece of special syntax, but the actual name of the predicate (as returned by the Tree-sitter API) does not include the #.

For example, if your query says this:

(
  (identifier) @foo
  (#my-custom-predicate? @foo)
)

Then APIs like ts_query_predicates_for_pattern (or in Rust, Query::general_predicates) will return a predicate whose name is my-custom-predicate?, without the #.

If we used a hyphen instead, you might expect these APIs to return -my-custom-predicate?, with the hyphen.

It still might be the best option for compatibility with emacs lisp though.
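
The difference between a syntax marker and a name character can be sketched as follows. This is only an illustration in Emacs Lisp, not how tree-sitter parses queries; my/predicate-name is a hypothetical helper that treats a leading # as syntax and everything else as part of the name.

```elisp
;; Hypothetical helper for illustration; tree-sitter's own parser is not shown.
(defun my/predicate-name (token)
  "Return the predicate name for TOKEN as written in a query.
A leading \"#\" is treated as syntax and dropped; anything else,
including a leading hyphen, is kept as part of the name."
  (if (string-prefix-p "#" token)
      (substring token 1)
    token))

(my/predicate-name "#my-custom-predicate?")  ; => "my-custom-predicate?"
(my/predicate-name "-my-custom-predicate?")  ; => "-my-custom-predicate?"
```

Under this reading, a hyphen would either have to become special syntax as well, or stay visible in the name that the API reports.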

ubolonton commented 4 years ago

I think the drawback to using a hyphen is that it is also a valid name character, and so it might lead people to believe that the leading hyphen is part of the name. But the current behavior with # is different than that - the # is a piece of special syntax, but the actual name of the predicate (as returned by the Tree-sitter API) does not include the #.

Yeah. Now that I remember it, I think a dot is a better fit. In Clojure, it is used for method calls, so . is not considered a part of the name.

((line_comment) @doc
 (.match? @doc "^///"))

In other Lisps, it has no special meaning, but is still a valid start of a symbol.
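
As a rough check of that claim for Emacs Lisp, the sketch below stores the dot-prefixed pattern as ordinary Lisp data and inspects it. The variable name my/doc-comment-patterns is hypothetical, and nothing here calls into tree-sitter itself.

```elisp
;; Hypothetical variable name; the vector is plain Lisp data and is
;; never evaluated as code.
(defvar my/doc-comment-patterns
  [((line_comment) @doc
    (.match? @doc "^///"))]
  "Query patterns written as Emacs Lisp data, using the dot prefix.")

(let* ((pattern (aref my/doc-comment-patterns 0))
       (predicate (nth 2 pattern)))
  (list (symbolp (car predicate))      ; the reader accepts `.match?' as a symbol
        (symbol-name (car predicate))
        (nth 2 predicate)))
;; => (t ".match?" "^///")
```

On the Lisp side the dot is still part of the symbol's name; treating it as a marker rather than as part of the predicate name would be up to whatever turns this data into a query.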

shackra commented 4 years ago

what's the resolution for this issue?

ubolonton commented 4 years ago

Tree-sitter now supports . for predicates.

The new syntax is used by many of tree-sitter-langs's highlighting queries, and is documented from an Emacs Lisp perspective.