clojure-emacs / clojure-ts-mode

The next generation Clojure major mode for Emacs, powered by TreeSitter
GNU General Public License v3.0

Highlight (some) regular expressions using another grammar #11

Open sogaiu opened 1 year ago

sogaiu commented 1 year ago

I saw the following bit in the emacs-devel archives:

some files may consist of several parts requiring different tree-sitter grammars. For example, a JavaScript file may have its documentation written with jsdoc: JavaScript and jsdoc have a tree-sitter grammar each.

Is there a way to use a tree-sitter grammar in parts of the file and another one in other parts? There could be a main grammar and secondary grammars would be activated on some kinds of nodes of the main one.

Yes, it should be possible, AFAIU. See the node "Multiple Languages" in the ELisp manual, I believe it explains how to do what you want.

As an idea for "somewhere down the line", perhaps it would be interesting to consider the following...

Since tree-sitter-clojure can recognize regex literals, maybe one could apply an appropriate regular expression grammar to highlight the portions within the double quotes.

I don't know how close this grammar is to Clojure's flavor of regex, but maybe it, or some appropriate modification of it (or something that inherits from it), could be used for the task.

For reference, the part of the manual being referred to in the quote above can be seen in .texi form here. I didn't manage to find an HTML version. If you've got a recent enough Emacs from the emacs-29 branch, the info may be viewable from within Emacs. Worked for me anyway...


Ah sorry. Maybe I should have made this in the Discussions area?

dannyfreeman commented 1 year ago

Ah sorry. Maybe I should have made this in the Discussions area?

No, an issue is fine. I don't even get notifications from discussions lol.

This is a good idea. Clojure uses Java-flavored regular expressions. I'm not sure how much they differ from what that grammar handles. If the dialects differ enough, it might be worth forking the grammar and calling the fork tree-sitter-java-regex.

sogaiu commented 1 year ago

I don't have the various flavors loaded into my head lately [1].

If I had to guess without looking too closely, I think this is likely to be some JavaScript flavor (or subset of one).

I also don't know / recall whether the various Clojure dialects all support the same regex syntax.

Perhaps this might come in handy eventually.


[1] Mostly working with PEGs in another language ;)

sogaiu commented 1 year ago

Came across this content among Lapce's files:

((regex_lit) @injection.content
 (#set! injection.language "regex"))
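
In Emacs, the closest counterpart to an injection query like this is a range rule, as described in the "Multiple Languages" node of the ELisp manual. The following is only a rough sketch of how that could look, not what clojure-ts-mode actually does. It assumes Emacs 29 or later built with tree-sitter support, grammars installed under the names clojure and regex, and the regex_lit node name from the query above. The my/ function names are hypothetical.

;; Sketch only: embed a `regex' parser inside Clojure regex literals.
;; Assumes grammars named `clojure' and `regex' are installed.
(require 'treesit)

(defun my/clojure-regex-language-at-point (pos)
  "Return `regex' if POS is inside a regex literal, else `clojure'.
Hypothetical helper, not part of clojure-ts-mode."
  (let ((node (treesit-node-at pos 'clojure)))
    ;; Walk up from the leaf node until a regex_lit node (or the root) is hit.
    (while (and node
                (not (equal (treesit-node-type node) "regex_lit")))
      (setq node (treesit-node-parent node)))
    (if node 'regex 'clojure)))

(defun my/clojure-ts-embed-regex ()
  "Ask Emacs to parse Clojure regex literals with the `regex' grammar.
Hypothetical helper, not part of clojure-ts-mode."
  ;; Do nothing (quietly) when the regex grammar is not available.
  (when (treesit-ready-p 'regex t)
    (setq-local treesit-range-settings
                (treesit-range-rules
                 :embed 'regex
                 :host 'clojure
                 '((regex_lit) @capture)))
    (setq-local treesit-language-at-point-function
                #'my/clojure-regex-language-at-point)))

(add-hook 'clojure-ts-mode-hook #'my/clojure-ts-embed-regex)

Note that this only sets up the parser ranges. Actual highlighting would still need font-lock rules written for the regex grammar, and because the capture is the whole regex_lit node, the embedded range includes the leading #" and the closing ", which a regex grammar may not expect.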

dannyfreeman commented 1 year ago

@sogaiu check out commit 855cddd124eb4ed9197281fe7f56697902b35cb1

Seems useful for other languages as well. Maybe it even belongs in Emacs core.

sogaiu commented 1 year ago

Thanks for the heads up!

Hope to take a look soon.

sogaiu commented 1 year ago

Ok, I gave it a try.

I see what you mean about capturing #" and ":

[Screenshot: clojure-ts-mode-with-regex]

sogaiu commented 1 year ago

On a side note, maybe it's worth requesting that tree-sitter-regex get added to tree-sitter-module?