Closed manuel-uberti closed 1 year ago
FTR, the same problem appears with elements with vectors and lists.
In my case transposition does occur without issues with the following bits:
I'll see if I can reproduce with bits closer to the original report.
Still no reproduction (transposition happens) using:
Can you try building Emacs master instead of 29, by any chance?
Yes.
That seems to be sufficient to reproduce, I got:
transpose-subr-1: Don't have two things to transpose
I compared the source of transpose-sexps
from Emacs 29.x and 30.y.
29.x:
(defun transpose-sexps (arg &optional interactive)
"Like \\[transpose-chars] (`transpose-chars'), but applies to sexps.
Unlike `transpose-words', point must be between the two sexps and not
in the middle of a sexp to be transposed.
With non-zero prefix arg ARG, effect is to take the sexp before point
and drag it forward past ARG other sexps (backward if ARG is negative).
If ARG is zero, the sexps ending at or after point and at or after mark
are interchanged.
If INTERACTIVE is non-nil, as it is interactively,
report errors as appropriate for this kind of usage."
(interactive "*p\nd")
(if interactive
(condition-case nil
(transpose-sexps arg nil)
(scan-error (user-error "Not between two complete sexps")))
(transpose-subr
(lambda (arg)
;; Here we should try to simulate the behavior of
;; (cons (progn (forward-sexp x) (point))
;; (progn (forward-sexp (- x)) (point)))
;; Except that we don't want to rely on the second forward-sexp
;; putting us back to where we want to be, since forward-sexp-function
;; might do funny things like infix-precedence.
(if (if (> arg 0)
(looking-at "\\sw\\|\\s_")
(and (not (bobp))
(save-excursion
(forward-char -1)
(looking-at "\\sw\\|\\s_"))))
;; Jumping over a symbol. We might be inside it, mind you.
(progn (funcall (if (> arg 0)
'skip-syntax-backward 'skip-syntax-forward)
"w_")
(cons (save-excursion (forward-sexp arg) (point)) (point)))
;; Otherwise, we're between sexps. Take a step back before jumping
;; to make sure we'll obey the same precedence no matter which
;; direction we're going.
(funcall (if (> arg 0) 'skip-syntax-backward 'skip-syntax-forward)
" .")
(cons (save-excursion (forward-sexp arg) (point))
(progn (while (or (forward-comment (if (> arg 0) 1 -1))
(not (zerop (funcall (if (> arg 0)
'skip-syntax-forward
'skip-syntax-backward)
".")))))
(point)))))
arg 'special)))
30.y:
(defun transpose-sexps (arg &optional interactive)
"Like \\[transpose-chars] (`transpose-chars'), but applies to sexps.
Unlike `transpose-words', point must be between the two sexps and not
in the middle of a sexp to be transposed.
With non-zero prefix arg ARG, effect is to take the sexp before point
and drag it forward past ARG other sexps (backward if ARG is negative).
If ARG is zero, the sexps ending at or after point and at or after mark
are interchanged.
If INTERACTIVE is non-nil, as it is interactively,
report errors as appropriate for this kind of usage."
(interactive "*p\nd")
(if interactive
(condition-case nil
(transpose-sexps arg nil)
(scan-error (user-error "Not between two complete sexps")))
(transpose-subr transpose-sexps-function arg 'special)))
Looks like some factoring has happened.
I tried using edebug with both versions and it seems that what is happening in transpose-subr
differs.
Specifically, in 29.x, these lines:
(setq pos1 (funcall aux -1))
(setq pos2 (funcall aux arg))
give non-nil results, while the corresponding lines in 30.y both return nil.
FWIW, aux
appears to be treesit-transpose-sexps
in 30.y, and something that shows #f(compiled-function (arg) #<bytecode 0x.....>)
in 29.x.
So far this doesn't seem like a bug in clojure-ts-mode, but who knows what further investigation might yield :)
Well, thanks a lot really for going through all this. It may be worth reporting an Emacs bug, so I'll try to do that and point to this discussion.
I was mistaken in my last report -- aux
was not treesit-transpose-sexps
for 29.x -- that doesn't seem to exist in Emacs 29.x. My bad.
I've amended that above in my previous comment.
I tried stepping through treesit-transpose-sexps
and the vague sense I get is that it may be assuming things that don't hold for some parts of tree-sitter-clojure (the tree-sitter grammar that clojure-ts-mode currently uses).
Note that transposition does seem to work for integers.
Using tree-sitter-clojure's terms, transposition works for num_lit
(numbers), but not kwd_lit
.
Currently, kwd_lit
is defined this way:
kwd_lit: $ =>
choice($._kwd_leading_slash,
$._kwd_just_slash,
$._kwd_qualified,
$._kwd_unqualified),
// (namespace :/usr/bin/env) ; => ""
// (name :/usr/bin/env) ; => "usr/bin/env"
_kwd_leading_slash: $ =>
seq(field('marker', $._kwd_marker),
field('delimiter', NS_DELIMITER),
field('name', alias(KEYWORD_NAMESPACED_BODY, $.kwd_name))),
// (namespace :/) ;=> nil
// (name :/) ;=> "/"
_kwd_just_slash: $ =>
seq(field('marker', $._kwd_marker),
field('name', alias(NS_DELIMITER, $.kwd_name))),
_kwd_qualified: $ =>
prec(2, seq(field('marker', $._kwd_marker),
field('namespace', alias(KEYWORD_NO_SIGIL, $.kwd_ns)),
field('delimiter', NS_DELIMITER),
field('name', alias(KEYWORD_NAMESPACED_BODY, $.kwd_name)))),
_kwd_unqualified: $ =>
prec(1, seq(field('marker', $._kwd_marker),
field('name', alias(KEYWORD_NO_SIGIL, $.kwd_name)))),
_kwd_marker: $ =>
choice(KEYWORD_MARK, AUTO_RESOLVE_MARK),
The definition I currently see for treesit-transpose-sexps
is:
1835 (defun treesit-transpose-sexps (&optional arg)
1836 "Tree-sitter `transpose-sexps' function.
1837 ARG is the same as in `transpose-sexps'.
1838
1839 Locate the node closest to POINT, and transpose that node with
1840 its sibling node ARG nodes away.
1841
1842 Return a pair of positions as described by
1843 `transpose-sexps-function' for use in `transpose-subr' and
1844 friends."
1845 (let* ((parent (treesit-node-parent (treesit-node-at (point))))
1846 (child (treesit-node-child parent 0 t)))
1847 (named-let loop ((prev child)
1848 (next (treesit-node-next-sibling child t)))
1849 (when (and prev next)
1850 (if (< (point) (treesit-node-end next))
1851 (if (= arg -1)
1852 (cons (treesit-node-start prev)
1853 (treesit-node-end prev))
1854 (when-let ((n (treesit-node-child
1855 parent (+ arg (treesit-node-index prev t)) t)))
1856 (cons (treesit-node-end n)
1857 (treesit-node-start n))))
1858 (loop (treesit-node-next-sibling prev t)
1859 (treesit-node-next-sibling next t)))))))
On line 1845, (treesit-node-at (point))
yields a node for kwd_name
, and its parent is kwd_lit
.
On line 1846, (treesit-node-child parent 0 t)
yields kwd_name
.
This logic is not likely to work with tree-sitter-clojure as it is currently implemented.
I think what one likely wants (if possible) is to identify kwd_lit
at point and then map_lit
as the parent.
Below is an invocation and output of running tree-sitter parse
on the sample buffer content:
$ tree-sitter parse samples/transpose-keywords.clj
(source [0, 0] - [1, 0]
(map_lit [0, 0] - [0, 7]
value: (kwd_lit [0, 1] - [0, 3]
name: (kwd_name [0, 2] - [0, 3]))
value: (kwd_lit [0, 4] - [0, 6]
name: (kwd_name [0, 5] - [0, 6]))))
It's very likely that transposition will not work with symbols either. Yes, I just checked and the same thing happens there too.
I see, thanks again. IIUC, it does not make sense to report this as an Emacs bug then.
Put otherwise, is there something to adjust in tree-sitter-clojure?
I am not sure about it not being an Emacs bug yet actually.
I'm not sure the treesit-transpose-sexps
implementation is reasonable. It seems to be assuming that (treesit-node-at (point))
will yield an appropriate node for what it wants to accomplish. For tree-sitter-clojure's case, you might say it is looking too deeply when trying to handle keywords and symbols.
IIUC, there are over 300 tree-sitter grammars and I wonder whether tree-sitter-clojure is the only one that has this sort of characteristic.
The reason keywords and symbols are defined the way they are is to support the ability to access "structure" that both keywords and symbols have in Clojure (the namespace part and the name part). I'm not sure what kind of adjustment could (or should) be made on tree-sitter-clojure's side, but there hasn't been much time for thoughts to arise :)
May be @dannyfreeman will have some relevant thoughts.
Sorry if I am being too pushy, this is not a blocking/urgent problem by all means. I've been a happy user of this mode for a while, this won't change my opinion. :)
No worries!
Thanks for opening this issue -- I think it's often better to learn of these things earlier rather than later :)
I do not believe this would be an issue if tree-sitter would have properly parsed keyword names and namespaces into fields of anonymous nodes instead of forcing us to resort to named nodes. Either way here we are. Funny enough you can transpose keyword namespace and names by running transpose-sexps in the middle of a namespaced keyword :my-ns|/the-name
where |
is the cursor.
I'm not sure we can fix this on our end without dropping some features in the grammar, but we can use the original transpose-sexp function while tree-sitter is active.
@manuel-uberti do you mind trying out the latest change on the main branch and see if that fixes your issue?
I tried the changes with Emacs 30.y (88bb7cdf) and they worked fine here.
I think it might be worth bringing this up with the Emacs devs -- at least @casouri. With the number of grammars in existence, it doesn't seem unlikely to me that others might encounter this issue.
I don't know if it's a good idea, but possibly each *-ts-mode
could enumerate which nodes don't work so well with treesit-transpose-sexps
's current algorithm in some manner (i.e. kwd_name
is too deep, kwd_lit
is the desired node) and this information could be made use of by treesit-transpose-sexps
to adjust.
It might make sense to do something like the above (enumerate "exceptions") for the sake of other structural editing operations as well as I don't know that transposition is going to be the only thing affected by:
if tree-sitter would have properly parsed keyword names and namespaces into fields of anonymous nodes instead of forcing us to resort to named nodes
I wonder if the types of changes we've been discussing here about string and regex nodes having internal "structure" would also lead to similar issues...
From conversations with the Pulsar dev in that issue (or elsewhere?) I got the impression that they'd been used to string delimiters not being part of a string token. May be it was in this comment. I guess such grammars won't necessarily exhibit this "problem" if they don't use fields, e.g. this definition of a string doesn't.
I wonder if the types of changes we've been discussing here about string and regex nodes having internal "structure" would also lead to similar issues...
Yes they probably would. Maybe looking into an offset mechanism in emacs is a better solution.
I also think it would be worth brining this transpose issue to the emacs devs. I may do so soon, but need to take a little break from the mailing list first.
I didn't write treesit-transpose-sexps
, but I've got some good news. Theo (@theothornhill) added treesit-sexp-type-regexp
in Emacs 30 which allows some customization, but I guess treesit-transpose-sexps
wasn't updated to take advantage of it, or there was some other reason. Not long ago I added treesit-node-match-p
and treesit-thing-settings
and friends, which allows one to flexibly define "things", so this should be possible:
I don't know if it's a good idea, but possibly each *-ts-mode could enumerate which nodes don't work so well with treesit-transpose-sexps's current algorithm in some manner (i.e. kwd_name is too deep, kwd_lit is the desired node) and this information could be made use of by treesit-transpose-sexps to adjust.
I think what I'll do is retire treesit-sexp-type-regexp
(since treesit-thing-settings
can replace it), and make treesit-forward/backward-sexp
and treesit-transpose-sexps
take advantage of treesit-thing-settings
. Then you can see if that'll allow you to make a better transpose-sexp experience.
I also think it would be worth brining this transpose issue to the emacs devs. I may do so soon, but need to take a little break from the mailing list first.
You have my condolence. FWIW, I think the people who actually put in the time and effort to maintain a project obviously have the right to decide what works for them and how they want to work. It's the same for Emacs devs who don't want to move to a forge until all their needs are met; and the same for Clojure-mode devs who want to maintain Clojure-mode outside of Emacs and not ask for assignments.
I was going to join the discussion with my 2c, but the further down I read the thread, the more my motivation dwindles ;-)
To give some context, there was some discussion about supporting extending structural movement/editing functions with tree-sitter, which includes forward/backward-sexp
and transpose-sexp
, and perhaps a couple more. For Emacs 29, I managed to get movement function for defuns implemented. But the rest has to wait for Emacs 30.
The problem was what does sexp means in a tree-sitter major mode. The idea of Stefan and Theo was that the sexp would be determined by the position of point. Eg, something like "the largest/smallest node before/after point".
Personally I'm more fond of defining sexps as "repeatable nodes", ie, any node that can be repeated in the grammar. So things like defun, arguments in an argument, statements, list elements, etc, are all considered sexps. This is more straightforward (ie, lower cognitive load) and useful for movement, IMO.
But my idea is perhaps more suitable for forward/backward-list
. Anyway, we went with sexp = syntax construct at point.
I hope this explains the particular way treesit-transpose-sexps
is implemented. Theo can probably explain it better since he wrote it.
@manuel-uberti do you mind trying out the latest change on the main branch and see if that fixes your issue?
It works as expected now, thanks a lot.
think what I'll do is retire
treesit-sexp-type-regexp
(sincetreesit-thing-settings
can replace it), and maketreesit-forward/backward-sexp
andtreesit-transpose-sexps
take advantage oftreesit-thing-settings
. Then you can see if that'll allow you to make a better transpose-sexp experience.
Thanks for chiming in with this. I'll try to keep an eye out for those changes and see about filling out the treesit-thing-settings
in the meantime.
I was going to join the discussion with my 2c, but the further down I read the thread, the more my motivation dwindles ;-)
I appreciate the kind words here as well, and can't blame you for abstaining lol.
@manuel-uberti
It works as expected now, thanks a lot.
Glad to be of service. A new version is pushed up and should be available now. With that I will close this ticket.
Thanks for chiming in with this. I'll try to keep an eye out for those changes and see about filling out the treesit-thing-settings in the meantime.
I just pushed a change to master. Now treesit-transpose-sexps
should work with treesit-thing-settings
. It should find the sexp at point and transpose it with its siblings.
@casouri Thanks for the heads up.
@dannyfreeman For reference, here are some treesit-thing-settings
examples I came across:
sentence
sexp
text
sentence
, sexp
, text
@casouri Thank you for doing that so quickly. I'll try it out and let you know if I have any trouble.
@sogaiu You are a research machine, these examples will come in handy. Thank you
I'll try get these worked out either tonight or tomorrow when I can find some free time.
I'm not quite sure what would be considered a "sentence" in clojure-ts-mode. I think most things with the exception of ;
style comments and sub nodes of keywords/symbols would be s-expressions. I'm guessing that power things like forward/backward-sentence. If I can match the behavior of clojure-mode that should be sufficient. They currently work the same out of the box.
Here are recent docs for treesit-thing-settings
:
DEFVAR_LISP ("treesit-thing-settings",
Vtreesit_thing_settings,
doc:
/* A list defining things.
The value should be an alist of (LANGUAGE . DEFINITIONS), where
LANGUAGE is a language symbol, and DEFINITIONS is a list of
(THING PRED)
THING is a symbol representing the thing, like `defun', `sexp', or
`sentence'; PRED defines what kind of node can be qualified as THING.
PRED can be a regexp string that matches the type of the node; it can
be a predicate function that takes the node as the sole argument and
returns t if the node is the thing, and nil otherwise; it can be a
cons (REGEXP . FN), which is a combination of a regexp and a predicate
function, and the node has to match both to qualify as the thing.
PRED can also be recursively defined. It can be (or PRED...), meaning
satisfying anyone of the inner PREDs qualifies the node; or (not
PRED), meaning not satisfying the inner PRED qualifies the node.
Finally, PRED can refer to other THINGs defined in this list by using
the symbol of that THING. For example, (or sexp sentence). */);
FWIW, I didn't find any examples where PRED
was a predicate function and my brief tinkering didn't yield useful results (likely pilot error and lack of understanding (^^; ).
Update: Oh wait, I guess the ruby one does sort of have something that's of the form (REGEXP . FN)
:
(defun ruby-ts--sexp-p (node)
;; Skip parenless calls (implicit parens are both non-obvious to the
;; user, and might take over when we want to just over some physical
;; parens/braces).
(or (not (equal (treesit-node-type node)
"argument_list"))
(equal (treesit-node-type (treesit-node-child node 0))
"(")))
;; ...
(setq-local treesit-thing-settings
`((ruby
(sexp ,(cons (rx
bol
(or
"class"
"module"
;; ...
"instance_variable"
"global_variable"
)
eol)
#'ruby-ts--sexp-p)))))
But may be that kind of thing is not needed for the current case.
Was also puzzled about what should count as a sentence
, but I thought one way to investigate that might be to look for how other modes might behave with treesit-forward-sentence
.
On a side note, I applied the wisdom from this post to the emacs-devel mailing list to have multiple versions of Emacs available for testing. That in combination with the new --init-directory
option in 29.x and a few other bits has been helpful. May be there are better ways...
@casouri I tried a variety of things without much success (see below for my best guess).
I also tried with some JavaScript:
var a = [true, false];
// ^
// |
// cursor
Using js-ts-mode
, the result of C-M-t
was:
var [true, false] = a;
When not using js-ts-mode
, the expected result appeared:
var a = [false, true];
So may be it's possible there's something not quite right?
I think the emacs I tried was from the master branch at commit 781c0393.
What I thought might work for clojure-ts-mode
was:
diff --git a/clojure-ts-mode.el b/clojure-ts-mode.el
index 2fc9a4a..f235f5c 100644
--- a/clojure-ts-mode.el
+++ b/clojure-ts-mode.el
@@ -618,7 +618,35 @@ See `clojure-ts--standard-definition-node-name' for the implementation used.")
treesit-font-lock-feature-list
'((comment string char number)
(keyword constant symbol bracket builtin)
- (deref quote metadata definition variable type doc regex tagged-literals)))
+ (deref quote metadata definition variable type doc regex tagged-literals))
+ treesit-thing-settings
+ `((clojure
+ (sexp ,(regexp-opt '("#_"
+ "num_lit"
+ "kwd_lit"
+ "str_lit"
+ "char_lit"
+ "nil_lit"
+ "bool_lit"
+ "sym_lit"
+ "list_lit"
+ "map_lit"
+ "vec_lit"
+ "set_lit"
+ "anon_fn_lit"
+ "regex_lit"
+ "read_cond_lit"
+ "splicing_read_cond_lit"
+ "ns_map_lit"
+ "var_quoting_lit"
+ "sym_val_lit"
+ "evaling_lit"
+ "tagged_or_ctor_lit"
+ "derefing_lit"
+ "quoting_lit"
+ "syn_quoting_lit"
+ "unquote_splicing_lit"
+ "unquoting_lit"))))))
(when clojure-ts--debug
(setq-local treesit--indent-verbose t
treesit--font-lock-verbose t)
@dannyfreeman On a side note, I included #_
, because IIUC, it's a convenient thing to be able to toggle between two parameters, e.g. between:
(my-nice-fn left-alt #_ right-alt)
(my-nice-fn #_ left-alt right-alt)
I've pushed a new commit to main that re-enables treesit-transpose-sexp
and takes advantage of treesit-thing-settings.
Seems to work well. @sogaiu your attempt was missing a layer of parens
`((clojure
( ;; right here
(sexp ;; ...
This still solves the @manuel-uberti reported. If you feel up to trying again with the latest main branch here that would be appreciated. It will need the latest Emacs master branch that @casouri released the new treesit-transpose-sexps on.
@sogaiu says
a side note, I included #_, because IIUC, it's a convenient thing to be able to toggle between two parameters, e.g. between:
This trick is so cool. I included it in the change I pushed.
Something that might be useful for this mode and other modes is some notion of precedence when identifying a treesit-thing. What I am about to describe below also does not work in clojure-mode, I would consider this icing on an already sweet cake
Example:
The following is a str_lit
, and we consider a s-expression node.
"f4abb408-02a3-4289-9ef7-f0c5b69221a0"
The next example is a tagged_or_ctor_list
#uuid "f4abb408-02a3-4289-9ef7-f0c5b69221a0"
which has a tree that looks like
(tagged_or_ctor_lit #
tag: (sym_lit name: (sym_name))
(str_lit))
which has a sub-tree of str_lit.
If I call transpose-sexp
#uuid "asdf"
| ;; My cursor
'a-symbol
then the result is this
#uuid 'a-symbol
| ;; My cursor
"asdf"
Where a would expect
"asdf"
| ;; My cursor
#uuid 'a-symbol
But we're matching the smallest nodes around the point as s-expressions here, which is just the string and the symbol, not the entire tagged_or_ctor_list
. I think this might be solved by defining some sort of precedence when matching a thing node. or
with multiple regexs does not do that. Not sure if I can suggest good ergonomics right now for defining precedence, but I will think on it.
missing a layer of parens
`((clojure
( ;; right here
(sexp ;; ...
Puzzling because AFAICT the samples I found don't seem to have a paren there...
For example in ruby-ts-mode.el:
(setq-local treesit-thing-settings
`((ruby
(sexp ,(cons (rx
or in c-ts-mode.el:
(setq-local treesit-thing-settings
`((c
;; It's more useful to include semicolons as sexp so
;; that users can move to the end of a statement.
(sexp (not ,(rx (or "{" "}" "[" "]" "(" ")" ","))))
or in js.el:
(setq-local treesit-thing-settings
`((javascript
(sexp ,(regexp-opt js--treesit-sexp-nodes)))
The docstring I looked at had:
The value should be an alist of (LANGUAGE . DEFINITIONS), where
LANGUAGE is a language symbol, and DEFINITIONS is a list of
(THING PRED)
With (LANGUAGE . DEFINITIONS)
being a cons, isn't:
(LANGUAGE . ((THING PRED) (THING PRED) ...))
the same as:
(LANGUAGE (THING PRED) (THING PRED) ...)
?
May be there's some disconnect among the mode code, the docstring, and the implementation...or may be I've just missed something obvious (^^;
Will need to digest the precedence stuff. Seems potentially tricky.
I think I may have learned the transposition of #_
trick for argument toggling from @PEZ when doing a small amount of work on calva, so can't claim any credit there :)
Most likely Danny didn't set treesit-thing-settings
correctly and Emacs falls back to using the default sexp function, which observes the same behavior as clojure-mode
;-) I need to look at the definition of treesit-transpose-sexps
. It seems that simply adding treesit-thing-settings
support to it doesn't solve the problem. I might need to change its logic or rewrite it.
@casouri Thanks for taking a look and your thoughts.
Just to clarify, do you observe the same behavior for the JavaScript example mentioned here? Repeating below for convenience:
var a = [true, false];
// ^
// |
// cursor
Using js-ts-mode, the result of C-M-t was:
var [true, false] = a;
When not using js-ts-mode, the expected result appeared:
var a = [false, true];
@dannyfreeman
This still solves the @manuel-uberti reported. If you feel up to trying again with the latest main branch here that would be appreciated. It will need the latest Emacs master branch that @casouri released the new treesit-transpose-sexps on.
Yep, I can confirm it still works as expected. (Emacs built on f08684ab
and latest clojure-ts-mode
).
@manuel-uberti Note this potential caveat though.
Yeah I'll wait again to fix. Even though my thing-settings aren't really working, I'm just going to leave them since they have the same effect as reverting to the transpose function for now.
-- Danny Freeman
Expected behavior
Given a
.clj
file with the following content:With point between
:a
and:b
:If I hit C-M-t, I get:
Just like
clojure-mode
behaves.Actual behavior
Given the previous scenario, when I hit C-M-t, I get the message:
Steps to reproduce the problem
Provided
clojure-ts-mode
is installed (I installed via NonGNU ELPA):emacs -Q
{:a :b}
:a
and:b
Environment & Version information
clojure-ts-mode version
Emacs version
GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, cairo version 1.16.0) of 2023-08-29
Operating system
Ubuntu 22.04.3