clj-commons / rewrite-clj

Rewrite Clojure code and edn
https://cljdoc.org/d/rewrite-clj
MIT License
590 stars 55 forks source link

consider allowing metadata be be a child #115

Open lread opened 3 years ago

lread commented 3 years ago

Originally raised by as https://github.com/lread/rewrite-cljc-playground/issues/2

From discussions with @sogaiu and @borkdude on #slack

When traversing parsed data, the fact that metadata holds the object it is describing can be awkward.

Example from @sogaiu

(defn ns?
  [zloc]
  (when (rz/list? zloc)
    (when-let [head (rz/down zloc)]
      (let [tag (rz/tag head)
            find-ns-name (fn [start-zloc]
                           (let [rzloc (rz/right start-zloc)
                                 r-tag (rz/tag rzloc)]
                             (cond (= r-tag :token)
                                   rzloc
                                   ;;
                                   (= r-tag :meta)
                                   (-> rzloc
                                     rz/down
                                     rz/right)
                                   ;;
                                   :else
                                   nil)))]
        (cond (= tag :token)
              (when (= (rz/string head) "ns")
                (find-ns-name head))
              ;;
              (= tag :meta)
              (let [target (-> head
                             rz/down
                             rz/right)]
                (when (= (rz/string target) "ns")
                  (find-ns-name head)))
              ;;
              :else
              nil)))))

Could we allow for metadata to be the child of node instead of the parent node while preserving API compatibility?

lread commented 3 years ago

Issue comments from rewrite-cljc-playground:

@borkdude:

@lread In clj-kondo I've opted to include metadata nodes (there can be multiple!) as a :meta key-value on the node itself.

@lread:

As of this writing (3-Apr-2020), folks on Slack are discussing that default rewrite-clj handling of metadata can be awkward.

@borkdude chimed in:

yep, something I found annoying in clj-kondo with rewrite-clj as well. I changed the vendored rewrite-clj implementation

And @sogaiu added:

the current tree-sitter-clojure grammar implementation puts metadata inside the things they are supposed to be attached to because i found working with rewrite-clj cumbersome wrt metadata

And also offered an example from his work on tree-sitter-clojure.

@sogaiu:

Would very much be interested in discussing how certain things might be done in general (whether it's in a tree-sitter grammar or elsewhere).

Hope you don't mind if I jot down some thoughts here. If you're not interested / motivated / don't have time / etc. please feel free to ignore :) I'm also happy to delete this / move it elsewhere.

Anyway, one item that I'm up in the air about is the handling of tagged literals. They are currently specified as:

https://github.com/sogaiu/tree-sitter-clojure/blob/adb8e0ebbf7490f96270aa93af8a1ce7f5e8b4d8/grammar.js#L434_L449

There are a couple things that might be relevant:

1) To get metadata and discard forms to work, they are pervasive. Many rules include them as I didn't figure out how to remove that duplication (current hypothesis is that it's a tree-sitter limitation).

2) Tagged literals are "captured" by creating essentially a sequence of tag names followed by some value (ignoring metadata and discard forms for a moment).

I'm mentioning point 1 as my current impression is that there are technical limitations that constrained what's possible with tree-sitter which don't necessarily apply for rewrite-clj* (This point is mentioned just to give some background.)

The more relevant point is the second one. I think one could have done some kind of nested capture instead -- that seems logically more like how tagged literals are actually handled / processed in Clojure. It seemed like that might just add extra work for anyone using a tree though so I didn't go for it -- but there might be a downside I'm not seeing.

If you have any thoughts about this please share :)

@lread:

Thanks for this @sogaiu, I don't think I'll comment on this soon, but appreciate your thoughts and research being here to review!

@sogaiu:

Thanks for the response and good to hear from you :)

respatialized commented 2 months ago

I am interested in this question; I frequently encountered some challenges with this when working on Adorn. Using defn as an example here:

(require '[rewrite-clj.parser :as parser]
              '[rewrite-clj.node :as node])

(node/coerce 
  '(defn ^{:doc "Custom square root" :pre #(> % 0) :post #(> % 0)}
      [^Number x]
      (Math/sqrt x)))

For both var metadata and function parameters, the resulting symbol - the most semantically significant element - ends up as a child node of the metadata attached to it, which requires rewriting if we want to bring the symbol back up to "top level" for sequential display within the parent node.

I want to also cast my vote in favor of the ability to return something like a :meta key. This would allow the Node to be "top level" without breaking the close link between a node and the metadata associated with it.

I think it may be a more intuitive solution than making metadata a child node - if you have a map or vector with metadata attached to it, how would you distinguish between the child nodes that are the metadata and the child nodes that are the actual enclosed elements of the collection?

lread commented 2 months ago

Even though this is an old issue, it is still an interesting topic! Thanks for piping in, @respatialized!

The significance of metadata when parsing Clojure source depends on what you are doing. For something like a code formatter like cljfmt or zprint, I'm guessing that it is as significant as any other element. For these use cases, the current implementation seems, at least, OK.

But for other use cases, metadata will be noise or overly prominent.

If we included an option to parse metadata under a node :meta key, this would mean it would be skipped during normal node traversal and would have to be explicitly queried via the :meta key. Which, I think you are saying, would be desirable for your use case, and is what borkdude chose to do for his inlined version of rewrite-clj in clj-kondo.