Node reordering - Githubissues

Xophmeister commented 1 year ago

Is your feature request related to a problem? Please describe.

It's common for groups of sibling nodes to be expected in a canonical order. For example:

Imports; alphabetical
Fully qualified pattern matches; alphabetical
Annotations (e.g., in Nickel: type first, default last)

It would be useful if Topiary could reorder nodes to support this. (This may not be realistic; see additional context, below.)

Describe the solution you'd like

Perhaps ordering captures, such as @asc and @desc, which can be applied to parent nodes to act on their children, combined with order-constraint captures, such as @first or @last, which can be applied to specific child patterns.

Additional context

This is probably out of scope for Topiary as, at the very least, it depends on the grammar behaving properly. For example, anecdotally, delimiters in such groups are often anonymous sibling nodes, but not necessarily. It would be hard to generalise this.

Groups of related nodes may not even come under a suitable parent node. Imports, for example, are often top-level statements. (Maybe scopes could solve this.)

Ordering naively might also change the semantics. For example, non-fully qualified pattern matches often need to have their order preserved for the appropriate behaviour; it would thus be up to the grammar to determine full qualification of all children, which again seems intractable.

Various other nuances (e.g., collation, sort order, constraint solving)...

aspiwack commented 1 year ago

It is really interesting. In a sense, we need to tag a recurring pattern (something like “an identifier followed by a comma; repeated”) and tag parts of it as being keys to be sorted on (the identifier in our example). I don't understand the query language well enough to know how feasible this sort of idea is.

nbacquey commented 1 year ago

I decided to write a POC for sorting nodes. My working example is sorting key/value pairs in JSON objects, in alphabetical order of their keys. This is the kind of query I'm thinking about:

(object
  (pair
    . (string) @sort_key
  ) @begin_scope
  .
  "," @end_scope
  (#scope_id! "sort_item")
) @sort_asc

It creates a scope called sort_item around all consecutive (pair) . "," in object nodes. In each sort_item scope, there is one node tagged with sort_key, whose textual content will be used as a sorting key. Here I chose the first string of the pair (i.e. the key). Each object is tagged with sort_asc.

The semantics I want to enforce is the following:

For each node tagged with sort_asc (resp. sort_desc), extract all scopes named sort_item in its children.
Reorder the scopes in ascending (resp. descending) order based on the textual value of the node tagged with sort_key, while preserving the order of nodes inside the scopes.
Reinsert the list of sorted scopes in the parent node, at the position where the first scope was.
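The three steps above can be sketched in Python as follows. This is only an illustration of the proposed semantics, not Topiary code: the `Scope` class and the `reorder` function are hypothetical names, and the parent's children are modelled as a plain list in which each `sort_item` scope has already been grouped.

```python
from dataclasses import dataclass


@dataclass
class Scope:
    """Hypothetical stand-in for one `sort_item` scope."""
    key: str          # text of the node captured as @sort_key
    nodes: list[str]  # text of the nodes inside the scope, in original order


def reorder(children: list, descending: bool = False) -> list:
    """Sort the Scope entries found in `children`, leaving other children alone.

    The sorted scopes are reinserted together, at the position where the
    first scope was. Nodes inside each scope keep their order.
    """
    scopes = [c for c in children if isinstance(c, Scope)]
    if not scopes:
        return list(children)
    ordered = sorted(scopes, key=lambda s: s.key, reverse=descending)
    first = next(i for i, c in enumerate(children) if isinstance(c, Scope))
    rest = [c for c in children if not isinstance(c, Scope)]
    # Everything before `first` is a non-scope child, so rest[:first] is
    # exactly the prefix that preceded the first scope.
    return rest[:first] + ordered + rest[first:]


# Children of a JSON object; the final pair has no trailing "," and is
# assumed here not to belong to any scope, so it stays where it is.
children = [
    "{",
    Scope('"b"', ['"b": 1', ","]),
    Scope('"a"', ['"a": 2', ","]),
    '"c": 3',
    "}",
]
result = reorder(children)
```

Sorting compares the raw text of the key node, as described above, so collation and case handling are whatever plain string comparison gives.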

Even if I succeed in implementing this, I'm not quite satisfied with the fact that I hardcode sort_item as a special scope_id value, but all the alternatives I considered are worse.

Do you have opinions or suggestions on the subject?

Xophmeister commented 1 year ago

In your JSON example, the last item in the object will not be considered: it necessarily has no trailing "," delimiter, so the query won't match it.

nbacquey commented 1 year ago

While I'm satisfied with what the reordering queries should look like (accounting for @Xophmeister's remark above), I think the actual implementation would be too tricky at the moment. We are a bit too eager in collecting the leaves of the syntax tree and putting them into a flat vector. If we wanted to implement sorting predicates right now, we would need to preprocess the Atom vector in a way that would basically recreate the syntax tree.

I think a cleverer way to do it would be to keep the syntax tree as it is, apply all queries (including appending, prepending, and sorting) to the tree itself, then flatten it after all the work is done.
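As a rough sketch of that "transform the tree first, flatten last" idea, assuming a toy tree rather than Topiary's real data structures: the `Node` class, `sort_pairs`, and `flatten` below are hypothetical. Note that this variant places the sorted pairs back into the slots the pairs occupied, which is a simplification and not the scope-based query described earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy syntax-tree node; leaves carry text, inner nodes carry children."""
    kind: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def sort_pairs(node: Node) -> None:
    """Sort the `pair` children of every `object` by the text of their key.

    Works on the tree itself, so each pair moves as a unit and the
    delimiters between pairs stay where they are.
    """
    for child in node.children:
        sort_pairs(child)
    if node.kind == "object":
        pairs = iter(sorted(
            (c for c in node.children if c.kind == "pair"),
            key=lambda p: p.children[0].text,
        ))
        node.children = [
            next(pairs) if c.kind == "pair" else c for c in node.children
        ]


def flatten(node: Node) -> list[str]:
    """Collect leaf text left to right, only after all tree edits are done."""
    if not node.children:
        return [node.text]
    return [leaf for child in node.children for leaf in flatten(child)]


def pair(key: str, value: str) -> Node:
    return Node("pair", children=[
        Node("string", key), Node(":", ":"), Node("value", value),
    ])


tree = Node("object", children=[
    Node("{", "{"), pair('"b"', "1"), Node(",", ","), pair('"a"', "2"), Node("}", "}"),
])
sort_pairs(tree)
leaves = flatten(tree)
```

The point of the sketch is the ordering of the two phases: once the leaves have been collected into a flat list, the pair boundaries that `sort_pairs` relies on are no longer available.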

However, this needs a large refactoring, and we're not yet sure that there is a real need for Topiary to be able to sort code fragments, at least in a way that would be feasible with the proposed queries. Therefore, I suggest we stop our efforts towards sorting queries for the moment.

torhovland commented 1 year ago

bbkane commented 1 year ago

I'm hoping to use tweag to write a JSONC formatter for my VS Code settings file that:

sorts subobjects by key
also drags along the comments that "belong" to each subobject

So I could format something like this:

{
  // b comment
  "b": "value",
  // a comment
  "a": "value"
}

Into:

{
  // a comment
  "a": "value",
  // b comment
  "b": "value"
}

This would help me keep my various VS Code settings files easy to merge with each other.

Is this a good issue to follow for info on when/if tweag will support something like this?

Xophmeister commented 1 year ago

That's an interesting use case.

The way @nbacquey describes a potential solution (above) could allow you to achieve this, if the sort_item scope were engineered to include the comment. If this feature is implemented, that would probably be enough and it would work for you. That said, such a rule would be too prescriptive for a general JSONC formatter, without changes to the way comments are processed. (I only mention this because that may influence the priority of this issue...)
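To make the "scope that includes the comment" idea concrete, here is a deliberately naive Python sketch. It works on text lines, not on a syntax tree, and assumes a flat object body with one entry per line and only `//` line comments. The function name `sort_commented_entries` is hypothetical and nothing here reflects how Topiary processes comments.

```python
def sort_commented_entries(lines: list[str]) -> list[str]:
    """Sort entries by their text, dragging preceding `//` comments along.

    Each group is the run of comment lines directly above an entry plus the
    entry itself. Trailing commas are stripped and re-added afterwards so
    that only the last entry lacks one.
    """
    groups: list[list[str]] = []
    pending: list[str] = []
    for line in lines:
        text = line.strip()
        if text.startswith("//"):
            pending.append(text)
        else:
            groups.append(pending + [text.rstrip(",")])
            pending = []

    groups.sort(key=lambda group: group[-1])

    out: list[str] = []
    for i, group in enumerate(groups):
        comma = "," if i < len(groups) - 1 else ""
        out.extend(group[:-1] + [group[-1] + comma])
    # Comments that are not followed by any entry stay at the end.
    return out + pending


body = [
    "  // b comment",
    '  "b": "value",',
    "  // a comment",
    '  "a": "value"',
]
sorted_body = sort_commented_entries(body)
```

Deciding that a comment "belongs" to the entry below it is exactly the prescriptive rule mentioned above; a comment could equally well refer to the entry before it.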

tweag / topiary

Node reordering #351