Open Xophmeister opened 1 year ago
It is really interesting. In a sense, we need to tag a recurring pattern (something like “an identifier followed by a comma; repeated”) and tag parts of it as being keys to be sorted on (the identifier in our example). I don't understand the query language well enough to know how feasible this sort of idea is.
I decided to write a proof of concept for sorting nodes. My working example is sorting key/value pairs in JSON objects, in alphabetical order of their keys. This is the kind of query I'm thinking about:
(object
(pair
. (string) @sort_key
) @begin_scope
.
"," @end_scope
(#scope_id! "sort_item")
) @sort_asc
It creates a scope called sort_item around all consecutive (pair) . "," in object nodes.
In each sort_item scope, there is one node tagged with sort_key, whose textual content will be used as a sorting key. Here I chose the first string of the pair (i.e. the key).
Each object is tagged with sort_asc.
The semantics I want to enforce are the following: for each node tagged with sort_asc (resp. sort_desc), extract all scopes named sort_item in its children, then sort those scopes in ascending (resp. descending) order of their sort_key, while preserving the order of nodes inside the scopes.
Even if I succeed in implementing this, I'm not quite satisfied with the fact that I hardcode sort_item as a special scope_id value, but all the alternatives I considered are worse.
Do you have opinions or suggestions on the subject?
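The intended semantics can be sketched in a few lines of Python. This is only an illustration, not Topiary code: it assumes the sort_item scopes have already been extracted as lists of leaf strings, and SortItem and sort_scopes are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SortItem:
    """One hypothetical sort_item scope: its leaves and the text of its sort_key node."""
    sort_key: str
    leaves: list[str]


def sort_scopes(items: list[SortItem], descending: bool = False) -> list[SortItem]:
    # sorted() is stable, so scopes with equal keys keep their source order.
    # The leaves inside each scope are never touched, only whole scopes move.
    return sorted(items, key=lambda item: item.sort_key, reverse=descending)


items = [
    SortItem('"b"', ['"b"', ":", "2", ","]),
    SortItem('"a"', ['"a"', ":", "1", ","]),
]
ordered = sort_scopes(items)
# Flatten the reordered scopes back into a single leaf sequence.
flat = [leaf for item in ordered for leaf in item.leaves]
```

Passing descending=True would correspond to sort_desc in the query above.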
In your JSON example, the last item in the object will not be considered, because it necessarily doesn't have a "," delimiter and the query won't match.
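One possible way around this, sketched in Python under simplifying assumptions: treat the delimiters as separators rather than as part of each item, sort the items alone, then put a delimiter back between each pair of neighbours. The function sort_delimited is hypothetical, and each item is reduced to a single string that is compared as a whole.

```python
def sort_delimited(children, delimiter=","):
    """Sort the non-delimiter children, then re-insert delimiters between them."""
    items = [child for child in children if child != delimiter]
    result = []
    for index, item in enumerate(sorted(items)):
        if index > 0:
            # Delimiters go between items only, so the last item never gets one.
            result.append(delimiter)
        result.append(item)
    return result


# The last pair has no trailing comma, as in any valid JSON object.
children = ['"b": 2', ",", '"a": 1']
reordered = sort_delimited(children)
```

With this approach the final item takes part in the sort even though it is not followed by a delimiter in the source.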
While I'm satisfied with what the reordering queries should look like (accounting for @Xophmeister's remark above), I think the actual implementation would be too tricky at the moment.
We are a bit too eager in collecting the leaves of the syntax tree and putting them into a flat vector. If we want to implement sorting predicates right now, we would need to preprocess the Atom vector in a way that would basically recreate the syntax tree.
I think a cleverer way to do it would be to keep the syntax tree as it is, apply all queries (including appending, prepending, and sorting) to the tree itself, then flatten it after all the work is done.
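To make the "transform the tree first, flatten last" idea concrete, here is a small Python sketch. It is not how Topiary is structured: Node, sort_children and flatten are hypothetical stand-ins for the syntax tree, a sorting query, and the final collection of leaves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def sort_children(node, parent_kind, child_kind, key):
    """Sort the child_kind children of every parent_kind node, in place."""
    for child in node.children:
        sort_children(child, parent_kind, child_kind, key)
    if node.kind == parent_kind:
        # Only the slots holding sortable children are permuted, so braces
        # and delimiters stay exactly where they were.
        slots = [i for i, c in enumerate(node.children) if c.kind == child_kind]
        ordered = sorted((node.children[i] for i in slots), key=key)
        for slot, child in zip(slots, ordered):
            node.children[slot] = child


def flatten(node):
    """Collect leaf texts in order, once all tree-level edits are done."""
    if not node.children:
        return [node.text]
    return [leaf for child in node.children for leaf in flatten(child)]


def pair(key_text, value_text):
    return Node("pair", children=[
        Node("string", key_text), Node(":", ":"), Node("number", value_text),
    ])


tree = Node("object", children=[
    Node("{", "{"), pair('"b"', "2"), Node(",", ","), pair('"a"', "1"), Node("}", "}"),
])
sort_children(tree, "object", "pair", key=lambda p: p.children[0].text)
leaves = flatten(tree)
```

Because the sort runs while the parent/child structure is still available, nothing has to be reconstructed from a flat vector afterwards.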
However, this needs a large refactoring, and we're not sure yet that there is a real need for topiary to be able to sort code fragments, at least in a way that would be feasible with the proposed queries. Therefore, I suggest we stop our efforts towards sorting queries for the moment.
I'm hoping to use tweag to write a JSONC formatter for my VS Code settings file that:
So I could format something like this:
{
// b comment
"b": "value",
// a comment
"a": "value"
}
Into:
{
// a comment
"a": "value",
// b comment
"b": "value"
}
This would help me keep my various VS Code settings easy to merge with each other.
Is this a good issue to follow for info on when/if tweag will support something like this?
That's an interesting use case.
The way @nbacquey describes a potential solution (above) could allow you to achieve this, if the sort_item scope were engineered to include the comment. If this feature is implemented, that would probably be enough and it would work for you. That said, such a rule would be too prescriptive for a general JSONC formatter, without changes to the way comments are processed. (I only mention this because that may influence the priority of this issue...)
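As a rough illustration of a sort item that includes its comment, the Python sketch below groups each run of // comment lines with the pair that follows it and sorts the groups by key. It works on lines rather than on a syntax tree, and assumes one pair per line and no ":" inside keys; sort_commented_pairs is a hypothetical helper, not part of Topiary.

```python
def sort_commented_pairs(body_lines):
    """Sort pairs by key, moving each pair together with its leading comments."""
    groups, pending = [], []
    for line in body_lines:
        text = line.strip()
        if text.startswith("//"):
            pending.append(text)
        elif text:
            # Drop the delimiter for now; it is re-added after sorting.
            groups.append((pending, text.rstrip(",")))
            pending = []
    groups.sort(key=lambda group: group[1].split(":", 1)[0])
    result = []
    for index, (comments, pair) in enumerate(groups):
        result.extend(comments)
        separator = "," if index < len(groups) - 1 else ""
        result.append(pair + separator)
    # Comments with no pair after them stay at the end of the object body.
    return result + pending


body = ['// b comment', '"b": "value",', '// a comment', '"a": "value"']
formatted = sort_commented_pairs(body)
```

Attaching a comment to the pair below it is exactly the prescriptive choice mentioned above: a comment that documents the preceding pair, or a whole section, would be moved with the wrong item.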
Is your feature request related to a problem? Please describe. It's common for groups of sibling nodes to be expected in a canonical order. For example:
It would be useful if Topiary could support node reordering to support this. (This may not be realistic; see additional context, below.)
Describe the solution you'd like Perhaps ordering capture names, like @asc and @desc, which can be applied to parent nodes to act on their children. Then order constraint captures, like @first or @last, etc., which can be applied to specific child patterns.
Additional context This is probably out of scope for Topiary, as -- at the very least -- it depends on the grammar behaving properly. For example, anecdotally delimiters in such groups are often anonymous sibling nodes, but not necessarily. It would be hard to generalise this.
Groups of related nodes may not even come under a suitable parent node. Imports, for example, are often top-level statements. (Maybe scopes could solve this.)
Ordering naively might also change the semantics. For example, non-fully qualified pattern matches often need to have their order preserved for the appropriate behaviour; it would thus be up to the grammar to determine full qualification of all children, which again seems intractable.
Various other nuances (e.g., collation, sort order, constraint solving)...
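For what it's worth, here is one possible reading of how the ordering captures and the constraint captures could combine, as a Python sketch. The issue only names @asc, @desc, @first and @last, so the semantics here are an assumption: pinned children keep their source order at either end, and everything else is sorted in between. The order_children function and the pins mapping are hypothetical.

```python
def order_children(children, pins, descending=False):
    """Sort children, keeping those pinned "first" or "last" at the ends.

    pins maps a child's text to "first" or "last"; unpinned children are sorted.
    """
    first = [c for c in children if pins.get(c) == "first"]
    last = [c for c in children if pins.get(c) == "last"]
    middle = sorted(
        (c for c in children if pins.get(c) not in ("first", "last")),
        reverse=descending,
    )
    return first + middle + last


imports = ["os", "__future__", "sys", "abc"]
ordered_imports = order_children(imports, {"__future__": "first"})
```

This uses plain string comparison, so it sidesteps the collation and semantic-preservation concerns listed above rather than solving them.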