Supporting different schemas around getting child nodes

LeaVerou commented 6 months ago

Problem

In #16 we converted our hardcoded { node, property, index } parent pointer structure to a more flexible { node, path[] } schema. However, we did not change our internals much to really support arbitrary paths. We still assume children are found by:

- If no getChildProperties() is specified, we follow all properties on a node, and call isNode() on them to see if they are child nodes. This only goes 1-2 levels deep: values that are obtained by following one property, or array values if that one property is an array.
- If getChildProperties() is specified, we follow these properties and do not call isNode(); we just filter by existence.
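
To make the two branches concrete, here is a minimal sketch of the lookup as described above. It is not Treecle's actual source: `getChildrenCurrent()` and the `isNode()` stand-in are hypothetical names, and "filter by existence" is assumed to mean dropping `null` and `undefined`.

```javascript
// Hypothetical stand-in for the user-supplied isNode() setting (assumption).
const isNode = (value) =>
	typeof value === "object" && value !== null && typeof value.type === "string";

// Sketch of the child lookup described above; not Treecle's actual implementation.
function getChildrenCurrent(node, getChildProperties) {
	if (getChildProperties) {
		// Explicit properties: follow them, skip isNode(), keep whatever exists.
		return getChildProperties(node)
			.flatMap((property) => node[property])
			.filter((child) => child !== undefined && child !== null);
	}

	// No properties given: follow every property (and array items), keep only nodes.
	return Object.values(node).flat().filter(isNode);
}
```

Both branches stop after one property plus one level of array items, which is the "1-2 levels deep" limit mentioned above.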

Currently, this assumes a specific structure that may be overfit to Treecle’s AST beginnings. I suspect in the wild there are two common ways to represent tree data structures:

1. Follow specific properties that always point to children (the AST case)
2. Follow a property that always points to children (children for Mavo Nodes, childNodes for DOM Nodes)

Currently, the API only supports providing a function that takes a node and returns a list of properties. As an example, this is how this setting is specified in vastly:

export const properties = {
    CallExpression: ["arguments", "callee"],
    BinaryExpression: ["left", "right"],
    UnaryExpression: ["argument"],
    ArrayExpression: ["elements"],
    ConditionalExpression: ["test", "consequent", "alternate"],
    MemberExpression: ["object", "property"],
    Compound: ["body"],
};

defaults.getChildProperties = (node) => {
    return properties[node.type] ?? [];
};

This appears to be overfit to 1. Yet, I suspect 2 may even be more common. How do we allow both to be specified without making either more complicated due to the existence of the other?

Ideas

`getChildProperties()` to `getChildPaths()`, handle both `string[][]` and `string[]`?

We don't want to complicate 1 to cater to 2, but what if we could do both? If the function returns an array of strings, they are single properties. If it returns an array of arrays, they are paths.
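
A rough sketch of how the two return shapes could be normalized internally, assuming a hypothetical `normalizePaths()` helper (the name is not from the codebase):

```javascript
// Normalize a getChildPaths()-style return value into string[][].
// A bare string is treated as a single-segment path.
function normalizePaths(result) {
	return result.map((entry) => (Array.isArray(entry) ? entry : [entry]));
}

normalizePaths(["left", "right"]);       // [["left"], ["right"]]
normalizePaths([["children", "key1"]]);  // [["children", "key1"]]
```

With this, case 1 keeps returning plain strings and only the internals deal with paths.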

The problem is, we don’t necessarily have specific child properties in 2. Often, once you get from the node to its children, everything in that data structure is a child.

Wildcards? JSON Paths?

Basically, we want a way to say children/* for these cases. What if we handle / and * in properties specially?

But then we’re basically creating a path microsyntax, and restricting the potential syntax of actual real properties accordingly. OTOH, that's basically JSON Path syntax, which is quite well established.

The advantage of something like this is that we can still handle properties like Vastly’s in exactly the same way.
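
For illustration only, such a microsyntax could be parsed roughly like this. `parsePath()` is a hypothetical name, and the last line shows the restriction on real property names mentioned above.

```javascript
// Hypothetical microsyntax parser: "/" separates segments, "*" stays as a wildcard segment.
function parsePath(path) {
	return path.split("/");
}

parsePath("left");        // ["left"], a plain property still works unchanged
parsePath("children/*");  // ["children", "*"]
parsePath("a/b");         // ["a", "b"], so a real property named "a/b" can no longer be addressed
```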

Not a huge fan of any of these ideas, so I’ll continue brainstorming.

adamjanicki2 commented 6 months ago

This could also be a variant of option 1 where we accept either a flat array, e.g. ["left", "right"], or a nested array, e.g. [["children", "key1"], ["children", "key2"]]. We could also allow a wildcard key to signal "check all properties in this object": for example, [["children", "*"]] would mean that all subproperties of children are children.

LeaVerou commented 6 months ago

> This could also be a variant of option 1 where we accept either a flat array, e.g. ["left", "right"], or a nested array, e.g. [["children", "key1"], ["children", "key2"]]. We could also allow a wildcard key to signal "check all properties in this object": for example, [["children", "*"]] would mean that all subproperties of children are children.

Yeah, the more I think about it, the more I like this idea. For convenience, we should also support arrays as the value, not just functions.

E.g. the Mavo.Node use case would be:

childPaths: [["children", "*"]]

the DOM Node use case would be:

childPaths: [["childNodes", "*"]]

While having a nested array with a single element is a bit unwieldy, it's very explicit, and in most cases there's only one.
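
A minimal sketch of how a `childPaths` setting like the ones above might be resolved, accepting either an array or a function. `resolvePath()` and `getChildren()` are hypothetical names, not the eventual API, and the final `.flat()` is an assumption that keeps the existing "array values count as children" behavior. It has only been reasoned through for plain objects and arrays, not for DOM `NodeList`s.

```javascript
// Follow one path from a node; "*" expands to every value of the current object or array.
function resolvePath(node, path) {
	let current = [node];
	for (const key of path) {
		current = current.flatMap((value) => {
			if (value === null || typeof value !== "object") return [];
			return key === "*" ? Object.values(value) : [value[key]];
		});
	}
	return current.filter((value) => value !== undefined && value !== null);
}

// childPaths may be an array of paths or a function returning one (assumption).
function getChildren(node, childPaths) {
	const paths = typeof childPaths === "function" ? childPaths(node) : childPaths;
	return paths
		.flatMap((path) => resolvePath(node, Array.isArray(path) ? path : [path]))
		.flat(); // a path ending at an array yields its items
}

const mavoLikeNode = { children: [{ id: 1 }, { id: 2 }] };
getChildren(mavoLikeNode, [["children", "*"]]); // [{ id: 1 }, { id: 2 }]
```

A function returning plain property names, like the Vastly one, goes through the same code path because each string is wrapped into a single-segment path.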

adamjanicki2 commented 6 months ago

> Yeah, the more I think about it, the more I like this idea. For convenience, we should also support arrays as the value, not just functions.

Yeah, we can definitely do this. I'll start iterating on this and have a PR up today or tomorrow.

LeaVerou commented 6 months ago

Btw I think the function that applies such a path to an object and returns the result is really useful and we should expose it as one of our helpers rather than having it as an internal util.

adamjanicki2 commented 6 months ago

> Btw I think the function that applies such a path to an object and returns the result is really useful and we should expose it as one of our helpers rather than having it as an internal util.

Yeah, I was planning on adding that along with a find path function.
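
As a rough idea of what that pair of helpers might look like (the names `getByPath()` and `findPath()` and their signatures are assumptions, not the API that shipped), here is a sketch limited to plain keys without wildcards:

```javascript
// Apply a path of plain keys to an object; returns undefined if any step is missing.
function getByPath(object, path) {
	return path.reduce((value, key) => value?.[key], object);
}

// Depth-first search for `target` inside `root`.
// Returns the path of keys leading to it, or null if it is not found.
// `seen` guards against cycles such as parent pointers.
function findPath(root, target, seen = new WeakSet()) {
	if (root === target) return [];
	if (root === null || typeof root !== "object" || seen.has(root)) return null;
	seen.add(root);

	for (const [key, value] of Object.entries(root)) {
		const rest = findPath(value, target, seen);
		if (rest) return [key, ...rest];
	}
	return null;
}
```

The two are inverses on a tree: `getByPath(root, findPath(root, node))` gives back `node`. Array indices come out as strings (e.g. `["arguments", "1"]`) because the sketch relies on `Object.entries()`.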
