jsonata-js / jsonata

JSONata query and transformation language - http://jsonata.org

Navigating / Flattening Recursive Hierarchies #324

Open fdavies93 opened 5 years ago

fdavies93 commented 5 years ago

I've used JSONata for a little while and am now considering how to apply it to more sophisticated problems. One such problem is handling JSON documents in which data is defined hierarchically and recursively (e.g. representing a tree).

Given a JSON document structured as follows:

{
    "1": {
        "1": {
            "1": {
                "1": {

                }
            }
        }
    },
    "2": {
        "1": {

        },
        "2": {

        }
    },
    "3": {
        "1": {

        }
    }
}

We would like to output a document where each sub-object is represented using data related to its parents, or vice versa (for example, to transform a tree into a directed graph):

{
    "1": {
        "children": ["1-1"]
    },
    "1-1": {
        "children": ["1-1-1"]
    },
    "1-1-1": {
        "children": ["1-1-1-1"]
    },
    "1-1-1-1": {

    },
    "2": {
        "children": ["2-1","2-2"]
    },
    "2-1": {

    },
    "2-2": {

    },
    "3": {
        "children": ["3-1"]
    },
    "3-1": {

    }
}

Is this possible in current versions of JSONata? It isn't trivial with conventional algorithms either, but it is certainly possible by traversing the tree structure of the document and assembling a new document from that traversal. What I'm not sure about is how JSONata would handle traversals which (1) have unlimited depth and (2) depend on their position within the traversal for their output.
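For reference, the traversal described above can be sketched outside JSONata. The following is a minimal Python sketch, not a JSONata solution; the function name `flatten` and the assumption that every key holds a child node (as in the first example) are illustrative only. The current path is passed down the recursion, which is how each node "knows" its position in the traversal.

```python
import json


def flatten(tree, prefix=""):
    """Flatten a nested dict into {path: {"children": [...]}} entries.

    Assumes every key holds a child node (a dict), as in the first example.
    """
    flat = {}
    for key, child in tree.items():
        # The path encodes the node's position, e.g. "1-1-1".
        path = f"{prefix}-{key}" if prefix else key
        child_paths = [f"{path}-{k}" for k in child]
        # Leaf nodes stay empty, matching the desired output.
        flat[path] = {"children": child_paths} if child_paths else {}
        # Recurse into the child, carrying the current path along.
        flat.update(flatten(child, path))
    return flat


doc = {
    "1": {"1": {"1": {"1": {}}}},
    "2": {"1": {}, "2": {}},
    "3": {"1": {}},
}
result = flatten(doc)
print(json.dumps(result, indent=4))
```

Depth here is bounded only by Python's recursion limit, so very deep documents would need an explicit stack instead of recursion.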

While it's not common with APIs, a wide range of software uses nested JSON in this way (the same paradigm as HTML / XML). If you were to serialise an arbitrary JavaScript object, for example, you'd get output very similar to the first case.

andrew-coleman commented 5 years ago

Just to clarify, what would be allowable in the blank spaces that you've left? Any key/value pair whose value is not an object? In other words, no nested objects at all in the output?

fdavies93 commented 5 years ago

The aim would be to take a series of nodes in a tree structure and interpret them as a directed graph, so ideally you'd be able to specify which types of field are flattened and which aren't, probably by using a nested JSONata expression.

So this:

{
    "1": {
        "someOtherField": {},
        "1": {

        }
    }
}

Would become this:

{
    "1": {
        "children": ["1-1"],
        "someOtherField": {}
    },
    "1-1": {

    }
}

I could see a function along the lines of:

$flattenTree(childSelector, input)

$flattenTree(/[0-9]+/,$)

Here the child selector would be a JSONata expression or regular expression identifying which fields are treated as child nodes in the output.

Such a function could also offer the option to represent the resulting flattened tree with child fields, parent fields, or both.
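To make the proposal concrete, here is a rough Python sketch of how such a function might behave. Everything in it is hypothetical: `flatten_tree`, the `with_parent` flag and the `"parent"` field name are illustrative and not part of JSONata. It also assumes the selector must match the whole key and that only object-valued fields count as children; other fields are copied onto the flattened node unchanged.

```python
import re


def flatten_tree(child_selector, node, prefix="", with_parent=False):
    """Hypothetical Python analogue of the proposed $flattenTree.

    Keys that fully match `child_selector` and hold an object are treated
    as child nodes; all other fields stay on the flattened node.
    """
    def is_child(key, value):
        return isinstance(value, dict) and re.fullmatch(child_selector, key) is not None

    flat = {}
    for key, child in node.items():
        if not is_child(key, child):
            continue  # ordinary field: copied by the parent's entry below
        path = f"{prefix}-{key}" if prefix else key
        entry = {}
        kids = [f"{path}-{k}" for k, v in child.items() if is_child(k, v)]
        if kids:
            entry["children"] = kids
        if with_parent and prefix:
            entry["parent"] = prefix  # assumed field name for the parent link
        # Keep fields that the selector does not treat as children.
        entry.update({k: v for k, v in child.items() if not is_child(k, v)})
        flat[path] = entry
        flat.update(flatten_tree(child_selector, child, path, with_parent))
    return flat


doc = {"1": {"someOtherField": {}, "1": {}}}
children_only = flatten_tree(r"[0-9]+", doc)
both = flatten_tree(r"[0-9]+", doc, with_parent=True)
print(children_only)
print(both)
```

With the selector `[0-9]+`, `someOtherField` stays on node `"1"` while the numeric key becomes the separate node `"1-1"`, as in the example above. Fields at the root that do not match the selector are dropped in this sketch.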

Edit: The typical use case would be to take deeply nested data and break it into a number of relatively flat documents for storage in a database.