Rearchitect the LabelBasedGraph / Disambiguated Graph

The LabelBasedGraph is becoming further integrated into arches-core and various arches projects; this increased use has resulted in a direct need for a more descriptive data structure, additional data returned, and reduced database queries. (We should also consolidate around a single name.)

Specifically, the proposed changes are:

Add an @type key (or equivalent name) to each node-object returned by the graph. This self-describes the shape of the node-object and takes one of the following values (or equivalent names): node, nodeset, resource_instance.
A nodeset node-object's @value key should always be an array, and the nodeset itself should not contain an @tile_id key.
Add @displaydescription, @displayname, @graph_id, @legacyid, @map_popup, and @resourceinstanceid to the resource_instance type.
The @value key should only exist if the node is data-collecting.
If a parent node exists, all child nodes should have a node-object even if they do not contain data. Currently this logic is based on tile presence/absence.
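To make the nodeset rules above concrete, here is a minimal Python sketch of a checker for a single node-object. The function name `check_node_object` and the wording of the messages are hypothetical; nothing like this exists in arches-core as far as this issue states, and the @tile_id spelling follows the example below.

```python
from typing import Any

# The three proposed @type values (names may still change).
NODE_TYPES = {"node", "nodeset", "resource_instance"}


def check_node_object(obj: dict[str, Any]) -> list[str]:
    """Return the rule violations for one node-object (empty list if none)."""
    problems = []
    node_type = obj.get("@type")
    if node_type not in NODE_TYPES:
        problems.append(f"unknown @type: {node_type!r}")
    if node_type == "nodeset":
        # A nodeset always wraps its entries in an array ...
        if not isinstance(obj.get("@value"), list):
            problems.append("nodeset @value must be an array")
        # ... and tile ids live on the entries, not on the nodeset itself.
        if "@tile_id" in obj:
            problems.append("nodeset must not carry @tile_id")
    return problems


nicknames = {
    "@type": "nodeset",
    "@node_id": "77777-7777-7777777",
    "@value": [{"@type": "node", "@tile_id": "99999-99999-999999", "@value": "Bobert"}],
}
bad_nodeset = {"@type": "nodeset", "@tile_id": "t", "@value": "Bobert"}

print(check_node_object(nicknames))
print(check_node_object(bad_nodeset))
```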

Consider a simplified Person resource:

    {
        "@displaydescription": "DISPLAY DESCRIPTION",
        "@displayname": "DISPLAY NAME",
        "@graph_id": "1111111-11111-1111111",
        "@legacyid": null,
        "@map_popup": null,
        "@resourceinstanceid": "2222222-22222-222222222",
        "@type": "resource_instance",
        "Name": {
            "@type": "node",
            "@node_id": "333333-3333-333333333",
            "@tile_id": "444444-4444-444444444",
            "First Name": {
                "@type": "node",
                "@node_id": "55555-55555-5555555",
                "@tile_id": "66666-66666-6666666",
                "@value": "Bob"
            },
            "Last Name": null
        },
        "Nicknames": {
            "@type": "nodeset",
            "@node_id": "77777-7777-7777777",
            "@value": [
                {
                    "@type": "node",
                    "@node_id": "77777-7777-7777777",
                    "@tile_id": "99999-99999-999999",
                    "@value": "Bobert"
                }
            ]
        },
        "Knacknames": [],
        "Eye Color": null,
        "Pets": {
            "@type": "nodeset",
            "@node_id": "AAAAA-AAAA-AAAAAAA",
            "@value": [
                {
                    "@type": "node",
                    "@node_id": "AAAAA-AAAA-AAAAAAA",
                    "@tile_id": "CCCCC-CCCCC-CCCCCC",
                    "@value": "Bark Ruffalo"
                },
                {
                    "@type": "node",
                    "@node_id": "AAAAA-AAAA-AAAAAAA",
                    "@tile_id": "DDDDD-DDDDD-DDDDDD",
                    "@value": "Meowry Pawvitch"
                }
            ]
        }
    }

Without having seen the Person graph, we can infer a good deal from this:

The top-level nodes are Name, Nicknames, Knacknames, Eye Color, and Pets.
Name does not collect data
Name has two direct child nodes: First Name and Last Name
Last Name was not instantiated
Nicknames collects data and has cardinality n
Knacknames has cardinality n, and was not instantiated
Eye Color has cardinality 1, and was not instantiated
Pets collects data and has cardinality n
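The inferences above can be written as a small classifier that looks only at the shape of each top-level entry. This is a rough Python sketch; `describe` is a hypothetical helper, and the `person` dict is a trimmed copy of the example with the @-prefixed resource keys left out.

```python
from typing import Any


def describe(entry: Any) -> str:
    """Classify one entry of the proposed graph by its shape alone."""
    if entry is None:
        return "cardinality 1, not instantiated"
    if isinstance(entry, list):
        # A bare array (empty in the example) marks an uninstantiated nodeset.
        return "cardinality n, not instantiated"
    if entry.get("@type") == "nodeset":
        return "cardinality n, collects data"
    if "@value" in entry:
        return "collects data"
    return "does not collect data"


person = {
    "Name": {
        "@type": "node",
        "First Name": {"@type": "node", "@value": "Bob"},
        "Last Name": None,
    },
    "Nicknames": {"@type": "nodeset", "@value": [{"@type": "node", "@value": "Bobert"}]},
    "Knacknames": [],
    "Eye Color": None,
}

summary = {label: describe(entry) for label, entry in person.items()}
for label, text in summary.items():
    print(f"{label}: {text}")
```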

However, some questions remain:

Should resource_instance keys use underscore-delimited names, or stay in line with the previous versions (e.g. @display_description vs @displaydescription)?
Should we incorporate an @nodes key, or leave child nodes as top-level keys of node names?
If a node is not instantiated, we do not know its node_id. Should we change the shape to accommodate that?
@type is derived from cardinality. Should we not use @type?
Should we include @datatype? Doing so could give us a unique solution for handling related resources in the future.
Should we include @cardinality?
In the nodeset type, there is duplication of @node_id. Is that acceptable?
The status quo value of map_popup is "undefined", and that's the only place where such a value exists in the LabelBasedGraph. Should we continue using only null, or should we also include undefined? The logic to differentiate would be the same as in JavaScript, loosely: undefined = was not instantiated, null = saved without a value.

Personally, I feel that if we tweak the proposed pattern a little, some of these questions will answer themselves. The above Person LabelBasedGraph is roughly what fell out of committee; below is my personal spin on the proposed pattern:

    {
        "@display_description": "DISPLAY DESCRIPTION",
        "@display_name": "DISPLAY NAME",
        "@graph_id": "1111111-11111-1111111",
        "@legacy_id": undefined,
        "@map_popup": null,
        "@resource_instance_id": "2222222-22222-222222222",
        "@datatype": "resource_instance",
        "@nodes": [
            {
                "@display_name": "Name",
                "@datatype": "semantic",
                "@cardinality": "n",
                "@node_id": "333333-3333-333333333",
                "@tile_id": "444444-4444-444444444",
                "@direct_children": [
                    {
                        "@display_name": "First Name",
                        "@datatype": "string",
                        "@node_id": "55555-55555-5555555",
                        "@tile_id": "66666-66666-6666666",
                        "@value": "Bob",
                    },
                    {
                        "@display_name": "Last Name",
                        "@datatype": "string",
                        "@node_id": "55555-55555-5555555",
                        "@value": undefined,
                    }
                ]
            },
            {
                "@display_name": "Nicknames",
                "@datatype": "string",
                "@cardinality": "n",
                "@node_id": "77777-7777-7777777",
                "@value": [
                    {
                        "@tile_id": "99999-99999-999999",
                        "@value": "Bobert"
                    }
                ],
            },
            {
                "@display_name": "Knacknames",
                "@datatype": "string",
                "@cardinality": "n",
                "@node_id": "88888-88888-88888888",
                "@value": undefined
            },
            {
                "@display_name": "Eye Color",
                "@datatype": "string",
                "@cardinality": 1,
                "@node_id": "95499-94599-99945999",
                "@value": undefined
            },
            {
                "@display_name": "Pets",
                "@datatype": "resource_instance_list",
                "@cardinality": "n",
                "@node_id": "AAAAA-AAAA-AAAAAAA",
                "@value": [
                    {
                        "@datatype": "resource_instance",
                        "@cardinality": 1,
                        "@tile_id": "CCCCC-CCCCC-CCCCCC",
                        "@value": "Bark Ruffalo"
                    },
                    {
                        "@datatype": "resource_instance",
                        "@cardinality": 1,
                        "@tile_id": "DDDDD-DDDDD-DDDDDD",
                        "@value": "Meowry Pawvitch"
                    }
                ],
            },
        ]
    }

With the above pattern, we get the following wins:

Adding the @nodes and @display_name keys makes every key explicit. Using node names as keys in the uncompacted graph can make iteration tricky. We could also drop the @ prefix.
There is no @type key. Instead, the shape of @value is derived from @datatype + @cardinality.
Semantic nodes do not have an @value key.
@value can be null or undefined, which helps distinguish uninstantiated nodes from instantiated nodes with no value.
@node_id is not duplicated into the @value entries of cardinality-n nodes.
Including @datatype allows for later graph expansion (e.g. showing all related resource graphs).
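As an illustration of the first two points, here is a short Python sketch of how a consumer might iterate the @nodes list and derive the shape of @value from @datatype + @cardinality. Both `walk` and `value_shape` are hypothetical helpers, and treating a missing @cardinality as 1 is an assumption (First Name in the example carries no @cardinality key).

```python
from typing import Any, Iterator


def walk(nodes: list[dict[str, Any]], path: tuple[str, ...] = ()) -> Iterator[tuple[tuple[str, ...], dict[str, Any]]]:
    """Yield (path of display names, node) for every node, depth first."""
    for node in nodes:
        here = path + (node["@display_name"],)
        yield here, node
        yield from walk(node.get("@direct_children", []), here)


def value_shape(node: dict[str, Any]) -> str:
    """Derive the expected shape of @value without an @type key."""
    if node["@datatype"] == "semantic":
        return "none"  # semantic nodes do not collect data
    # Assumption: a missing @cardinality means cardinality 1.
    return "list" if node.get("@cardinality") == "n" else "scalar"


nodes = [
    {
        "@display_name": "Name",
        "@datatype": "semantic",
        "@cardinality": "n",
        "@direct_children": [
            {"@display_name": "First Name", "@datatype": "string", "@value": "Bob"},
        ],
    },
    {
        "@display_name": "Nicknames",
        "@datatype": "string",
        "@cardinality": "n",
        "@value": [{"@tile_id": "99999-99999-999999", "@value": "Bobert"}],
    },
]

shapes = {" / ".join(path): value_shape(node) for path, node in walk(nodes)}
for label, shape in shapes.items():
    print(f"{label}: {shape}")
```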

However, some issues still remain:

Uninstantiated nodes (e.g. Last Name) can look instantiated.
undefined only makes sense for JS consumers. Is there a more reliable way to show instantiation?
What should the rules for @nodes be? Should it only exist when there are children?
Should uninstantiated cardinality-n nodes (e.g. Knacknames) be undefined or []?
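On the undefined question: JSON has no undefined, and JavaScript's `JSON.stringify` drops object keys whose value is undefined, so a non-JS consumer can only see "key absent" versus "key present and null". One possible convention, sketched below in Python, is to read an absent @value on a data-collecting node as "not instantiated". This is only an assumption to illustrate the trade-off, not a decision; `value_state` is a hypothetical helper, and it has to consult @datatype because semantic nodes also lack @value.

```python
import json
from typing import Any


def value_state(node: dict[str, Any]) -> str:
    """Tell apart the three states using key presence instead of undefined."""
    if node.get("@datatype") == "semantic":
        return "does not collect data"
    if "@value" not in node:
        return "not instantiated"
    if node["@value"] is None:
        return "saved without a value"
    return "has a value"


# What a consumer would receive once undefined keys have been dropped.
payload = json.loads(
    """
    [
        {"@display_name": "Name", "@datatype": "semantic"},
        {"@display_name": "Last Name", "@datatype": "string"},
        {"@display_name": "Eye Color", "@datatype": "string", "@value": null},
        {"@display_name": "First Name", "@datatype": "string", "@value": "Bob"}
    ]
    """
)

states = {node["@display_name"]: value_state(node) for node in payload}
for label, state in states.items():
    print(f"{label}: {state}")
```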

@chrabyrd it feels like this is a significant change to the structure. The great thing about the original one was that you could follow the structure using the node names as keys, making it much quicker to use in the UI.

The HTML export templates are built around this principle. You could chain through the node names and then retrieve the @display_value...

    {{ resource_data|val_from_key:"System Reference Numbers"|val_from_key:"PrimaryReferenceNumber"|val_from_key:"Primary Reference Number"|val_from_key:"@display_value" }}

The original structure provided that "human readable" structure which felt like the original purpose. v2 added some additional metadata and the more consistent @display_value, which was good but generally maintained the same overall pattern.
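To show the difference in lookup style, here is a rough Python comparison. The `val_from_key` function below is a plain stand-in written for this sketch, not the actual template filter, and `by_display_name` is a hypothetical helper for the list-based shape proposed above.

```python
from functools import reduce
from typing import Any, Optional


def val_from_key(data: Any, key: str) -> Any:
    """Stand-in for the template filter: data[key], or None if unavailable."""
    return data.get(key) if isinstance(data, dict) else None


def by_labels(resource: dict[str, Any], *keys: str) -> Any:
    """Label-keyed shape: chain straight through the node names."""
    return reduce(val_from_key, keys, resource)


def by_display_name(nodes: list[dict[str, Any]], *names: str) -> Optional[dict[str, Any]]:
    """List-based shape: every level needs a scan for a matching @display_name."""
    current = None
    children = nodes
    for name in names:
        current = next((n for n in children if n.get("@display_name") == name), None)
        if current is None:
            return None
        children = current.get("@direct_children", [])
    return current


label_shape = {"Name": {"First Name": {"@value": "Bob"}}}
list_shape = [
    {
        "@display_name": "Name",
        "@direct_children": [{"@display_name": "First Name", "@value": "Bob"}],
    }
]

print(by_labels(label_shape, "Name", "First Name", "@value"))
print(by_display_name(list_shape, "Name", "First Name")["@value"])
```

Both calls reach the same value, but the second shape cannot be traversed with chained key lookups alone, which is the concern for existing templates.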

Adding additional @metadata attributes within the existing structure would be preferable.

Should the cardinality value mix datatypes (the string "n" and the integer 1)?

archesproject / arches

Rearchitect the LabelBasedGraph / Disambiguated Graph #8032