paleobot / pbot-api

MIT License
2 stars 0 forks source link

UNWINDs throughout could be Hogan's goat #10

Closed NoisyFlowers closed 2 years ago

NoisyFlowers commented 2 years ago

While developing Group maintenance logic, I created this cypher for creation:

                MATCH 
                    (ePerson:Person {pbotID: $data.enteredByPersonID})
                CREATE 
                    (group: Group {
                        pbotID: apoc.create.uuid(),
                        name: $data.name
                    })-[:ENTERED_BY {timestamp: datetime(), type:"CREATE"}]->(ePerson)
                WITH group
                UNWIND $data.members AS mID
                    MATCH (member:Person {pbotID: mID}) 
                    CREATE (member)-[:MEMBER_OF]->(group)
                WITH collect(mID) as dummy, group
                UNWIND $data.items as iID
                    MATCH (n) WHERE n.pbotID = iID
                    CREATE (n)-[:ITEM_OF]->(group)
                WITH collect(iID)as dummy, group
                RETURN group

I set up data parameters thus:

:params {data: {name: "Grp-ddm-01-21c", members:["01cde18f-43a4-4337-af80-e0aa84459ca0", "b834c9f2-894f-40a4-8447-661a95c21f73"], enteredByPersonID: "01cde18f-43a4-4337-af80-e0aa84459ca0"}}

Note that there are no items.

Running this cypher creates the Group node just fine, but returns nothing. Why?

Because the UNWIND for data.items has nothing to iterate and the final RETURN group is scoped in that UNWIND. There is no way to break out of UNWIND scope and return to a higher level in cypher. The only solution is to game the system and force UNWIND to return a record. Here is a wordy but reasonable approach:

                MATCH 
                    (ePerson:Person {pbotID: $data.enteredByPersonID})
                CREATE 
                    (group: Group {
                        pbotID: apoc.create.uuid(),
                        name: $data.name
                    })-[:ENTERED_BY {timestamp: datetime(), type:"CREATE"}]->(ePerson)
                WITH group
                UNWIND $data.members AS mID
                    MATCH (member:Person {pbotID: mID}) 
                    CREATE (member)-[:MEMBER_OF]->(group)
                WITH collect(mID) as dummy, group
                UNWIND (CASE coalesce($data.items,0) WHEN 0 THEN [null] ELSE $data.items END) as iID
                    MATCH (n) WHERE n.pbotID = iID
                    CREATE (n)-[:ITEM_OF]->(group)
                WITH collect(iID)as dummy, group
                RETURN group

Using coalesce handles the case of both an empty array and a null.

Running this creates another Group node but still returns nothing. Why?

Because the MATCH inside the the UNWIND finds no node with pbotID = null. We could try an OPTIONAL MATCH, but this causes the whole statement to error out because the CREATE fails.

So, I came up with this monstrosity:

                MATCH 
                    (ePerson:Person {pbotID: $data.enteredByPersonID})
                CREATE 
                    (group: Group {
                        pbotID: apoc.create.uuid(),
                        name: $data.name
                    })-[:ENTERED_BY {timestamp: datetime(), type:"CREATE"}]->(ePerson)
                WITH group
                UNWIND $data.members AS mID
                    MATCH (member:Person {pbotID: mID}) 
                    CREATE (member)-[:MEMBER_OF]->(group)
                WITH collect(mID) as dummy, group
                UNWIND (CASE coalesce($data.items,0) WHEN 0 THEN [null] ELSE $data.items END) AS iID
                    CALL
                        apoc.do.when(
                            iID IS NULL,
                            "RETURN group",
                            "MATCH (n) WHERE n.pbotID = iID CREATE (n)-[:ITEM_OF]->(group) RETURN group",
                            {iID:iID, group:group}
                        ) YIELD value
                WITH collect(iID)as dummy, group
                RETURN group

This works. A Group node is created and returned. But, sheeesh.

Also, note that if there are no members in the data parameters, everything after that UNWIND would fail to execute. This means if there were items but no members, they would not be added at all.

But here's the kicker: this problem exists everywhere we use an UNWIND on something that is optional. This is pretty much every UNWIND, and there are many.

Sheeesh.

NoisyFlowers commented 2 years ago

Commit 65d753bec768240174840ff75fb1f320cc8c7e10 for Group node maintenance is the model for how we should address this problem. Any UNWIND that can be (i.e. any that contains only an updating command, no MATCHes) should be replaced with FOREACH. Any that can't be replaced must use the coalesce and apoc.do.when approach displayed above and in the commit.

I'll need to test each cypher command after these changes are made. It'll be a little sweaty.

NoisyFlowers commented 2 years ago

This is complete with commits 65d753bec768240174840ff75fb1f320cc8c7e10 through 821a214bb0c2973163daae3ce8eda0ee85e784fb.

Since there was so much cypher to rewrite and test, we decided to take it as an opportunity to move create and update routines into Resolvers.js.

Because we are now building the cypher with javascript, we can move some of the logic into that environment and avoid the need for the coalesce and apoc.do.when described above. Instead, we have this:

    queryStr = relationships.reduce((str, relationship) => {
        if (data[relationship.graphqlName] && data[relationship.graphqlName].length > 0) {
            return `
                ${str}
                    UNWIND ${JSON.stringify(data[relationship.graphqlName])} AS iD
                        MATCH (remoteNode) WHERE remoteNode.pbotID = iD 
                        CREATE (baseNode)${relationship.direction === "in" ? "<-" : "-"}[:${relationship.type}]${relationship.direction === "in" ? "-" : "->"}(remoteNode)
                    WITH distinct baseNode
            `
        } else {
            if (relationship.required) {
                throw new ValidationError(`Missing required relationship ${relationship.graphqlName}`);
            } else {
                return str;
            }
        }
    }, queryStr);

Because we can check the existence of the relationships in javascript, we end up with simpler cypher that looks very much like what we started with above.

We can also throw an error if a required relationship is missing. Bonus!