staudenmeir / laravel-adjacency-list

Recursive Laravel Eloquent relationships with CTEs
MIT License
1.34k stars 109 forks source link

[Feature request] Support bloodline (bidirectional) relationships in graph #265

Open mpskovvang opened 4 days ago

mpskovvang commented 4 days ago

The bloodline is a powerful relationship in trees, but unfortunately, it is not implemented in graphs.

If possible, this feature would enable developers to query interconnected nodes more effectively and broaden the potential use cases for this package.

Suggested Implementation

I have minimal experience with recursive queries, but I believe the following SQL query could serve as a starting point for implementing this feature in a graph context:

WITH RECURSIVE laravel_cte AS (
    -- Anchor query: Start from node
    SELECT 
        source_id AS id, 
        target_id, 
        0 AS depth, 
        ARRAY[source_id, target_id] AS path
    FROM edges
    WHERE source_id = ? OR target_id = ?

    UNION ALL

    -- Recursive query: Continue tracing both directions while avoiding revisits
    SELECT 
        CASE WHEN edges.source_id = laravel_cte.id THEN edges.target_id ELSE edges.source_id END AS id,
        CASE WHEN edges.source_id = laravel_cte.id THEN edges.target_id ELSE edges.source_id END AS target_id,
        laravel_cte.depth + 1 AS depth, 
        laravel_cte.path || CASE WHEN edges.source_id = laravel_cte.id THEN edges.target_id ELSE edges.source_id END AS path
    FROM edges
    JOIN laravel_cte ON (edges.source_id = laravel_cte.id OR edges.target_id = laravel_cte.id)
    WHERE NOT CASE WHEN edges.source_id = laravel_cte.id THEN edges.target_id ELSE edges.source_id END = ANY(laravel_cte.path)
)

-- Final selection: Return distinct node IDs connected to 58
SELECT DISTINCT id
FROM laravel_cte;

However, I'm a bit concerned that the use of OR and CASE conditions within the recursive query may lead to performance bottlenecks, especially with large datasets

mpskovvang commented 4 days ago

I rewrote the SQL query based on the existing ancestors HasManyDeep relationship. The next step is testing on a larger dataset to qualify the best approach.

WITH RECURSIVE "laravel_cte" AS (
    (
        -- Selecting the target relationships
        SELECT 
            "nodes".id, 
            -1 AS "depth", 
            ARRAY["nodes"."id"] AS "path", 
            "edges"."source_id" AS "pivot_source_id", 
            "edges"."target_id" AS "pivot_target_id"
        FROM "nodes"
        INNER JOIN "edges" 
            ON "nodes"."id" = "edges"."source_id"
        WHERE "edges"."target_id" = ?
    )
    UNION ALL
    (
        -- Selecting the source relationships
        SELECT 
            "nodes".id, 
            -1 AS "depth", 
            ARRAY["nodes"."id"] AS "path", 
            "edges"."source_id" AS "pivot_source_id", 
            "edges"."target_id" AS "pivot_target_id"
        FROM "nodes"
        INNER JOIN "edges" 
            ON "nodes"."id" = "edges"."target_id"
        WHERE "edges"."source_id" = ?
    )
    UNION ALL
    (
        -- Recursive query: Continue tracing the connections and avoid cycles
        SELECT 
            "nodes".id, 
            "depth" - 1 AS "depth", 
            "path" || "nodes"."id" AS "path", 
            "edges"."source_id" AS "pivot_source_id", 
            "edges"."target_id" AS "pivot_target_id"
        FROM "nodes"
        INNER JOIN "edges" 
            ON "nodes"."id" = "edges"."source_id" OR "nodes"."id" = "edges"."target_id"
        INNER JOIN "laravel_cte" 
            ON "laravel_cte"."id" = "edges"."target_id" OR "laravel_cte"."id" = "edges"."source_id"
        WHERE NOT ("nodes"."id" = ANY("path"))
    )
)
-- Final selection: Join the recursive results with the nodes and relations
SELECT DISTINCT ON ("nodes"."id") *
FROM "nodes"
INNER JOIN "edges" 
    ON "edges"."target_id" = "nodes"."id" OR "edges"."source_id" = "nodes"."id"
INNER JOIN "laravel_cte" 
    ON "laravel_cte"."id" = "edges"."source_id" OR "laravel_cte"."id" = "edges"."target_id";
staudenmeir commented 3 days ago

Hi @mpskovvang, I looked into this topic when I wrote the graph feature and there was a reason why I didn't include it, but I don't remember the details. I'll see what I can find/remember.

Do you have a use case for this in your application?

mpskovvang commented 3 days ago

Thanks @staudenmeir

The use case is multilingual recipes connected by localized keywords.

Each recipe has a few main keywords in the same language as the recipe. The keywords are translated into other languages and connected.

Let's say the main keyword of recipe A is 55 and the direct translations of that keyword are 56-60.

The main keyword of recipe B is 58, and the direct translations of that keyword are 56, 57, 59, and 60 + a new 81 instead of 55.

I want to fetch all related keywords to 58 in both directions (55-60 + 81).

Unavngivet diagram drawio (3)

Potentially, nodes 56, 57, 59, and 60 could have other connections that are also related to 58.

I've done some more query testing, but haven't found a way to improve or simplify further.

The cycle detection works all right, but as a developer, I guess you have to pay attention to duplicates in the result because nodes are connected in several ways (multiple/different paths).

mpskovvang commented 3 days ago

Although the diagram in my previous comment is possible and a valid case as well, my testing dataset looks like this:

Unavngivet diagram drawio (4)

58 <- 55 -> 56 -> 81