AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.
http://ampersandtarski.github.io/
GNU General Public License v3.0
40 stars 8 forks source link

Fatal 249 SQL cannot create closures #137

Open Michiel-s opened 9 years ago

Michiel-s commented 9 years ago

Hi Han,

I'm getting a fatal error !fatal 249 (module FSpec.SQL) Ampersand v3.1.0[master:7fedefc] SQL cannot create closures EKl0 (SELECT * FROM NotExistingKl0)

when generating the following script:

CONTEXT Test IN ENGLISH

PATTERN Test
    r :: A * A [UNI,ASY,IRF,INJ]
ENDPATTERN

INTERFACE "Test" : I[A]
  BOX [ "A" : r*]

ENDCONTEXT

Is the transclosure functionality something that I cannot use yet or is this a bug?

stefjoosten commented 9 years ago

Klopt. Dit is nog niet gebouwd.

hanjoosten commented 9 years ago

Transitive closure isn't currently supported by mysql. Hence, since we all use mysql, there is no translation to mysql possible.

However, there are ideas on how this can be solved in another way. See https://www.evolveum.com/transitive-closure/ for the story. But don't hold your breath...

hanjoosten commented 9 years ago

Whenever anyone sees that MySQL (or MariaDB) supports transitive closure, we can reopen this issue. Until then, this feature cannot be supported.

Michiel-s commented 9 years ago

Something to look at later: https://www.evolveum.com/transitive-closure/

Michiel-s commented 3 years ago

As of Mariadb 10.2.2, recursion in SQL queries are possible. See Recursive Common Table Expressions

An example of how I use this in a application:

Notice how the query with name path_query is used in itself and by the SELECT statement below

WITH RECURSIVE path_query (Element, elmName, path, level, messageId) as 
(
  /* The starting list of elements (in this case the message that contains the src element) */ 
  SELECT 
    m.Element, 
    m.elmName, 
    CONCAT(IF(ns.nsAbbr IS NOT NULL, CONCAT(ns.nsAbbr, ':'), ''), m.elmName) as path, 
    0 as level, 
    m.Element 
  FROM element as m /* m for message */
  LEFT JOIN namespace as ns ON (m.elmNamespace = ns.nsURI AND m.Element = ns.nsMessage) 
  WHERE m.Message IS NOT NULL 
  /* Union with it's sub elements recursively */ 
  UNION ALL 
    SELECT 
        e.Element, 
        e.elmName, 
        CONCAT(
            parent.path, 
            /* Don't add namespace prefix for attributes */ 
            IF(e.elmIsAttribute IS NOT NULL, ' / ', CONCAT(
                ' / ', 
                IF(ns.nsAbbr IS NOT NULL, CONCAT(ns.nsAbbr, ':'), '') 
                )), 
            e.elmName), 
        parent.level + 1, 
        parent.messageId
    FROM path_query as parent 
    JOIN element as e ON parent.Element = e.elmIsSubOf 
    LEFT JOIN namespace as ns ON (e.elmNamespace = ns.nsURI AND e.elmIsPartOf = ns.nsMessage) 
    WHERE e.elmIsVisible IS NOT NULL 
) 
SELECT q.Element as atomId, q.path as view_path, q.level, CONCAT('v', v.versionId) as version 
FROM path_query q 
LEFT JOIN specversion v ON v.rootNode = q.messageId 
ORDER BY view_path;
hanjoosten commented 4 months ago

Also see this explanation of the usage

hanjoosten commented 3 days ago

Another example of the idea:

-- create
CREATE TABLE Person (
    id INT PRIMARY KEY,       -- Unique identifier for each person
    name VARCHAR(100),        -- Name of the person
    gender CHAR(1),           -- 'M' for male, 'F' for female
    father_id INT,            -- Reference to the person's father (who is also a person in the same table)
    mother_id INT             -- Reference to the person's mother (who is also a person in the same table)
);

-- Generation 1: The great-grandparents (no parents)
INSERT INTO Person (id, name, gender, father_id, mother_id)
VALUES (1, 'John Sr.', 'M', NULL, NULL),     -- Great-grandfather (father's side)
       (2, 'Jane Sr.', 'F', NULL, NULL),     -- Great-grandmother (father's side)
       (3, 'Robert Sr.', 'M', NULL, NULL),   -- Great-grandfather (mother's side)
       (4, 'Linda Sr.', 'F', NULL, NULL);    -- Great-grandmother (mother's side)

-- Generation 2: The grandparents
INSERT INTO Person (id, name, gender, father_id, mother_id)
VALUES (5, 'John Jr.', 'M', 1, 2),           -- Grandfather (father's side)
       (6, 'Mary', 'F', NULL, NULL),         -- Grandmother (father's side)
       (7, 'Robert Jr.', 'M', 3, 4),         -- Grandfather (mother's side)
       (8, 'Susan', 'F', NULL, NULL);        -- Grandmother (mother's side)

-- Generation 3: The parents
INSERT INTO Person (id, name, gender, father_id, mother_id)
VALUES (9, 'Michael', 'M', 5, 6),            -- Father
       (10, 'Elizabeth', 'F', 7, 8);         -- Mother

-- Generation 4: The person (child)
INSERT INTO Person (id, name, gender, father_id, mother_id)
VALUES (11, 'David', 'M', 9, 10),            -- Son
       (12, 'Anna', 'F', 9, 10);             -- Daughter
-- fetch 
WITH RECURSIVE MaleAncestors AS (
    -- Base case: start with the person's father
    (SELECT p.id, p.name, p.gender, p.father_id
    FROM Person p
    WHERE p.id = (SELECT father_id FROM Person WHERE name = 'Anna') -- Get the father of the given person
    ) 
    UNION ALL
    (
    -- Recursive case: find the male ancestors of the current male ancestor
    SELECT p.id, p.name, p.gender, p.father_id
    FROM Person p
    JOIN MaleAncestors ma ON p.id = ma.father_id
    WHERE p.gender = 'M')
)
SELECT * FROM MaleAncestors;
hanjoosten commented 3 days ago

To get to an idea for what should be generated, let's make this more generic. Forget about the mothers and genders. As we go, also forget about the names:

-- Create the Person table without mother_id and gender
CREATE TABLE Person (
    id INT PRIMARY KEY,       -- Unique identifier for each person
    father_id INT             -- Reference to the person's father (who is also a person in the same table)
);

-- Insert data for multiple generations

-- Generation 1: The great-grandparents (no parents)
INSERT INTO Person (id, father_id)
VALUES (1, NULL),    -- Great-grandfather (father's side)
       (3, NULL);    -- Great-grandfather (mother's side, ignored here)

-- Generation 2: The grandparents
INSERT INTO Person (id, father_id)
VALUES (5, 1),       -- Grandfather (father's side)
       (7, 3);       -- Grandfather (mother's side, ignored here)

-- Generation 3: The parents
INSERT INTO Person (id, father_id)
VALUES (9, 5);       -- Father

-- Generation 4: The person (child)
INSERT INTO Person (id, father_id)
VALUES (11, 9);      -- Son

-- Recursive query to fetch only the ids of all ancestors based on the father's lineage
WITH RECURSIVE Ancestors AS (
    -- Base case: Start with the father of the given person
    SELECT id, father_id
    FROM Person
    WHERE id = (SELECT father_id FROM Person WHERE id = 11)  -- Change 11 to the desired person's ID

    UNION ALL

    -- Recursive case: Continue tracing the father lineage
    SELECT p.id, p.father_id
    FROM Person p
    JOIN Ancestors a ON p.id = a.father_id
)
-- Select only the ancestor ids
SELECT id FROM Ancestors;

This looks even simpler :)

hanjoosten commented 3 days ago

I like chatGPT for helping me out here 😄. The output can be checked online:

Output:

+------+
| id   |
+------+
|    9 |
|    5 |
|    1 |
+------+
hanjoosten commented 3 days ago

Now we only have to fill in some gaps. rename the id and father_id to src and tgt:

-- Create the Person table with src (id) and tgt (father_id)
CREATE TABLE Person (
    src INT PRIMARY KEY,       -- Unique identifier for each person
    tgt INT                    -- Reference to the person's father (who is also a person in the same table)
);

-- Insert data for multiple generations

-- Generation 1: The great-grandparents (no parents)
INSERT INTO Person (src, tgt)
VALUES (1, NULL),    -- Great-grandfather (father's side)
       (3, NULL);    -- Great-grandfather (mother's side, ignored here)

-- Generation 2: The grandparents
INSERT INTO Person (src, tgt)
VALUES (5, 1),       -- Grandfather (father's side)
       (7, 3);       -- Grandfather (mother's side, ignored here)

-- Generation 3: The parents
INSERT INTO Person (src, tgt)
VALUES (9, 5);       -- Father

-- Generation 4: The person (child)
INSERT INTO Person (src, tgt)
VALUES (11, 9);      -- Son

-- Recursive query to fetch only the src (id) of all ancestors based on the tgt (father_id) lineage
WITH RECURSIVE Ancestors AS (
    -- Base case: Start with the father of the given person
    SELECT src, tgt
    FROM Person
    WHERE src = (SELECT tgt FROM Person WHERE src = 11)  -- Change 11 to the desired person's src

    UNION ALL

    -- Recursive case: Continue tracing the father lineage
    SELECT p.src, p.tgt
    FROM Person p
    JOIN Ancestors a ON p.src = a.tgt
)
-- Select only the ancestor src (id)
SELECT src FROM Ancestors;
Output:

+------+
| src  |
+------+
|    9 |
|    5 |
|    1 |
+------+
hanjoosten commented 2 days ago

Almost there. With again some help of chatGPT, this looks very much to something generic enough to implement:

-- Create the Node table (represents the entities in a directed graph)
CREATE TABLE Node (
    src INT PRIMARY KEY,   -- Unique identifier for each node (e.g., vertex in a graph)
    tgt INT                -- Reference to another node indicating a directed relationship (e.g., parent-child or directed edge)
);

-- Insert data representing a simple directed graph
-- Nodes (src) and their directed relationships (tgt)
INSERT INTO Node (src, tgt) VALUES
(1, NULL),  -- Root node (no parent)
(2, 1),     -- Node 2 points to Node 1
(3, 1),     -- Node 3 points to Node 1
(4, 2),     -- Node 4 points to Node 2
(5, 2),     -- Node 5 points to Node 2
(6, 3),     -- Node 6 points to Node 3
(7, 3),     -- Node 7 points to Node 3
(8, 4),     -- Node 8 points to Node 4
(9, 5);     -- Node 9 points to Node 5

-- Recursive query to compute the transitive closure of the graph
WITH RECURSIVE TransitiveClosure AS (
    -- Base case: Start with the node directly connected to the given node
    SELECT src, tgt
    FROM Node
    WHERE src = (SELECT tgt FROM Node WHERE src = 9)  -- Replace 9 with the initial node identifier

    UNION ALL

    -- Recursive case: Continue following the directed relationships in the graph
    SELECT n.src, n.tgt
    FROM Node n
    JOIN TransitiveClosure tc ON n.src = tc.tgt
)
-- Final step: Retrieve the initial node and all other nodes reachable via the transitive closure
SELECT 9 AS src, src as tgt -- Replace 9 with the desired node identifier
FROM TransitiveClosure;

This yields the desired output:

+-----+------+
| src | tgt  |
+-----+------+
|   9 |    5 |
|   9 |    2 |
|   9 |    1 |
+-----+------+
hanjoosten commented 2 days ago

Now I only have to find the correct data structure in Haskell in the sql library that we use, so this query can be constructed.

hanjoosten commented 2 days ago
-- Create the Node table (represents the entities in a directed graph)
CREATE TABLE Node (
    src INT PRIMARY KEY,   -- Unique identifier for each node (e.g., vertex in a graph)
    tgt INT                -- Reference to another node indicating a directed relationship (e.g., parent-child or directed edge)
);

-- Insert data representing a simple directed graph
-- Nodes (src) and their directed relationships (tgt)
INSERT INTO Node (src, tgt) VALUES
(2, 1),     -- Node 2 points to Node 1
(3, 1),     -- Node 3 points to Node 1
(4, 2),     -- Node 4 points to Node 2
(5, 2),     -- Node 5 points to Node 2
(6, 3),     -- Node 6 points to Node 3
(7, 3),     -- Node 7 points to Node 3
(8, 4),     -- Node 8 points to Node 4
(9, 5);     -- Node 9 points to Node 5

-- Recursive query to compute the transitive closure of the graph
WITH RECURSIVE TransitiveClosure AS (
    -- Base case: Start with the node directly connected to the given node
    SELECT src, tgt
    FROM Node

    UNION ALL

    -- Recursive case: Continue following the directed relationships in the graph
    SELECT TransitiveClosure.src, Node.tgt
    FROM Node 
    JOIN TransitiveClosure ON Node.src = TransitiveClosure.tgt
)
-- Final step: Retrieve the initial node and all other nodes reachable via the transitive closure
SELECT * FROM TransitiveClosure;