oguimbal / pgsql-ast-parser

Yet another simple Postgres SQL parser
304 stars 43 forks source link

Unexpected `quoted_word` token when performing Nested Join #153

Open pmooney-socraticworks opened 9 months ago

pmooney-socraticworks commented 9 months ago

I'm using pg-mem@2.8.1 with sequelize@v6, and I'm getting the following error from pg-mem (probably bubbled up from this package), when I include an association via a through relationship. Sequelize models it as a parenthetical LEFT OUTER JOIN with a nested INNER JOIN.

Given the following query:

SELECT
  "todo"."id",
  "todo"."title",
  "labels"."id" AS "labels.id",
  "labels"."color" AS "labels.color",
  "labels"."display_name" AS "labels.display_name",
FROM
  "todo"
  LEFT OUTER JOIN (
    "todo_label" AS "labels->todo_label"
    INNER JOIN "label" AS "labels" ON "labels"."id" = "labels->todo_label"."label_id"
  ) ON "todo"."id" = "labels->todo_label"."todo_id"

I receive this error:

"todo_label"
      ^
Unexpected quoted_word token: "todo_label". Instead, I was expecting to see one of the following:

    - A "kw_with" token
    - A "kw_select" token
    - A "word" token
    - A "kw_with" token
    - A "lparen" token
pmooney-socraticworks commented 8 months ago

After doing some digging, I believe the issue is the grammar for select_table_join. I believe there needs to be another case for this rule, though I am still trying to figure out how to model this, and what the expected AST result should be.

I currently have this as the updated select_table_join:

select_table_join
    -> select_join_op %kw_join select_from_subject select_table_join_clause:? {% x => track(x, {
        ...unwrap(x[2]),
        join: {
            type: toStr(x[0], ' '),
            ...x[3] && unwrap(x[3]),
        }
    }) %}
   # ---- NEW BELOW ----
    | select_join_op %kw_join select_from_item_joins select_table_join_clause {% x => {
        return track(x, {
            ...(flatten(x[2]) || [])[0],
            join: {
                type: toStr(x[0], ' '),
                ...x[3] && unwrap(x[3]),
            } 
        })
    } %}

This will enable the parser to recursively pick out the nested joins, but I'm having issues figuring out the function at the end.

I came up with this test, which I think has the correct expectation:

checkSelect([`SELECT * FROM "ta" 
        LEFT OUTER JOIN ("tb" INNER JOIN "tc" ON "tc"."id" = "tb"."tc_id") 
        ON "ta"."id" = "tc->tb"."ta_id"`], 
                    {
                        type: 'select',
                        columns: [{ expr: star }],
                        from: [
                            tbl('ta'),
                            {
                                name: name('tb'),
                                type: "table",
                                join: {
                                    type: "LEFT JOIN",
                                    on: {
                                        left: {
                                            name: "id",
                                            table: {
                                              name: "ta"
                                            },
                                            type: "ref"
                                        },
                                        op: "=",
                                        right: {
                                            name: "ta_id",
                                            table: {
                                              name: "tc->tb",
                                            },
                                            type: "ref",
                                        },
                                        type: "binary",
                                    },
                                },
                            },
                            {
                                name: name("tc"),
                                type: "table",
                                join: {
                                    type: "INNER JOIN",
                                    on: {
                                        left: {
                                            name: "id",
                                            table: {
                                                name: "tc"
                                            },
                                            type: "ref",
                                        },
                                        op: "=",
                                        right: {
                                            name: "tc_id",
                                            table: {
                                                name: "tb"
                                              },
                                            type: "ref",
                                        },
                                        type: "binary",
                                    },
                                }
                            }
                        ],
                    })

@oguimbal Can you help confirm if this is the correct AST from that query, and if you have any insights in how to get the function to output this please?