PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
hey there! thank you for this amazing library. i love using it!
i experienced an error which i believe is related to some choices when constructing the query. i have only experienced this in my case, which is PostgreSQL (with sqlalchemy/psycopg2). in this toy example, i have two tables: A and B. they both have a column called foo.
'SELECT "A"."foo" "foo","B"."bar" "bar" FROM "A" JOIN "B" ON "A"."id"="B"."a_id" GROUP BY "foo"'
however, running this leads to the following error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.AmbiguousColumn) column reference "foo" is ambiguous
if i construct the query in the following way, it runs without a problem (because the group by column is properly namespaced):
query_noproblem = ppk.Query.from_(a_tbl).select(a_foo_col, b_tbl.bar.as_("bar")).join(b_tbl).on(
a_tbl.id == b_tbl.a_id
).groupby(
a_foo_col.get_sql() # this causes the GROUP BY to be namespaced for some reason
)
str(query_noproblem)
# => 'SELECT "A"."foo" "foo","B"."bar" "bar" FROM "A" JOIN "B" ON "A"."id"="B"."a_id" GROUP BY "A"."foo"'
as for group by'ing on a column from B (not the main table selection), the behaviour is slightly different but it still doesn't seem entirely correct:
.groupby(b_tbl.foo.as_("foo")) will lead to GROUP BY "foo" which would lead to an ambiguity error
.groupby(b_tbl.foo) properly produces GROUP BY "b"."foo" which is correct
so it seems like at least in the group by clause (and maybe others), at least in postgresql, the column always needs to be namespaced properly to avoid possible ambiguity errors.
i hope this example is reproducible, i changed the column & table names to simplify the query, but i'm not able to test it on a real psql database right now.
hey there! thank you for this amazing library. i love using it!
i experienced an error which i believe is related to some choices when constructing the query. i have only experienced this in my case, which is PostgreSQL (with sqlalchemy/psycopg2). in this toy example, i have two tables: A and B. they both have a column called
foo
.if i run the following code:
this is the resulting query:
however, running this leads to the following error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.AmbiguousColumn) column reference "foo" is ambiguous
if i construct the query in the following way, it runs without a problem (because the group by column is properly namespaced):
as for group by'ing on a column from B (not the main table selection), the behaviour is slightly different but it still doesn't seem entirely correct:
.groupby(b_tbl.foo.as_("foo"))
will lead toGROUP BY "foo"
which would lead to an ambiguity error.groupby(b_tbl.foo)
properly producesGROUP BY "b"."foo"
which is correctso it seems like at least in the group by clause (and maybe others), at least in postgresql, the column always needs to be namespaced properly to avoid possible ambiguity errors.
i hope this example is reproducible, i changed the column & table names to simplify the query, but i'm not able to test it on a real psql database right now.