kayak / pypika

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
http://pypika.readthedocs.io/en/latest/
Apache License 2.0
2.47k stars 293 forks source link

Use of aliased tables in functions generates incorrect SQL #645

Open davidpurser opened 2 years ago

davidpurser commented 2 years ago

Problem

Using a table as an argument to a function works correctly, until the table is aliased. Consider the following code:

products = Table('products').as_('p')
test_query = Query.from_(products).select(fn.Count(products))
print(str(test_query))
# SELECT COUNT("products" "p") FROM "products" "p"

Expected:

SELECT COUNT("p") FROM "products" "p"

Passing a table to a function is a supported operation in PostgreSQL and passes the row data, it can be used for functions such as SELECT custom_convert_to_json("p") FROM "products" "p" (which is where I originally encountered the issue). Note that SELECT COUNT("p".*) FROM "products" "p" is also an accepted syntax which does not currently seem possible to generate with PyPika.

Workaround

Removing the table alias works fine, but sometimes isn't an option due to name conflicts. In those cases, creating a fake "table" with the same name as the alias also works fine. This is what I'm doing for now.

products = Table('products').as_('p')
fake_p_table = Table('p')
test_query = Query.from_(products).select(fn.Count(fake_p_table))
print(str(test_query))
# SELECT COUNT("p") FROM "products" "p"

I took a quick look in the source for converting function parameters and it seems to me like changing this behavior isn't straightforward without either adding special case handling when isinstance(Table) (which would have to be done for all functions), or modifying the str() behavior of aliased Tables, which would of course have other implications.

ajustintrue commented 2 years ago

Hi @davidpurser, Besides the table alias you can also pass *, any literal, any field and other terms in most RDBMS.

Have you tried passing anything else? This should work fine for your example:

products = Table('products').as_('p')
product_id = Field('product_id', table=products)
test_query = Query.from_(products).select(fn.Count(product_id))

note: using a field like the above will also support specified counts from any side of a join if you are not doing an inner join.