machow / siuba

Python library for using dplyr like syntax with pandas and SQL
https://siuba.org
MIT License
1.14k stars 48 forks source link

Use LazyTbl without an engine? #424

Open randy3k opened 2 years ago

randy3k commented 2 years ago

Could we use LazyTbl without specifying an engine? I just want to use siuba to generate SQL query, similar to what pypika does.

machow commented 2 years ago

Hey--does this approach (using a mock engine) work?

from siuba.sql.utils import mock_sqlalchemy_engine
from siuba.sql import LazyTbl as _LazyTbl
cars_sql = _LazyTbl(
        mock_sqlalchemy_engine("postgresql"),
        "cars",
        ["cyl", "mpg", "hp"]
        )

# example showing the query
from siuba import _, mutate, show_query

q = cars_sql >> mutate(avg = _.hp - _.hp.mean()) >> show_query()

outputs:

SELECT cars.cyl, cars.mpg, cars.hp, cars.hp - avg(cars.hp) OVER () AS avg
FROM cars

Currently, show_query just prints the SQL and returns a LazyTbl, but I can add the equivalent of dbplyr's db_sql_render (or add an argument to show_query to return the generated string)

susannvorberg commented 6 months ago

I would also be interested in getting the SQL query and vote for an argument to show_query() to return the generated string.