opendp / smartnoise-sdk

Tools and service for differentially private processing of tabular and relational data
MIT License

Problem with any SQL query due to pandas/SQLAlchemy #529

Closed ricardocarvalhods closed 1 year ago

ricardocarvalhods commented 1 year ago

Hello,

If you install the library from scratch today, it raises an error on essentially any SQL query it tries to execute. See the error below.

According to this issue:

That's probably the case: I tried SQLAlchemy==1.4.46 and the error was gone.

Error:

  File "/home/vscode/.local/lib/python3.9/site-packages/snsql/sql/reader/pandas.py", line 148, in execute
    q_result = sqldf(clean_query(query), locals())
  File "/home/vscode/.local/lib/python3.9/site-packages/pandasql/sqldf.py", line 156, in sqldf
    return PandaSQL(db_uri)(query, env)
  File "/home/vscode/.local/lib/python3.9/site-packages/pandasql/sqldf.py", line 61, in __call__
    result = read_sql(query, conn)
  File "/home/vscode/.local/lib/python3.9/site-packages/pandas/io/sql.py", line 590, in read_sql
    return pandas_sql.read_query(
  File "/home/vscode/.local/lib/python3.9/site-packages/pandas/io/sql.py", line 1560, in read_query
    result = self.execute(*args)
  File "/home/vscode/.local/lib/python3.9/site-packages/pandas/io/sql.py", line 1405, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "/home/vscode/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
    raise exc.ObjectNotExecutableError(statement) from err
sqlalchemy.exc.ObjectNotExecutableError: Not an executable object: 'SELECT COUNT(*) AS keycount, marital_status AS marital_status, SUM(age) AS sum_age FROM ( SELECT marital_status AS marital_status, CASE WHEN age < 0 THEN 0 WHEN age > 110 THEN 110 ELSE  age END AS age FROM df_for_diffpriv1234 ) AS per_key_all GROUP BY marital_status'

joshua-oss commented 1 year ago

Thanks for raising this issue. I've pushed version 0.2.9.1 of smartnoise-sql to PyPI, which pins SQLAlchemy to <2.0. New installs should now work again. This is a good reminder for us to remove our dependency on pandasql, since pandas has subsumed that functionality. We will investigate removing this dependency so that smartnoise-sql can work with SQLAlchemy versions >= 2.0.0.

Support for newer SQLAlchemy will depend on recently-merged PR [https://github.com/pandas-dev/pandas/pull/48576]. We will add support once a pandas release containing that PR reaches PyPI.
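
For anyone checking whether an existing environment is likely affected, here is a minimal sketch. It assumes the breakage is limited to SQLAlchemy 2.0 and later, as this thread suggests. The helper names (`sqlalchemy_major`, `is_affected`, `installed_sqlalchemy_version`) are hypothetical and not part of smartnoise-sql; only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata
from typing import Optional


def sqlalchemy_major(version: str) -> int:
    """Return the major component of a version string such as '1.4.46'."""
    return int(version.split(".")[0])


def is_affected(version: Optional[str]) -> bool:
    """True when the version is 2.x or later (assumed to be the breaking range)."""
    return version is not None and sqlalchemy_major(version) >= 2


def installed_sqlalchemy_version() -> Optional[str]:
    """Look up the installed SQLAlchemy version, or None if it is not installed."""
    try:
        return metadata.version("SQLAlchemy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_sqlalchemy_version()
    if is_affected(version):
        # Mirrors the pin described in this thread; adjust to your own constraints.
        print(f"SQLAlchemy {version} detected; consider: pip install 'SQLAlchemy<2.0'")
    else:
        print(f"SQLAlchemy version: {version}")
```

If the check reports a 2.x version alongside an older smartnoise-sql, pinning SQLAlchemy below 2.0 (or upgrading smartnoise-sql) should match the workaround discussed here.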

joshua-oss commented 1 year ago

Fix is deployed in smartnoise-sql==1.0.0. Please let us know if any issues remain.

joshua-oss commented 1 year ago

Fixed