defog-ai / sql-eval

Evaluate the accuracy of LLM generated outputs
Apache License 2.0
447 stars 47 forks source link

[BUG] Unable to run `compare_query_results` when using psycopg #192

Open a3huang opened 1 week ago

a3huang commented 1 week ago

I'm having trouble getting the compare_query_results function to work with my local postgres database while using psycopg. When I run something like:

compare_query_results(
    query_gold, 
    query_gen, 
    "name", 
    "postgres",
    {"user": "postgres",
     "password": "postgres",
     "host": "localhost",
     "port": 5492},
    "some question",
    ""
)

I get the following error:

AttributeError: 'OptionEngine' object has no attribute 'execute'

Looking into the above, the error seems to be coming from sqlalchemy. When I downgrade to sqlalchemy<2.0.0, I instead get the following:

Can't load plugin: sqlalchemy.dialects:postgresql.psycopg

The problem seems to be with how sqlalchemy interacts with the pd.read_sql_query function in pandas. It seems like in SQLAlchemy2.0, we need to pass in a connection object rather than an engine object into pd.read_sql_query. In addition, it seems like we cannot pass in the raw query string directly into pd.read_sql_query, but should wrap it with the text method from sqlalchemy. The place in the code where it errors out seems to be here: https://github.com/defog-ai/sql-eval/blob/main/eval/eval.py#L223-L225

When I run the following:

pd.read_sql_query(text(query), engine)

I get the same errors as above. However, when I run the following:

conn = engine.connect()
pd.read_sql_query(text(query), conn)

I get results as expected.

Here are my relevant package versions:

pandas==1.5.3
psycopg==3.1.19
sqlalchemy==2.0.31
rishsriv commented 1 week ago

Hi there, I would recommend using the packages in requirements.txt. Specifically, you should psycopg2 instead of psycopg. You can do this with pip install --upgrade psycopg2-binary