modin-project / modin

Modin: Scale your Pandas workflows by changing a single line of code
http://modin.readthedocs.io
Apache License 2.0

BUG: #7324

Open SiRumCz opened 1 week ago

SiRumCz commented 1 week ago

### Modin version checks

### Reproducible Example

...
# Suppose I already have a connection to a SQLite db with a table named table1 that has an `id` column.

df = pd.read_sql(sql='select * from table1 where id=:id', con=con, params={'id':'id1'})
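For anyone trying to reproduce this, here is a minimal sketch of the setup the example assumes. The in-memory database, the `value` column, and the sample rows are hypothetical stand-ins, not part of the report. It runs the same query directly through the `sqlite3` driver to check that the SQL and the named parameter are valid on their own.

```python
import sqlite3

# Hypothetical stand-in for the database the report assumes already exists.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (id TEXT PRIMARY KEY, value INTEGER)")
con.executemany(
    "INSERT INTO table1 (id, value) VALUES (?, ?)",
    [("id1", 10), ("id2", 20)],
)
con.commit()

# Same SQL text and parameters as in the report.
query = "select * from table1 where id=:id"
params = {"id": "id1"}

# Execute through the driver only, without pandas or Modin involved.
rows = con.execute(query, params).fetchall()
print(rows)
```

The same `query` and `params` can then be passed to `pd.read_sql` with whatever connection object is being tested.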

### Issue Description

An error is raised when I execute this.

### Expected Behavior

It should return the correct result.

### Error Logs

<details>

```python-traceback

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near "%": syntax error
[SQL: SELECT COUNT(*) FROM (select * from table1 where id=:id) AS _MODIN_COUNT_QUERY]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
```

</details>

### Installed Versions

INSTALLED VERSIONS
------------------
commit : 52fca1ccaf8f4623688955f724f504a5e80c332c
python : 3.9.17.final.0
python-bits : 64
OS : Linux
OS-release : 6.2.0-37-generic
Version : #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8

Modin dependencies
------------------
modin : 0.30.1
ray : 2.24.0
dask : 2024.6.0
distributed : 2024.6.0
hdk : None

pandas dependencies
-------------------
pandas : 2.2.2
numpy : 1.24.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 70.0.0
pip : 24.0
Cython : None
pytest : 7.4.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.18.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.6.0
gcsfs : None
matplotlib : 3.7.5
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.4
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.10.1
sqlalchemy : 2.0.31
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

noloerino commented 1 week ago

Hi @SiRumCz, is your issue that pd.read_sql is raising a syntax error with your query?

Can you see if your code works with vanilla (non-Modin) pandas? You might need to try different parameter syntax (https://peps.python.org/pep-0249/#paramstyle) to get your query to work.
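To make the paramstyle point concrete, here is a small sketch using only the standard-library `sqlite3` driver. The in-memory table is a hypothetical stand-in for the reporter's database. It shows which placeholder styles this driver accepts, and that a `%`-style placeholder is rejected with an `OperationalError`. This only illustrates driver behavior; it is not a claim about what Modin does internally.

```python
import sqlite3

# Hypothetical in-memory table standing in for the reporter's database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (id TEXT)")
con.execute("INSERT INTO table1 (id) VALUES ('id1')")

# PEP 249 drivers advertise their default placeholder style.
print(sqlite3.paramstyle)

# sqlite3 accepts both qmark (?) and named (:id) placeholders.
qmark_rows = con.execute(
    "select * from table1 where id=?", ("id1",)
).fetchall()
named_rows = con.execute(
    "select * from table1 where id=:id", {"id": "id1"}
).fetchall()

# pyformat (%(id)s) is not a placeholder syntax sqlite3 understands,
# so the statement fails to parse.
try:
    con.execute("select * from table1 where id=%(id)s", {"id": "id1"})
    pyformat_error = None
except sqlite3.OperationalError as exc:
    pyformat_error = exc

print(qmark_rows, named_rows, repr(pyformat_error))
```

If the query works in vanilla pandas with one of the accepted styles but fails under Modin with the same connection, that narrows the problem to how the parameters are handled on the Modin side.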