Closed lmmentel closed 1 month ago
The case that still throws an error: pandas 2.2.2 and sqlalchemy 1.4.52
>>> from mendeleev.fetch import fetch_table, get_engine
>>> df = fetch_table("elements")
/home/lukasz/projects/mendeleev/mendeleev/fetch.py:71: UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy.
return pd.read_sql_query(sql=text(query), con=conn, **kwargs)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/lukasz/projects/mendeleev/mendeleev/fetch.py", line 71, in fetch_table
return pd.read_sql_query(sql=text(query), con=conn, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lukasz/.cache/pypoetry/virtualenvs/mendeleev-ZDKABmhr-py3.12/lib/python3.12/site-packages/pandas/io/sql.py", line 526, in read_sql_query
return pandas_sql.read_query(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/lukasz/.cache/pypoetry/virtualenvs/mendeleev-ZDKABmhr-py3.12/lib/python3.12/site-packages/pandas/io/sql.py", line 2738, in read_query
cursor = self.execute(sql, params)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lukasz/.cache/pypoetry/virtualenvs/mendeleev-ZDKABmhr-py3.12/lib/python3.12/site-packages/pandas/io/sql.py", line 2670, in execute
raise TypeError("Query must be a string unless using sqlalchemy.")
TypeError: Query must be a string unless using sqlalchemy.
There's a known issue https://github.com/pandas-dev/pandas/issues/57049 that latest pandas breaks compatibility with sqlalchemy 1.4.x which is still officially supported.
I think mendeleev should depend on pandas[sql-other] to prevent installation of incomatible pandas and sqlalchemy in the same environment.
I think mendeleev should depend on pandas[sql-other] to prevent installation of incomatible pandas and sqlalchemy in the same environment.
Could you elaborate on what you mean here?
Current dependencies from pyproject.toml
are:
pandas = [
{ version = "^2.1", python = ">=3.12,<3.13" },
{ version = ">=1.0.0", python = "<3.12" },
]
SQLAlchemy = ">=1.4.0"
so both pandas
and sqlalchemy
are core dependencies for mendeleev
.
I think mendeleev should depend on pandas[sql-other] to prevent installation of incomatible pandas and sqlalchemy in the same environment.
Could you elaborate on what you mean here?
Current dependencies from
pyproject.toml
are:pandas = [ { version = "^2.1", python = ">=3.12,<3.13" }, { version = ">=1.0.0", python = "<3.12" }, ] SQLAlchemy = ">=1.4.0"
so both
pandas
andsqlalchemy
are core dependencies formendeleev
.
What pip denote within square brackets are called extras and the syntax for poetry seems to be
pandas = [
{ version = "^2.1", python = ">=3.12,<3.13", extras = ["sql-other"] },
{ version = ">=1.0.0", python = "<3.12", extras = ["sql-other"] },
]
Didn't know about pandas extras, thanks for sharing. I'm not convinced about adding this since it will bump sqlalchemy to min 2.0.0 ref. Since sqlachemy is at the core of mendeleev
and 1.4.x is still officially supported I wouldn't want to drop the support for these versions prematurely.
When it's dropped I think the same effect can be achieved explicitly by bumping the sqlalchemy version directly in pyproject.toml.
This is not really a mendeleev problem but rather pandas/sqlalchemy compatibility issue. I'm willing to go ahead since current fix solves compatibility as broadly as is possible.
This fixes
fetch_table
to be compatible across combinations of major versions of pandas (1.x, 2.x) and sqlalchemy (1.5.x and 2.x).tested combinations:
fixes #158