Closed Jesse-jApps closed 3 years ago
Hi @Jesse-jApps, I couldn't reproduce that behaviour. Here's my test using SQLAlchemy==1.4.20
and pg8000==1.19.5
:
from sqlalchemy import create_engine
engine = create_engine("postgresql+pg8000://postgres:cpsnow@localhost/test")
connection = engine.connect()
db_con = connection.connection
with db_con.cursor() as cur:
cur.execute("select typname from pg_type")
for row in cur.fetchall():
print(row[0])
Are you able to provide a minimum reproducible example of it not working?
Also, IIRC the DBAPI specification doesn't include context management methods, so I'd argue that geopandas shouldn't be relying on anything outside the specification.
Hi @tlocke, sorry for the sparse info. I will try to provide more details. I am using the alpha version of google.cloud.sql.connector to create a connection from within a jupyterlab notebook in the cloud. To make it work, I use
import pg8000
import sqlalchemy
from google.cloud.sql.connector import connector
def getconn() -> pg8000.dbapi.Connection:
conn: pg8000.dbapi.Connection = connector.connect(
'host',
"pg8000",
user='user',
password='password',
db='database'
)
return conn
engine = sqlalchemy.create_engine(
"postgresql+pg8000://",
creator=getconn,
)
Therefore I will get the dbapi cursor.
I haven't tried using the legacy connection yet, but since it will be deprecated soon, i guess it should be the dbapi anyway.
I am quit surprised, that the DBAPI 2.0 specifications do not define context management methods. Since there is a close()
-method, I thought that would be a logical step.
But in that case you are right, that the changes should be made in geopandas, which by the way expects the cursor to have a copy_expert()
-method in order to do something like e.g.
gdf = gpd.read_file('gs://bucket/file.shp')
gdf.to_postgis('table_name', engine)
Additionally, while geopandas uses a StringIO-Buffer to execute the COPY
-Statement, pg8000 dbapi-cursor expects a BytesIO-Buffer. I am not sure how it should be or which is the correct way to do it.
In def execute(...)
it says "For a COPY FROM the parameter must be a readable file-like object, and for COPY TO it must be writable". At least the information and an assert should be added that the file-like object should be bytes?
Therefore I will get the dbapi cursor.
That makes sense. The dbapi cursor strictly follows the DBAPI spec and so doesn't have a context manager.
I haven't tried using the legacy connection yet, but since it will be deprecated soon, i guess it should be the dbapi anyway.
Yes, the legacy API will be deprecated at some point.
I am quit surprised, that the DBAPI 2.0 specifications do not define context management methods. Since there is a close()-method, I thought that would be a logical step.
I think the reason is that the DBAPI 2 spec is over 20 years old, and maybe Python didn't even have context managers then? I've got a vague plan to write a rough DBAPI 3 spec to see what people think. I've never written a spec before, so maybe I'd be biting off more than I can chew :-)
Additionally, while geopandas uses a StringIO-Buffer to execute the COPY-Statement, pg8000 dbapi-cursor expects a BytesIO- Buffer. I am not sure how it should be or which is the correct way to do it.
From the PostgreSQL docs on COPY, it uses a bytes stream with the default encoding of the CLIENT_ENCODING PostgreSQL setting. It provides an ENCODING parameter to COPY if you want to override that. You might say, why doesn't pg8000 handle all that behind the scenes and use a StringIO buffer? My answer is that COPY also supports a binary format, so pg8000 has to work for that too. Actually, we could do something a bit more friendly here, so I've added an issue for this https://github.com/tlocke/pg8000/issues/76.
So with the latest pg8000 release 1.20.0 you should be able to use a StringIO
buffer when doing a COPY
. You can still use a BytesIO if you like though.
@Jesse-jApps you can monkey patch your application with something like the following:
class Cursor(pg8000.dbapi.Cursor):
def __enter__(self):
return self
def __exit__(self, type, value, traceback):
self.close();
pg8000.dbapi.Cursor = Cursor
@tlocke looks like from https://github.com/tlocke/pg8000/issues/75#issuecomment-870688951 you would not be interested in a PR to add this and plan to stick with DB-API 'purity'?
Hi @jimdigriz, pg8000 has two APIs, pg8000.native
and pg8000.dbapi
. The idea is that pg8000.dbapi
strictly follows the DBAPI 2 specification, and the aim of pg8000.native
is to provide a convenient API with all the mod cons (such as a context manager for the connection object).
I'll close this for now, but please do re-open if needed.
When trying to use pg8000 with geopandas I get the following error.
At the moment the class
Cursor
has not__enter__
. Would be nice to add it.