Closed dude0001 closed 1 year ago
I found an issue including this same error in localstack https://github.com/localstack/localstack/issues/6131. The author of the issue said it was suggested to him "Also talked to the Localstack team through the Pro channel and they've also mentioned that a temporarily fix could be removing the following 3 lines from the redshift connector library: https://github.com/aws/amazon-redshift-python-driver/blob/3906211974c00d2d36a470eb0536b8f96681c158/redshift_connector/core.py#L534-L536".
I removed those 3 lines from my local copy of the package, and it resolves the issue here as well. I'm not sure what this means or how I can work around this. Is there something that needs to be set up in the pmr redshift container to permanently resolve this? The localstack issue was resolved as fixed in a later version, but I am unclear what the fix was. I was hoping a similar fix could be implemented in PMR and that it might be obvious to someone what the issue is.
Right now, we technically only expose direct redshift access through sqlalchemy, through https://github.com/sqlalchemy-redshift/sqlalchemy-redshift. Once you're attempting to connect through redshift_connector, you're leaving behind PMR's internals and connecting directly to the container yourself.
The underlying container we start up is a postgres container (because there's no container equivalent of redshift), so my assumption would be that redshift_connector is sending extra parameters that postgres doesn't understand. And looking at the implementation at https://github.com/aws/amazon-redshift-python-driver/blob/master/redshift_connector/core.py, the body of that function is like 400 lines long without any externally visible state to patch out init_params. So you might be out of luck using redshift_connector here unless they're willing to alter their connection mechanism, because it seems to not be compatible with the underlying postgres container we use.
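To illustrate why an extra connection parameter would produce this kind of error, here is a minimal sketch of a PostgreSQL protocol 3.0 startup message built by hand. In that protocol, the client sends name/value pairs at connection time, and names the server does not reserve are treated as run-time configuration parameters, so an unknown name is rejected. The helper name build_startup_message and the value "2" are assumptions for illustration only; this is not the code redshift_connector actually uses.

```python
import struct

# Protocol version 3.0 is encoded as (major << 16) | minor.
PROTOCOL_VERSION_3_0 = (3 << 16) | 0


def build_startup_message(params):
    """Encode a StartupMessage: length, protocol version, then
    NUL-terminated name/value pairs, closed by one extra NUL byte."""
    body = struct.pack("!i", PROTOCOL_VERSION_3_0)
    for name, value in params.items():
        body += name.encode("utf-8") + b"\x00"
        body += value.encode("utf-8") + b"\x00"
    body += b"\x00"
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


# What a plain postgres client might send.
plain = build_startup_message({"user": "dev", "database": "dev"})

# The same message with a Redshift-specific parameter added
# (the value "2" is a placeholder, not taken from the driver).
with_extra = build_startup_message(
    {"user": "dev", "database": "dev", "client_protocol_version": "2"}
)
```

A stock postgres server receiving the second message has no setting named client_protocol_version, which is consistent with the 'unrecognized configuration parameter' error reported in this issue.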
The easiest way to use this library right now would be using https://github.com/sqlalchemy-redshift/sqlalchemy-redshift with psycopg2, if at all possible. That's what we do, and it works very well.
You'd lose the extra pandas behaviors they add on top, but if you look at their impl in their cursor.py, it's not really doing much. I'm reasonably certain you can use pd.read_sql and whatnot directly, using sqlalchemy engines instead, to the same effect.
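As a rough sketch of the sqlalchemy-redshift plus psycopg2 route suggested above: the helper names redshift_url and fetch_rows are hypothetical, and the host, port, and credentials are placeholders that would really come from the PMR fixture. It assumes the sqlalchemy, sqlalchemy-redshift, and psycopg2 packages are installed.

```python
from urllib.parse import quote_plus


def redshift_url(user, password, host, port, database):
    """Build a SQLAlchemy URL for the sqlalchemy-redshift dialect
    using the psycopg2 driver. Credentials are URL-escaped."""
    return (
        f"redshift+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def fetch_rows(url, sql):
    """Run a query through a SQLAlchemy engine and return all rows."""
    # Imported lazily so redshift_url works without SQLAlchemy installed.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return conn.execute(text(sql)).fetchall()


# Placeholder values; substitute whatever the test fixture provides.
url = redshift_url("user", "p@ss", "localhost", 5439, "dev")
```

With an engine created from such a URL, pd.read_sql(sql, engine) should work as well, although the exact behavior depends on the pandas and SQLAlchemy versions in use.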
I wasn't seeing an option to patch either. :( But thank you for affirming that. Unfortunately, we have a desire to not use sqlalchemy with this project because of performance issues. We aren't using Pandas either (mostly because of the huge package size) and wrote our own adapter around redshift_connector to simply return query results as a dictionary. Hmmm I'm going to keep thinking on this for a bit. I might have a few more questions before closing this issue out.
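For readers wondering what a "query results as a dictionary" adapter can look like without pandas or sqlalchemy, here is a minimal sketch built only on the DB-API 2.0 cursor interface (cursor.description and fetchall). The function name fetch_dicts is hypothetical and this is not the adapter described in the comment above. It uses the standard library's sqlite3 purely so the example runs anywhere; the same function should accept a psycopg2 or redshift_connector connection, since both follow DB-API 2.0.

```python
import sqlite3


def fetch_dicts(connection, sql, params=None):
    """Execute a query on any DB-API 2.0 connection and return
    each row as a dict keyed by column name."""
    cursor = connection.cursor()
    try:
        if params is None:
            cursor.execute(sql)
        else:
            cursor.execute(sql, params)
        # description holds one 7-item sequence per column; item 0 is the name.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# Demonstration against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")
conn.execute("INSERT INTO t VALUES (2, 'b')")
rows = fetch_dicts(conn, "SELECT id, name FROM t ORDER BY id")
```

Because the helper only touches the cursor, swapping the driver underneath (for example, psycopg2 against the postgres container in tests) should not require changes to calling code.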
i'm surprised that you'd see perf issues with sqlalchemy engine/connection.execute calls to tuples. particularly given that i'd expect psycopg2 (cextension) to be drastically faster than redshift_connector (pure python)
I'm going to close this, if you don't mind. There's nothing for us to patch in redshift_connector, and I'm very convinced that sqlalchemy with psycopg2 should be faster anyways, so that's really the best we can do (where postgres is still the underlying container), i think.
Describe the bug
When trying to connect to the pmr redshift container DB with redshift_connector, I get the following error.
ProgrammingError({'S': 'FATAL', 'V': 'FATAL', 'C': '42704', 'M': 'unrecognized configuration parameter "client_protocol_version"', 'F': 'guc.c', 'L': '5858', 'R': 'set_config_option'})
Environment
To Reproduce
Using redshift_connector, when redshift_connector.connect below is called from a test, I get the error.
Expected behavior
I would like to be able to connect to the Redshift fixture DB with redshift_connector.
Actual Behavior
Additional context