schireson / pytest-mock-resources

Pytest Fixtures that let you actually test against external resource (Postgres, Mongo, Redshift...) dependent code.
https://pytest-mock-resources.readthedocs.io/en/latest/quickstart.html
MIT License
179 stars 19 forks source link

Error opening connection from redshift_connector #161

Closed dude0001 closed 1 year ago

dude0001 commented 2 years ago

Describe the bug When trying to connect to pmr redshift container DB with redshift_connector , I get the following error.

ProgrammingError({'S': 'FATAL', 'V': 'FATAL', 'C': '42704', 'M': 'unrecognized configuration parameter "client_protocol_version"', 'F': 'guc.c', 'L': '5858', 'R': 'set_config_option'})

Environment

To Reproduce Using redshift_connector when redshift_connector.connect below is called from a test I get the error.

import pytest
from pytest_mock_resources import create_redshift_fixture
import redshift_connector

redshift_fixture = create_redshift_fixture()

@pytest.fixture()
def connection_fixture(monkeypatch, redshift_fixture):
    def get_connection():
        credentials = redshift_fixture.pmr_credentials
        connection = redshift_connector.connect(
            application_name="My App (Tests)",
            host=credentials.host,
            port=credentials.port,
            ssl=False,
            database=credentials.database,
            user=credentials.username,
            password=credentials.password,
        )
    return get_connection

Expected behavior I would like to be able to connect to the Redshift fixture DB with redshift_connector.

Actual Behavior

/home/mlambert/venvs/myapp/lib/python3.9/site-packages/requests/sessions.py:600: in get
    return self.request("GET", url, **kwargs)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/testclient.py:476: in request
    return super().request(
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/requests/sessions.py:587: in request
    resp = self.send(prep, **send_kwargs)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/requests/sessions.py:701: in send
    r = adapter.send(request, **kwargs)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/testclient.py:270: in send
    raise exc
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/testclient.py:267: in send
    portal.call(self.app, scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/anyio/from_thread.py:283: in call
    return cast(T_Retval, self.start_task_soon(func, *args).result())
/home/mlambert/.pyenv/versions/3.9.12/lib/python3.9/concurrent/futures/_base.py:446: in result
    return self.__get_result()
/home/mlambert/.pyenv/versions/3.9.12/lib/python3.9/concurrent/futures/_base.py:391: in __get_result
    raise self._exception
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/anyio/from_thread.py:219: in _call_func
    retval = await retval
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/applications.py:269: in __call__
    await super().__call__(scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/applications.py:124: in __call__
    await self.middleware_stack(scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/middleware/errors.py:184: in __call__
    raise exc
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/middleware/errors.py:162: in __call__
    await self.app(scope, receive, _send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/exceptions.py:93: in __call__
    raise exc
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/exceptions.py:82: in __call__
    await self.app(scope, receive, sender)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py:21: in __call__
    raise e
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py:18: in __call__
    await self.app(scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/routing.py:670: in __call__
    await route.handle(scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/routing.py:266: in handle
    await self.app(scope, receive, send)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/starlette/routing.py:65: in app
    response = await func(request)
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/routing.py:217: in app
    solved_result = await solve_dependencies(
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/dependencies/utils.py:525: in solve_dependencies
    solved = await solve_generator(
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/dependencies/utils.py:449: in solve_generator
    return await stack.enter_async_context(cm)
/home/mlambert/.pyenv/versions/3.9.12/lib/python3.9/contextlib.py:575: in enter_async_context
    result = await _cm_type.__aenter__(cm)
/home/mlambert/.pyenv/versions/3.9.12/lib/python3.9/contextlib.py:181: in __aenter__
    return await self.gen.__anext__()
/home/mlambert/venvs/myapp/lib/python3.9/site-packages/fastapi/concurrency.py:30: in contextmanager_in_threadpool
    raise e
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <contextlib._GeneratorContextManager object at 0x7f1debbb5610>
typ = <class 'redshift_connector.error.ProgrammingError'>
value = ProgrammingError({'S': 'FATAL', 'V': 'FATAL', 'C': '42704', 'M': 'unrecognized configuration parameter "client_protocol_version"', 'F': 'guc.c', 'L': '5858', 'R': 'set_config_option'})
traceback = None

    def __exit__(self, typ, value, traceback):
        if typ is None:
            try:
                next(self.gen)
            except StopIteration:
                return False
            else:
                raise RuntimeError("generator didn't stop")
        else:
            if value is None:
                # Need to force instantiation so we can reliably
                # tell if we get the same exception back
                value = typ()
            try:
>               self.gen.throw(typ, value, traceback)
E               redshift_connector.error.ProgrammingError: {'S': 'FATAL', 'V': 'FATAL', 'C': '42704', 'M': 'unrecognized configuration parameter "client_protocol_version"', 'F': 'guc.c', 'L': '5858', 'R': 'set_config_option'}

Additional context

dude0001 commented 2 years ago

I found an issue including this same error in localstack https://github.com/localstack/localstack/issues/6131. The author of the issue said it was suggested to him "Also talked to the Localstack team through the Pro channel and they've also mentioned that a temporarily fix could be removing the following 3 lines from the redshift connector library: https://github.com/aws/amazon-redshift-python-driver/blob/3906211974c00d2d36a470eb0536b8f96681c158/redshift_connector/core.py#L534-L536".

I remove those 3 lines from my local copy of the package, and it resolves the issue here as well. I'm not sure what this means or how I can work around this. Is there something that needs setup in the pmr redshift container to permanently resolve this? The localstack issue was resolved as fixed in a later version, but I am unclear what the fix was. I was hoping a similar fix could be implement in PMR and that it might be obvious to someone what the issue is.

image

DanCardin commented 2 years ago

Right now, we technically only expose direct redshift access through sqlalchemy, through https://github.com/sqlalchemy-redshift/sqlalchemy-redshift. Once you're attempting to connect through redshift_connector, you're leaving behind PMR's internals and connecting directly to the container yourself.

The underlying container we start up is a postgres container (because there's no container equivalent of redshift), so my assumption would be that redshift_connector is sending extra parameters that postgres doesn't understand. And looking at the implementation at https://github.com/aws/amazon-redshift-python-driver/blob/master/redshift_connector/core.py, the body of that function is like 400 lines long without any externally visible state to patch out init_params. So you might be out of luck using redshift_connector here unless they're willing to alter their connection mechanism, because it seems to not be compatible with the underlying postgres container we use.

The easiest way to use this library right now, would be using https://github.com/sqlalchemy-redshift/sqlalchemy-redshift with psyopg2 if at all possible. That's what we do, and it works very well.

You'd lose the extra pandas behaviors they add on top, but if you look at their impl in their cursor.py, it's not really doing much. I'm reasonably certain you can use pd.read_sql and whatnot directly; using sqlalchemy engines instead to the same effect.

dude0001 commented 2 years ago

I wasn't seeing an option to patch either. :( But thank you for affirming that. Unfortunately, we have a desire to not use sqlalchemy with this project because of performance issues. We aren't using Pandas either (mostly because of the huge package size) and wrote our own adapter around redshift_connector to simply return query results as a dictionary. Hmmm I'm going to keep thinking on this for a bit. I might have a few more questions before closing this issue out.

DanCardin commented 2 years ago

i'm surprised that you'd see perf issues with sqlalchemy engine/connection.execute calls to tuples. particularly given that i'd expect psycopg2 (cextension) to be drastically faster than redshift_connector (pure python)

DanCardin commented 1 year ago

I'm going to close this, if you dont mind. There's nothing for us to patch in redshift_connector, and I'm very convinced that sqlalchemy with psycopg2 should be faster anyways, so that's really the best we can do (where postgres is still the underlying container), i think.