MagicStack / asyncpg

A fast PostgreSQL Database Client Library for Python/asyncio.

Extremely long delay grabbing type info for string array (and likely other types) on CockroachDB #1158

Open JamesHutchison opened 3 months ago

JamesHutchison commented 3 months ago

I was seeing atrocious, unacceptable delays in my production environment (CockroachDB Cloud) that I wasn't seeing locally. The cause turned out to be the introspection of types. I'm using the latest version of the CockroachDB drivers.

[screenshot: recursion-statements]

I wrote this hack to work around the issue. It caches the introspection result in local memory and also in Redis, so that new instances don't hit the delay either. You can change the Redis key through an environment variable so that new versions aren't locked to old values.

If someone wants to turn it into part of the product, please be my guest; I won't have time for it. In the meantime, here's the hack that does the monkey patching:

import asyncio
import copy
import os
from typing import Any, Self

import asyncpg
import orjson

# redis_client() is assumed to be your own helper returning an async Redis client
# (e.g. redis.asyncio.Redis); it is not defined here.

introspection_result_cache: dict[tuple[str, int, str], Any] = {}

orig_introspection_types = asyncpg.Connection._introspect_types

INTROSPECTION_KEY = os.environ.get(
    "ASYNCPG_INTROSPECTION_CACHE_KEY", "ASYNCPG_INTROSPECTION_CACHE_KEY"
)

introspection_lock = asyncio.Lock()

class FauxResult:
    # Stands in for the asyncpg Record rows returned by _introspect_types so the
    # cached data can be fed back to asyncpg's type resolution. Fields listed in
    # _binary_fields hold bytes and must be decoded/encoded to survive JSON.
    _binary_fields = ("kind", "elemdelim")
    column_order = [
        "oid",
        "ns",
        "name",
        "kind",
        "basetype",
        "elemtype",
        "elemdelim",
        "range_subtype",
        "attrtypoids",
        "attrnames",
        "depth",
        "basetype_name",
        "elemtype_name",
        "range_subtype_name",
    ]

    def __init__(self, row=None, data: dict | None = None) -> None:
        if row:
            self.data = dict(row)
        else:
            assert data
            self.data = data

    def __getattr__(self, name: str) -> Any:
        return self.data[name]

    def __getitem__(self, idx_or_column_name: int | str) -> Any:
        if isinstance(idx_or_column_name, int):
            return self.data[self.column_order[idx_or_column_name]]
        return self.data[idx_or_column_name]

    def for_serialization(self) -> dict:
        result = copy.copy(self.data)
        for field in self._binary_fields:
            if (value := self.data.get(field)) is not None:
                result[field] = value.decode()
        return result

    @classmethod
    def from_serialization(cls, data: dict) -> Self:
        for field in cls._binary_fields:
            if (value := data.get(field)) is not None:
                data[field] = value.encode()
        return cls(data=data)

class FauxPreparedStatementState:
    # Minimal stand-in for the prepared-statement state object that
    # _introspect_types returns alongside the rows; only .name is preserved.
    def __init__(self, name) -> None:
        self.name = name

async def to_redis_cache(
    host: str, port: int, database: str, inspection_types: tuple[list, Any]
) -> None:
    # Serialize the rows and the prepared-statement name under a per-database key.
    pss = FauxPreparedStatementState(inspection_types[1].name)
    results = [FauxResult(row) for row in inspection_types[0]]
    await redis_client().set(
        INTROSPECTION_KEY + f"-{host}-{port}-{database}",
        orjson.dumps([[result.for_serialization() for result in results], pss.name]),
    )

async def from_redis_cache(host: str, port: int, database: str) -> tuple[list, Any] | None:
    data = await redis_client().get(INTROSPECTION_KEY + f"-{host}-{port}-{database}")
    if data is None:
        return None
    results, pss_name = orjson.loads(data)
    pss = FauxPreparedStatementState(pss_name)
    return [FauxResult.from_serialization(row) for row in results], pss

def apply_introspection_caching():

    async def new_introspect_types(self, *args, **kwargs) -> Any:
        host: str
        port: int
        database: str
        host, port = self._addr
        database = self._params.database
        # 1. In-process cache.
        if (cached_val := introspection_result_cache.get((host, port, database))) is not None:
            return cached_val
        # 2. Shared Redis cache; the lock keeps concurrent connections from
        #    stampeding Redis while the first lookup is in flight.
        async with introspection_lock:
            redis_cached_value = await from_redis_cache(host, port, database)
        if redis_cached_value is not None:
            introspection_result_cache[host, port, database] = redis_cached_value
            return redis_cached_value
        # 3. Miss everywhere: run the real (slow) introspection and publish the result.
        result = await orig_introspection_types(self, *args, **kwargs)
        await to_redis_cache(host, port, database, result)
        return result

    asyncpg.Connection._introspect_types = new_introspect_types
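
For reference, a minimal sketch of how the patch might be wired in at application startup; the DSN and pool sizes are placeholders, and the array query is just the kind of statement that triggered the slow introspection in the first place:

import asyncio

import asyncpg

async def main() -> None:
    # Patch before any connections exist so every connection takes the cached path.
    apply_introspection_caching()

    pool = await asyncpg.create_pool(
        "postgresql://user:pass@my-cockroach-host:26257/defaultdb",  # placeholder DSN
        min_size=1,
        max_size=10,
    )
    async with pool.acquire() as conn:
        # A string-array parameter is exactly what forces type introspection.
        await conn.fetch("SELECT $1::text[]", ["a", "b"])
    await pool.close()

asyncio.run(main())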

elprans commented 1 month ago

The introspection query isn't that complex and typically runs in under a millisecond on regular Postgres. This is something you might want to raise with Cockroach Labs people.
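
For what it's worth, a rough way to check where the time goes on your end (placeholder DSN; the array parameter is what forces type introspection on first use):

import asyncio
import time

import asyncpg

async def main() -> None:
    conn = await asyncpg.connect("postgresql://user:pass@localhost:5432/postgres")
    try:
        start = time.perf_counter()
        # First use of a string-array parameter: asyncpg has to introspect the type.
        await conn.fetch("SELECT $1::text[]", ["x"])
        first = time.perf_counter() - start

        start = time.perf_counter()
        # Second use: the codec is already cached on the connection.
        await conn.fetch("SELECT $1::text[]", ["y"])
        second = time.perf_counter() - start

        print(f"first call: {first:.3f}s, repeat: {second:.3f}s")
    finally:
        await conn.close()

asyncio.run(main())

On regular Postgres the two numbers should be nearly identical; a large gap on the first call points at the introspection query.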

tamis-laan commented 1 month ago

@elprans

I also get a 7-second delay every time a connection is opened. This is really annoying given that I use a connection pool. It only happens with CockroachDB; it doesn't happen with PostgreSQL.

On the other hand, the problem does not occur when using SQLAlchemy, so I think the problem is on the asyncpg side.

JamesHutchison commented 1 month ago

I reached out to CockroachDB about this.

inata4 commented 1 month ago

@JamesHutchison could you please reach out to CockroachDB support via our ticketing system (https://support.cockroachlabs.com/hc/en-us) so we can provide you with the best support? It looks like you emailed another organization within CRL. Thank you!

JamesHutchison commented 1 month ago

Created ticket

mohan-crdb commented 1 month ago

@JamesHutchison - I can see the ticket has been created, but we are unsure about the organization that you are using. Could you please share the organization details in the existing ticket, or create the ticket under the correct organization?

tamis-laan commented 4 weeks ago

Any new information on this? I still get delays everywhere in my app.

JamesHutchison commented 4 weeks ago

They are aware of the issue:

https://github.com/cockroachdb/cockroach/issues/113292

My recommendation is to use my workaround in the meantime.

tamis-laan commented 3 weeks ago

@JamesHutchison I can't. I get a 7-second delay on every connection opened to CockroachDB, not just when grabbing type info. Not sure if we have the same problem, but they are probably related.

tamis-laan commented 3 weeks ago

So apparently this issue might be related to asyncpg's introspection queries: https://github.com/cockroachdb/cockroach/issues/128774 https://github.com/MagicStack/asyncpg/blob/85d7eed40637e7cad73a44ed2439ffeb2a8dc1c2/asyncpg/introspection.py#L14-L88

Is there a way to roll back to a version without this issue until the problem is fixed?
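
A rough way to see the cost of that specific query directly is to run asyncpg's bundled introspection query by hand and time it (placeholder DSN; INTRO_LOOKUP_TYPES is the query linked above, though the constant name may differ between asyncpg versions):

import asyncio
import time

import asyncpg
from asyncpg import introspection

async def main() -> None:
    conn = await asyncpg.connect("postgresql://user:pass@localhost:26257/defaultdb")
    try:
        start = time.perf_counter()
        # 25 is the OID of text; the real code passes whatever OIDs it needs to resolve.
        await conn.fetch(introspection.INTRO_LOOKUP_TYPES, [25])
        print(f"introspection query took {time.perf_counter() - start:.3f}s")
    finally:
        await conn.close()

asyncio.run(main())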

JamesHutchison commented 3 weeks ago

If you're using Redis, or really anything that can store the serialized result of the query, my hack works pretty well.
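
If you'd rather not patch asyncpg globally, the same idea can be scoped to a Connection subclass and passed in via connection_class. This is only a sketch: _introspect_types is a private API and may change between asyncpg versions, and this version uses just the in-memory cache from my snippet above.

import asyncpg

class CachingIntrospectionConnection(asyncpg.Connection):
    async def _introspect_types(self, *args, **kwargs):
        # Reuses the module-level introspection_result_cache defined earlier.
        host, port = self._addr
        database = self._params.database
        key = (host, port, database)
        if (cached := introspection_result_cache.get(key)) is not None:
            return cached
        result = await super()._introspect_types(*args, **kwargs)
        introspection_result_cache[key] = result
        return result

# e.g. pool = await asyncpg.create_pool(dsn, connection_class=CachingIntrospectionConnection)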

tamis-laan commented 6 days ago

@JamesHutchison Not using Redis, I'm afraid; I hope this issue gets resolved soon. It surprises me that there aren't more people with this issue.

JamesHutchison commented 6 days ago

Just save it to a file or something; whatever non-DB persistence you have available.
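
Something like this would do as a drop-in for the two Redis helpers above (swap the calls in new_introspect_types accordingly). The cache path is a placeholder, FauxResult and FauxPreparedStatementState are the classes from my earlier snippet, and the file I/O is synchronous but only happens once per process:

import os
from typing import Any

import orjson

CACHE_PATH = os.environ.get("ASYNCPG_INTROSPECTION_CACHE_FILE", "/tmp/asyncpg_introspection.json")

def _load_all() -> dict:
    try:
        with open(CACHE_PATH, "rb") as f:
            return orjson.loads(f.read())
    except (FileNotFoundError, orjson.JSONDecodeError):
        return {}

# async only to keep the same call signature as the Redis versions.
async def to_file_cache(
    host: str, port: int, database: str, inspection_types: tuple[list, Any]
) -> None:
    data = _load_all()
    results = [FauxResult(row) for row in inspection_types[0]]
    data[f"{host}-{port}-{database}"] = [
        [result.for_serialization() for result in results],
        inspection_types[1].name,
    ]
    with open(CACHE_PATH, "wb") as f:
        f.write(orjson.dumps(data))

async def from_file_cache(host: str, port: int, database: str) -> tuple[list, Any] | None:
    entry = _load_all().get(f"{host}-{port}-{database}")
    if entry is None:
        return None
    rows, pss_name = entry
    return [FauxResult.from_serialization(row) for row in rows], FauxPreparedStatementState(pss_name)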