MagicStack / asyncpg

A fast PostgreSQL Database Client Library for Python/asyncio.
Apache License 2.0

TimeoutError on multiple requests #1076

Open petritavd opened 1 year ago

petritavd commented 1 year ago

We're using FastAPI, and our current code for setting up the db connection is:

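from typing import Any, Dict

from sqlalchemy import event
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

connection_arguments: Dict[str, Any] = {"server_settings": {"jit": "off"}}

# (schema, type name) pairs for the enum types to register
ENUM_TYPES = (
    ("public", "sometype"), ...
)

async def register_enum_types(conn) -> None:
    # Alias each enum type to the built-in "text" codec
    for schema, typename in ENUM_TYPES:
        await conn.set_builtin_type_codec(
            typename, schema=schema, codec_name="text"
        )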

engine = create_async_engine(
    settings.uri,
    echo=False,
    connect_args=connection_arguments,
    pool_size=200,
    max_overflow=10,
)

async_sessionmaker = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

@event.listens_for(engine.sync_engine, "connect")
def on_connect(dbapi_connection, connection_record):
    dbapi_connection.run_async(register_enum_types)

We added register_enum_types because we thought the following query, which we saw a lot in AWS RDS monitoring, was causing the issue, but now we're not sure what the cause is anymore:

SELECT t.oid, t.typelem AS elemtype, t.typtype AS kind FROM pg_catalog.pg_type AS t WHERE t.oid = $1

When it happens, we see this in the logs:

File "/path/to/file/python3.10/site-packages/asyncpg/connect_utils.py", line 780, in _connect_addr
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

After some time, it goes back to normal.

ilirosmanaj commented 1 year ago

@elprans, any ideas on the above? It's happening pretty consistently and we've tried basically everything to get this sorted. Some input would be very welcome.

elprans commented 1 year ago

This seems like a connection timeout, not a query timeout, so I'd investigate the network configuration. How are you hosting your app?

Also, why do you need to register the codec alias for the enum types? Asyncpg already decodes them as text by default.
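To illustrate the distinction between a connection timeout and a query timeout, here is a minimal standard-library sketch. It does not use asyncpg and is not asyncpg's implementation; `open_connection` and `connect_with_timeout` are hypothetical stand-ins. The point is only that a timeout raised while the connection is still being established fires before any SQL is sent, so the query itself is not what timed out.

```python
import asyncio


async def open_connection(delay: float) -> str:
    # Hypothetical stand-in for the network handshake with the server;
    # `delay` simulates how long the handshake takes.
    await asyncio.sleep(delay)
    return "connected"


async def connect_with_timeout(delay: float, timeout: float) -> str:
    # Bound only the connection phase. No query has been issued yet,
    # so a timeout here says nothing about query speed.
    try:
        return await asyncio.wait_for(open_connection(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return "connect timeout"


async def main() -> tuple:
    fast = await connect_with_timeout(delay=0.01, timeout=0.5)
    slow = await connect_with_timeout(delay=0.5, timeout=0.05)
    return fast, slow


results = asyncio.run(main())
print(results)
```

If the real failure has this shape, the things to look at would be whatever slows or blocks new connections (network path, server connection limits, load on the server while it accepts connections) rather than the statements being run.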

ilirosmanaj commented 1 year ago

The app is a FastAPI app running on EC2. We use uvicorn and start about 10 workers. The Postgres instance is on AWS RDS (instance type db.m5.2xlarge); see the instance details below:

image

What we notice is that the "Average active sessions" metric sometimes spikes even when usage is not that high, and this causes these timeouts.

image

The reason for registering was that we sometimes notice spikes of the following query (even though we have jit set to off), and as part of some investigations this was a proposed solution (but it didn't help). The query:

SELECT t.oid, t.typelem AS elemtype, t.typtype AS kind FROM pg_catalog.pg_type AS t WHERE t.oid = $1.

I am not sure if the connection pool is being drained here but it's highly unlikely in my opinion.

Let me know your thoughts @elprans ✋

ilirosmanaj commented 1 year ago

Any ideas @elprans? Sorry for bugging you but it's quite an interesting thing.

elprans commented 1 year ago

Could it be that you've restarted your app at that time and all workers had their connection pools populated?
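As a back-of-the-envelope check on this question, the configuration posted earlier in the thread bounds how many connections the app could open. A SQLAlchemy engine with the default queue pool allows up to pool_size + max_overflow connections, and each uvicorn worker is a separate process with its own engine. The sketch below uses the numbers from the thread (pool_size=200, max_overflow=10, about 10 workers); the helper name is hypothetical, and the server's actual max_connections value is not given in the thread, so it is not compared here.

```python
def max_client_connections(workers: int, pool_size: int, max_overflow: int) -> int:
    # Each worker process owns one engine, and each engine may hold
    # up to pool_size + max_overflow connections at once.
    return workers * (pool_size + max_overflow)


# Values taken from the configuration and worker count quoted in the thread.
upper_bound = max_client_connections(workers=10, pool_size=200, max_overflow=10)
print(upper_bound)  # 2100
```

This is an upper bound, not a measurement: the pool opens connections on demand, so the number actually in use depends on traffic. Comparing it with the server's configured max_connections would show whether the pools could exhaust the server under load.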

elprans commented 1 year ago

> The reason for registering was that we sometimes notice spikes of the following query (even though we have jit set to off), and as part of some investigations this was a proposed solution (but it didn't help). The query:
>
> SELECT t.oid, t.typelem AS elemtype, t.typtype AS kind FROM pg_catalog.pg_type AS t WHERE t.oid = $1.

This happens because you are registering codecs; the query is executed by set_type_codec/set_builtin_type_codec.

zagortenej024 commented 6 months ago

Hey @ilirosmanaj, did you resolve this by any chance? We're having the exact same issue.