vapor / fluent

Vapor ORM (queries, models, and relations) for NoSQL and SQL databases
https://docs.vapor.codes/4.0/fluent/overview/

Database connection timing out when running fluent queries in parallel #771

Closed by henrypratt 7 months ago

henrypratt commented 7 months ago

I am using a TaskGroup to run Fluent queries in parallel, and I keep getting the following error:

[ ERROR ] Connection request (ID 6471) timed out. This might indicate a connection deadlock in your application. If you have long-running requests, consider increasing your connection timeout. [database-id: psql]

Below is a code sample:

        await withTaskGroup(of: Void.self) { group in
            // Issue one `first()` query per symbol concurrently against the same database.
            for symbol in symbols {
                group.addTask {
                    do {
                        if let stock = try await Stock.query(on: db)
                            .filter(\.$symbol == symbol)
                            .first() {
                            print(stock.symbol)
                        } else {
                            print("Error")
                        }
                    } catch {
                        fatalError("Caught error!")
                    }
                }
            }

            // Wait for every child task to finish before returning.
            await group.waitForAll()
        }

I am not having this issue with writes, just queries.
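
As context for the "consider increasing your connection timeout" hint in the error, that timeout belongs to Fluent's connection pool and is set where the database is configured. A minimal sketch of raising it, assuming FluentPostgresDriver; the hostname, credentials, and exact parameter names below are illustrative and may differ between driver versions:

import Fluent
import FluentPostgresDriver
import Vapor

// Sketch only: widen the pool and lengthen the wait the error message refers to.
func configureDatabase(_ app: Application) {
    app.databases.use(.postgres(
        hostname: "localhost",
        username: "vapor_username",
        password: "vapor_password",
        database: "vapor_database",
        maxConnectionsPerEventLoop: 4,       // default is 1 connection per event loop
        connectionPoolTimeout: .seconds(30)  // default wait for a free connection is 10 seconds
    ), as: .psql)
}

Note that this only papers over the symptom; the actual fix suggested below is to stop issuing one query per symbol.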

gwynne commented 7 months ago

This is a known issue caused by issuing a large number of queries in parallel; it will be solved once the new PostgresNIO connection pool module has been adopted throughout Fluent. In the meantime, the access pattern you're using is extremely inefficient (parallelized queries are very rarely advantageous in practice unless your configuration is very heavily tuned). You can achieve the same effect considerably faster, and with a lot less load on the database, this way:

// It is assumed here that the `symbol` field is a String or other simple type.
// `~~` is Fluent's "value is in the given set" filter, so this fetches every
// matching stock in a single query.
let stocks: [Stock] = try await Stock
    .query(on: db)
    .filter(\.$symbol ~~ symbols)
    .all()
gwynne commented 7 months ago

Duplicate of #716

henrypratt commented 7 months ago

Thank you for the fast response! I do not have a lot of experience with databases, so I was unaware that this was inefficient. I need to create a dictionary of UUIDs and some other data. So it sounds like it would be better to query all the stocks at once, then add their IDs to a dictionary and keep that in memory?
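
A minimal sketch of that query-then-dictionary approach, assuming the `Stock` model from the earlier snippets has an optional `UUID` id and that each symbol appears at most once (field names are illustrative):

// One round trip to the database, then an in-memory lookup keyed by symbol.
let stocks = try await Stock.query(on: db)
    .filter(\.$symbol ~~ symbols)
    .all()

// Build a symbol -> id dictionary. `compactMap` drops any row whose id is nil,
// and `uniqueKeysWithValues` assumes symbols are unique in the result set.
let idsBySymbol: [String: UUID] = Dictionary(
    uniqueKeysWithValues: stocks.compactMap { stock in
        stock.id.map { (stock.symbol, $0) }
    }
)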

Also, do you have any recommended reading about why concurrent queries are inefficient and why this pattern should be avoided? Thanks!