Closed GusPrice closed 5 months ago
A few questions:
Thanks for the questions!
The specific error I saw was:
dial error (failed to get instance: Refresh error: failed to get instance metadata
(connection name = "project:region:instance"): googleapi: Error 429: Quota exceeded for quota metric
'Connect Queries' and limit 'Connect Queries per minute per user per region' of
service 'sqladmin.googleapis.com' for consumer 'project_number:123456`
Currently creating one dialer per tenant/pgxpool. Would it be reasonable to share the dialer across pools, that seems like it might be the solution here if it stores the certs by instance and not DSN (since our DSNs are different)
Sure! Here's a slightly sanitzed verison of the code we use to instantiate our pool:
func openDB(ctx context.Context, dsn, connName string, opts ...PgxPoolOption) (*sql.DB, error) {
// create a pgxpool config and apply options and config
cfg, err := pgxpool.ParseConfig(dsn)
if err != nil {
return nil, err
}
for _, opt := range opts {
opt(cfg)
}
// Create a new dialer with option for private IP
d, err := cloudsqlconn.NewDialer(ctx, cloudsqlconn.WithDefaultDialOptions(cloudsqlconn.WithPrivateIP()))
if err != nil {
return nil, err
}
// Tell the driver to use the Cloud SQL Go Connector to create connections
cfg.ConnConfig.DialFunc = func(ctx context.Context, _ string, instance string) (net.Conn, error) {
return d.Dial(ctx, connName)
}
// instantiate the pgxpool from the config
connPool, err := pgxpool.NewWithConfig(ctx, cfg)
if err != nil {
return nil, errors.Wrap(err, "call to pgx.NewConnPool failed")
}
// retrieve a *sql.DB from the pgxpool
db := stdlib.OpenDBFromPool(connPool)
return db, nil
}
This function get's called in a for loop across our tenants and makes a map:
for _, t := range tenants {
connectionMap[t.ID] = openDB(ctx, t.DSN, connName)
}
we pass use this connection map to select an appropriate connection based on tenant being accessed.
Currently creating one dialer per tenant/pgxpool. Would it be reasonable to share the dialer across pools, that seems like it might be the solution here if it stores the certs by instance and not DSN (since our DSNs are different)
Yes -- that's your best option here. The dialer caches client certificates on a per instance basis, not a per database basis. So you'll immediately solve your problem this way. You can continue to configure your pgxpools to call through to d.Dial
to keep the pool per tenant design.
Ah, wow, love when it's just a super easy solution. I will give this new approach a whirl in the next couple days and come back to confirm success. Thank you!
Let me know how that goes. We can explore additional options if you hit quota issues again.
Sorry two more tangentially related questions:
main
function or similar. When you close the dialer, it stops accepting new connections and stops the background certificate refresh goroutines.Live with the shared dialer and things seem to be going much more smoothly. Down to ~50 requests during service deploy (across a variety of services) from what I think peaked a bit over 5000 before.
Thanks again for the help!
Happy to hear it! :clap:
Question
When I first implemented cloudsqlconn I ran into API rate limiting issues and had to rollback my changes. I am looking for advice or tips on how to best handle the following set up:
Currently, on application start, we create a new pool for each tenant that connects specifically to its database on the instance using its own specific db user and password. This means that we have 150 5 (2 maybe depending on if the certs are cached) requests minimum (750 requests) when we do a new deployment. We have additional requests around the same time for one time actions on deployment, like database migrations. So we quickly blew through the 1000 requests/min api limit.
I understand that the connector has caching for the certificate information etc, but is there any way to share that across connector instances/pgxpools? I'm struggling to see how we could use this tool in our current architecture or how to change our architecture to a better configuration.
Code
No response
Additional Details
I left out any code because our usage is more or less a direct copy paste of the examples in this repo.