GoogleCloudPlatform / cloud-sql-nodejs-connector

A JavaScript library for connecting securely to your Cloud SQL instances
Apache License 2.0

password error and timeout error on the server #222

Closed: djMax closed this issue 1 year ago

djMax commented 1 year ago

Bug Description

I have two apps that are almost exactly the same (same infra, same package versions, same db config code). They connect to two different CloudSQL Postgres instances, but with the same PG versions and configurations as far as I can tell in the GCP console.

Versions: pg-pool is 3.6.1, pg is 8.11.3, cloud-sql-nodejs-connector is 0.5.1.

The connection code is:

    const connector = new Connector();
    const clientOpts = await connector.getOptions({
      instanceConnectionName: host,
      ipType: IpAddressTypes.PRIVATE,
      authType: AuthTypes.IAM,
    });

    return new Pool({
      max: maxPoolSize || 5,
      database,
      user,
    });

I've verified db and user are correct in both cases. On the client side, I immediately get this error:

Uncaught Error: SASL: SCRAM-SERVER-FIRST-MESSAGE: client password must be a string
    at /nodejs/app/node_modules/pg-pool/index.js:45:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async PostgresDriver.acquireConnection (/nodejs/app/node_modules/kysely/dist/cjs/dialect/postgres/postgres-driver.js:21:24)
    at async RuntimeDriver.acquireConnection (/nodejs/app/node_modules/kysely/dist/cjs/driver/runtime-driver.js:46:28)
    at async DefaultConnectionProvider.provideConnection (/nodejs/app/node_modules/kysely/dist/cjs/driver/default-connection-provider.js:10:28)
    at async DefaultQueryExecutor.executeQuery (/nodejs/app/node_modules/kysely/dist/cjs/query-executor/query-executor-base.js:36:16)
    at async SelectQueryBuilderImpl.execute (/nodejs/app/node_modules/kysely/dist/cjs/query-builder/select-query-builder.js:295:24)
    at async REPL1:1:33

In CloudSQL logs I see this after the process ends:

2023-09-26 22:03:07.306 UTC [3124945]: [1-1] db=mydb,user=myaimaccount@myproject.iam FATAL:  canceling authentication due to timeout

Any ideas? I did cross-coding and verified that trying to connect to the "bad" db with the "good" code also fails. So I suspect something is wrong with the CloudSQL db, but I don't know what.

djMax commented 1 year ago

I have a feeling this is a mismatch in the user account on the db, where it's missing the domain name. So I think the bug MIGHT be just that the error is really, really useless.

edosrecki commented 1 year ago

@djMax, you are not passing clientOpts when creating the Pool.

jackwotherspoon commented 1 year ago

Hi @djMax, I believe the issue is as @edosrecki mentioned: it seems you are not passing clientOpts to the call to new Pool().

Can you try updating your code to the following:

    const pool = new Pool({
      ...clientOpts,
      user: 'test-sa@test-project.iam',
      database: 'db-name',
      max: 5
    });

Let me know if the change solves the problem. This snippet was taken from our README.

djMax commented 1 year ago

Sorry, that was a bad cut-and-paste while I was trying to remove extraneous detail; clientOpts was being passed. The real problem was that the service account Terraform had mapped in the "bad" case was wrong: it was missing the @test-project.iam bit. For whatever reason, the failure wasn't saying "login failed for that SA". It was saying this other weird thing, and then timing out on the SQL side.
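
For later readers: the root cause described above was a database user name missing its `@<project-id>.iam` suffix. With IAM database authentication on Cloud SQL for PostgreSQL, a service account's database user name is its email with the `.gserviceaccount.com` suffix removed. A minimal sketch of a pre-flight check that could run before `new Pool()` follows. The helper name `checkIamServiceAccountUser` is hypothetical and not part of the connector or `pg`, and the check assumes the database user is a service account rather than a human IAM user.

```javascript
// Hypothetical helper; NOT part of cloud-sql-nodejs-connector or pg.
// Assumption: the database user is a service account using IAM database
// authentication on Cloud SQL for PostgreSQL, so its user name should
// look like "<sa-name>@<project-id>.iam".
function checkIamServiceAccountUser(user) {
  if (typeof user !== 'string' || user.length === 0) {
    throw new Error('Database user must be a non-empty string');
  }
  // The full service account email is not the Postgres user name.
  if (user.endsWith('.gserviceaccount.com')) {
    throw new Error(
      `"${user}" looks like a full service account email; ` +
        'drop the ".gserviceaccount.com" suffix'
    );
  }
  // Catches the case from this thread: the "@<project-id>.iam" part is missing.
  if (!/^[^@\s]+@[^@\s]+\.iam$/.test(user)) {
    throw new Error(
      `"${user}" is missing the "@<project-id>.iam" suffix ` +
        'expected for an IAM service account database user'
    );
  }
  return user;
}
```

Calling `checkIamServiceAccountUser(user)` on the value passed as `user` to `new Pool()` would surface a misconfigured name locally, before the SASL error and the server-side timeout.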

mitchellwarr commented 7 months ago

This issue is really good to find; we just encountered the same thing on our project. It would save a lot of confusion if the error mentioned that the IAM user failed first, before falling back to a user/password connection.
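
Until the error itself is more descriptive, an application can attach its own hint. The following is only a sketch: `explainConnectError` is a hypothetical helper, not part of `pg` or the connector, and the hint text rests solely on what was reported in this thread (the SASL error appeared when the IAM database user name was wrong).

```javascript
// Hypothetical helper; NOT part of pg or cloud-sql-nodejs-connector.
// The fragment below is copied from the error message reported in this thread.
const SASL_PASSWORD_FRAGMENT = 'client password must be a string';

function explainConnectError(err, user) {
  const message = err && typeof err.message === 'string' ? err.message : '';
  if (!message.includes(SASL_PASSWORD_FRAGMENT)) {
    return err; // unrelated error: pass it through unchanged
  }
  // Assumption based on this thread: the database user name may be wrong,
  // for example missing its "@<project-id>.iam" suffix.
  const hinted = new Error(
    `${message} (hint: with IAM authentication, check that the database ` +
      `user "${user}" exists on the instance and includes its ` +
      '"@<project-id>.iam" suffix)'
  );
  hinted.cause = err; // keep the original error for debugging
  return hinted;
}
```

One possible use is to wrap the first connection attempt, for example `pool.connect().catch((err) => { throw explainConnectError(err, user); })`, so the hint appears alongside the original message.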