GoogleCloudPlatform / cloud-sql-nodejs-connector

A JavaScript library for connecting securely to your Cloud SQL instances
Apache License 2.0

Add support for lazy certificate refresh #285

Open enocom opened 8 months ago

enocom commented 8 months ago

When using the Connector in serverless environments where the CPU is throttled outside of request processing, the refresh cycle can break. This is a feature request to support a lazy refresh, where the Connector retrieves a new certificate once the existing one has expired and a new connection attempt has started.
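As a sketch of the requested behavior (the names here are illustrative, not the Connector's actual API), a lazy refresh checks the cached certificate's expiry at connect time and only then fetches a new one, so no background timer is needed:

```javascript
// Minimal sketch of lazy certificate refresh: no background refresh cycle,
// so a CPU-throttled serverless instance has nothing that can silently break.
// `fetchCert` and `LazyCertCache` are hypothetical names for illustration.
class LazyCertCache {
  constructor(fetchCert) {
    this.fetchCert = fetchCert; // async () => ({cert, expiresAt})
    this.cached = null;
  }

  // Called at the start of each connection attempt.
  async get(now = Date.now()) {
    if (!this.cached || this.cached.expiresAt <= now) {
      // Refresh only when the cert is missing or expired -- the "lazy" part.
      this.cached = await this.fetchCert();
    }
    return this.cached.cert;
  }
}
```

A connection attempt would call `get()` before dialing the instance; an idle instance performs no refresh work at all.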

rstreefland commented 3 months ago

Hi @hessjcg, I'm just wondering if you'd be able to provide any indication of when this might be available please?

This is currently blocking us from adopting the connector because we deploy lots of cloud functions and the cost to always allocate the CPU for all our functions would be prohibitive.

jackwotherspoon commented 3 months ago

> Hi @hessjcg, I'm just wondering if you'd be able to provide any indication of when this might be available please?
>
> This is currently blocking us from adopting the connector because we deploy lots of cloud functions and the cost to always allocate the CPU for all our functions would be prohibitive.

Hi @rstreefland! We just implemented this change in Java and Python, Node is up next on our list. We should be getting to it in the coming week or two. Once we do merge it, we will cut an immediate release to unblock you 😄

rstreefland commented 3 months ago

@jackwotherspoon Amazing, thank you!

rstreefland commented 2 months ago

Hi @jackwotherspoon. Sorry to nag, but do you have an update on this one please? It's the last issue blocking us from adopting this and IAM database authentication 🙏

jackwotherspoon commented 2 months ago

> Hi @jackwotherspoon. Sorry to nag, but do you have an update on this one please? It's the last issue blocking us from adopting this and IAM database authentication 🙏

@hessjcg will be the one working on this; some higher-priority work has taken precedence recently. He should be able to give a more accurate timeline for when he will get to this. Hopefully soon 😄

@rstreefland are you connecting over Public IP or Private IP to your Cloud SQL instance(s)?

If you are connecting over Private IP, I would recommend connecting directly to the Private IP address (or the DNS name if using PSC). That gives you reduced latency compared to using the Connector, and I can even show you how to still get IAM database authentication with a direct connection. Let me know if this would help in the meantime.

rstreefland commented 2 months ago

> I can even show you how to still get IAM database authentication with a direct connection

@jackwotherspoon We are connecting using Public IP currently, but could switch to Private IP if necessary. We're using Cloud Run and 2nd Generation Cloud Functions to serve our workloads, and my current understanding is that IAM database authentication is not currently possible without using the connector? If you're able to share an alternative solution, that would be incredibly useful please.

jackwotherspoon commented 2 months ago

> We are connecting using Public IP currently, but could switch to Private IP if necessary.

For Public IP, we recommend using a Connector or the Cloud SQL Auth Proxy for the security benefits, so that is a valid use case.

> We're using Cloud Run and 2nd Generation Cloud Functions to serve our workloads and my current understanding is that IAM database authentication is not currently possible without using the connector? If you're able to share an alternative solution, that would be incredibly useful please.

While we haven't yet added these samples to our official Google Cloud docs (we will be soon), for Private IP connections there is a way to connect directly and configure IAM database authentication.

Here is an example for Postgres using the node-postgres package.

const {Pool} = require('pg');
const {GoogleAuth} = require('google-auth-library');

const auth = new GoogleAuth({
  scopes: ['https://www.googleapis.com/auth/sqlservice.login'],
});

const pool = new Pool({
  host: '<Cloud SQL instance private ip>',
  user: 'sa-name@project-id.iam', // IAM service account, minus ".gserviceaccount.com"
  // The IAM principal's OAuth2 access token serves as the database password.
  password: async () => {
    return await auth.getAccessToken();
  },
  database: 'mydb',
  ssl: {
    require: true,
    rejectUnauthorized: false, // required for self-signed certs
    // https://node-postgres.com/features/ssl#self-signed-cert
  }
});

Essentially, IAM database authentication boils down to using an IAM principal's OAuth2 token as the database password. Most popular database drivers/packages allow the password to be a callable/function, which we can use to fetch a fresh OAuth2 token, thereby accomplishing IAM database authentication.

Hope this helps 😄

rstreefland commented 2 months ago

@jackwotherspoon Thanks for this. With this method, would the OAuth2 access token expiry need to be handled by my application code or does the node-postgres pool handle this?

jackwotherspoon commented 2 months ago

> would the OAuth2 access token expiry need to be handled by my application code or does the node-postgres pool handle this?

@rstreefland Great question!

I believe the pool will recycle the connection if the token is expired, but I'm not 100% sure that it will.

You could guarantee a fresh token with a 3600s lifetime by creating a new GoogleAuth object inside the password function (not ideal but works).

This, coupled with setting maxLifetimeSeconds on the Pool to something like 45 minutes, should guarantee the token is valid for the lifetime of the connection. https://github.com/brianc/node-postgres/pull/2698

const pool = new Pool({
  host: '<Cloud SQL instance private ip>',
  user: 'sa-name@project-id.iam',
  password: async () => {
    // Creating a new GoogleAuth client here guarantees a freshly minted
    // token with a full 3600s lifetime on every new connection.
    const auth = new GoogleAuth({
      scopes: ['https://www.googleapis.com/auth/sqlservice.login'],
    });
    return await auth.getAccessToken();
  },
  database: 'mydb',
  ssl: {
    require: true,
    rejectUnauthorized: false, // required for self-signed certs
    // https://node-postgres.com/features/ssl#self-signed-cert
  },
  maxLifetimeSeconds: 2700  // recycle connections after 45 minutes,
                            // well before the 60-minute token expiry
});

hessjcg commented 2 months ago

Hello all. Unfortunately this feature will not be available in the near future. The implementation is not as straightforward as we anticipated.

The connector configures the database driver options to use the connector to create new sockets to the database. Currently, the Postgres and MySQL database drivers allow you to configure a function that returns a net.Socket. To create the socket, the connector first needs to finish fetching the connection info for the database instance, which is an asynchronous operation.

To summarize the code as-is today:

const options = {
  // The driver calls this synchronously and expects a net.Socket back.
  stream: function() {
    const connectionInfo = Connector.getCachedConnectionInfo(instanceName)
    return net.connect(connectionInfo.ipAddress)
  }
}

In order to make lazy refresh work, the connector would need to asynchronously check whether the connection info is still valid, load it if needed, and only then instantiate the tls.TLSSocket. That would turn the stream function into an async function:

const options = {
  stream: async function() { // Not Allowed!
    const connectionInfo = await Connector.getOrLoadConnectionInfo(instanceName) 
    return net.connect(connectionInfo.ipAddress)
  }
}

Unfortunately, the Postgres and MySQL database drivers can't accept an async function here. We also don't want to implement a busy-wait loop on the connectionInfo promise, as that would be unidiomatic.

We are open to suggestions on how to implement this.