GoogleCloudPlatform / cloud-sql-proxy

A utility for connecting securely to your Cloud SQL instances
Apache License 2.0

Cannot connect to a psql instance with private service connect #2337

Open nicolasThal opened 4 days ago

nicolasThal commented 4 days ago

Bug Description

Hello,

I have created a database that uses Private Service Connect. I have set up the forwarding rules so that I am able to connect to the database directly from a different project.

To be able to connect to the database directly from my local machine, I wanted to set up a cloud-sql-proxy.

I have created an instance in the same project as the database, and I launch the command: cloud-sql-proxy PROJECT_ID:REGION:INSTANCE_NAME --psc

The log output is the following:

2024/11/25 14:15:19 The proxy has started successfully and is ready for new connections!
2024/11/25 14:15:27 [PROJECT_ID:REGION:INSTANCE_NAME] accepted connection from 127.0.0.1:59958
2024/11/25 14:15:57 [PROJECT_ID:REGION:INSTANCE_NAME] failed to connect to instance: Dial error: failed to dial (connection name = "PROJECT_ID:REGION:INSTANCE_NAME"): dial tcp 10.250.250.5:3307: i/o timeout

The IP resolution seems OK, but the port used is not right.

How can I configure the proxy to connect on port 5432?

Thank you for your help, Nicolas

Example code (or command)

cloud-sql-proxy PROJECT_ID:REGION:INSTANCE_NAME --psc

Stacktrace

2024/11/25 14:15:19 The proxy has started successfully and is ready for new connections!
2024/11/25 14:15:27 [PROJECT_ID:REGION:INSTANCE_NAME] accepted connection from 127.0.0.1:59958
2024/11/25 14:15:57 [PROJECT_ID:REGION:INSTANCE_NAME] failed to connect to instance: Dial error: failed to dial (connection name = "PROJECT_ID:REGION:INSTANCE_NAME"): dial tcp 10.250.250.5:3307: i/o timeout

Steps to reproduce?

  1. Create a Cloud SQL PostgreSQL instance with PSC enabled
  2. Create a VM instance with cloud-sql-proxy
  3. Launch cloud-sql-proxy
  4. Try to connect to the database through the proxy ...

Environment

  1. OS type and version: Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux
  2. Cloud SQL Proxy version (./cloud-sql-proxy --version): 2.6.1 and 2.14.1
  3. Proxy invocation command (for example, ./cloud-sql-proxy --port 5432 INSTANCE_CONNECTION_NAME): ./cloud-sql-proxy PROJECT_ID:REGION:INSTANCE_NAME --psc

Additional Details

No response

jackwotherspoon commented 4 days ago

Hi @nicolasThal thanks for raising an issue on the Cloud SQL Proxy 😄 Let's see if we can get to the bottom of it for you.

The IP resolution seems OK, but the port used is not right.

The port you see is actually correct. If you were to connect directly to your instance you are right that you would use port 5432.

However, the Cloud SQL Proxy has a server-side component that sits beside the Cloud SQL instance and listens on port 3307. When you connect via the Cloud SQL Proxy, it connects securely to this port.

Image taken from the "How the Cloud SQL Auth Proxy works" docs: [image]

The error you are seeing, "dial tcp 10.250.250.5:3307: i/o timeout", means the Cloud SQL Proxy client is unable to reach the Proxy server. This usually points at a networking issue.

From the above docs page:

While the Cloud SQL Auth Proxy can listen on any port, it creates outgoing or egress connections to your Cloud SQL instance only on port 3307. If your client machine has an outbound firewall policy, make sure it allows outgoing connections to port 3307 on your Cloud SQL instance's IP.

You can most likely resolve the error by adjusting the firewall rules of the VPC network that the Cloud SQL instance's PSC endpoint IP address belongs to, so that TCP egress connections over port 3307 are allowed.
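To check whether port 3307 is reachable at all, independently of the Proxy, a plain TCP probe from the same VM can help. The following is a minimal sketch, not part of the Cloud SQL Proxy; the function name tcp_reachable and the 5-second default timeout are assumptions made for illustration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP handshake with host:port completes within timeout."""
    try:
        # create_connection performs the TCP handshake; the socket is closed
        # as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (typical when packets are dropped), refused
        # connections, and unreachable networks.
        return False
```

Calling tcp_reachable("10.250.250.5", 3307) from the VM would be expected to return False while the i/o timeout above persists. Comparing it with tcp_reachable("10.250.250.5", 5432) shows whether only the Proxy port is affected.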

If the above does not work then I would recommend starting up the Cloud SQL Proxy with debug logs enabled to get more info (via the --debug-logs flag):

./cloud-sql-proxy PROJECT_ID:REGION:INSTANCE_NAME --psc --debug-logs

Hope this helps, let me know how it goes.

nicolasThal commented 4 days ago

Hello @jackwotherspoon

Thank you for your quick answer.

When I look at the egress firewall rules of the VPC, I see that there are no restrictions: [image]

With debug logs enabled, I get this output:

2024/11/25 15:19:05 [PROJECT_ID:REGION:INSTANCE_NAME] Accepted connection from 127.0.0.1:56846
2024/11/25 15:19:05 [PROJECT_ID:REGION:INSTANCE_NAME] Now = 2024-11-25T15:19:05Z, Current cert expiration = 2024-11-25T16:19:00Z
2024/11/25 15:19:05 [PROJECT_ID:REGION:INSTANCE_NAME] Cert is valid = true
2024/11/25 15:19:05 [PROJECT_ID:REGION:INSTANCE_NAME] Dialing eb182387e8ba.3azh48vs5l2w4.europe-west1.sql.goog.:3307
2024/11/25 15:19:36 [PROJECT_ID:REGION:INSTANCE_NAME] Dialing eb182387e8ba.3azh48vs5l2w4.europe-west1.sql.goog.:3307 failed: dial tcp 10.250.250.5:3307: i/o timeout
2024/11/25 15:19:36 [PROJECT_ID:REGION:INSTANCE_NAME] failed to connect to instance: Dial error: failed to dial (connection name = "PROJECT_ID:REGION:INSTANCE_NAME"): dial tcp 10.250.250.5:3307: i/o timeout

In our project, 10.250.250.5 is the IP address used for the Private Service Connect endpoint.
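The debug log shows the Proxy dialing a DNS name that resolves to the PSC endpoint IP. To confirm that resolution outside the Proxy, a small lookup can be run on the same VM. This is a sketch under the assumption that only IPv4 results matter here; the helper name resolve_ipv4 is hypothetical.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

If the VM's resolver behaves as in the log above, resolve_ipv4("eb182387e8ba.3azh48vs5l2w4.europe-west1.sql.goog.") should include 10.250.250.5. A different address, or a lookup error, would point at DNS rather than the firewall.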

Do you have a suggestion for something else to look at?

Thank you for your help,

jackwotherspoon commented 3 days ago

@nicolasThal Thanks for the response! Your firewall rules look okay to me... and it seems like your DNS record is being properly resolved to the PSC endpoint IP.

Do you have a suggestion for something else to look at?

One thing you could do to help shed some further light on the root cause of your issue would be to try connecting directly to the PSC endpoint IP from the VM instance:

psql -h 10.250.250.5 -U postgres  

If the above fails, it points at a connectivity issue between your VM and the Cloud SQL instance, unrelated to the Proxy. If the above is successful then it points at something within the Proxy being the cause.
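The reasoning in this comment can be written down as a small decision helper. This is only an illustrative sketch: the function name diagnose and its two boolean inputs are assumptions, and the middle branch (direct access works, port 3307 does not) is an interpretation of the thread, not an official diagnosis.

```python
def diagnose(direct_ok: bool, proxy_port_ok: bool) -> str:
    """Map two reachability results to a suggested next step.

    direct_ok: a direct connection to the PSC endpoint on port 5432 worked.
    proxy_port_ok: a connection to the PSC endpoint on port 3307 worked.
    """
    if not direct_ok:
        # The direct test failed, so the Proxy is not the first suspect.
        return "Connectivity issue between the VM and the instance, unrelated to the Proxy."
    if not proxy_port_ok:
        # Port 5432 is reachable but 3307 is not.
        return "Direct access works but port 3307 does not: look at the Proxy path."
    return "Both ports are reachable from this host."
```

For the situation described so far (port 3307 times out, direct test not yet run), the outcome depends on the psql result: diagnose(False, False) gives the first message and diagnose(True, False) gives the second.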