jackwotherspoon opened this issue 2 years ago
Hey @jackwotherspoon, I've hit this exact problem this week. Do you have an estimate? Any rough ideas would be greatly appreciated. Thanks in advance.
@gunar This has been prioritized and will be worked on in early 2022.
In the meantime, you can connect to Cloud SQL using Python and psycopg2 with the Cloud SQL Auth Proxy through Unix sockets or TCP. For an explanation and example see here: https://cloud.google.com/sql/docs/postgres/connect-admin-proxy#expandable-2
For the Unix sockets sample, just swap out part of the pg8000 code for this code snippet:
```python
pool = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL.create(
        drivername="postgresql+psycopg2",
        username=db_user,  # e.g. "my-database-user"
        password=db_pass,  # e.g. "my-database-password"
        database=db_name,  # e.g. "my-database-name"
        query={
            "host": "{}/{}".format(
                db_socket_dir,  # e.g. "/cloudsql"
                instance_connection_name,  # i.e. "<PROJECT-NAME>:<INSTANCE-REGION>:<INSTANCE-NAME>"
            )
        },
    ),
    **db_config
)
```
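And if you run the proxy in TCP mode instead, a minimal sketch (this assumes the proxy is listening locally on 127.0.0.1:5432, as in the linked docs):

```python
# TCP variant: connect to the Cloud SQL Auth Proxy's local listener
# (assumes the proxy was started listening on 127.0.0.1:5432)
pool = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL.create(
        drivername="postgresql+psycopg2",
        username=db_user,  # e.g. "my-database-user"
        password=db_pass,  # e.g. "my-database-password"
        host="127.0.0.1",  # the proxy's local address
        port=5432,         # the port the proxy listens on
        database=db_name,  # e.g. "my-database-name"
    ),
    **db_config
)
```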
Hope this helps for the time being! Thanks for the feedback!
@jackwotherspoon Thanks man. I'm not inclined to go with the Cloud SQL Auth Proxy as that's one more thing for the user to remember to execute.
I guess I could make the Python script itself spawn the proxy process; has anyone attempted that?

Also, do I need `sqlalchemy` in order to connect to the proxy, or can I connect directly with `psycopg2`?
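For illustration, a minimal sketch of that idea using Python's `subprocess`; the `cloud-sql-proxy` v2 binary name, its flags, the placeholder credentials, and the fixed startup wait are all assumptions, not an endorsed recipe. It also shows psycopg2 talking to the proxy directly, without SQLAlchemy:

```python
import subprocess
import time

import psycopg2

# Spawn the Cloud SQL Auth Proxy as a child process. Assumes the
# v2 `cloud-sql-proxy` binary is on PATH; adjust the name and flags
# for v1 (`cloud_sql_proxy -instances=...=tcp:5432`).
proxy = subprocess.Popen(
    ["cloud-sql-proxy", "--port", "5432",
     "my-project:my-region:my-instance"]  # placeholder instance name
)
time.sleep(3)  # crude startup wait; polling the socket is more robust

try:
    # connect directly with psycopg2 (no SQLAlchemy required)
    conn = psycopg2.connect(
        host="127.0.0.1",
        port=5432,
        user="my-database-user",          # placeholder
        password="my-database-password",  # placeholder
        dbname="my-database-name",        # placeholder
    )
    with conn.cursor() as cur:
        cur.execute("SELECT NOW()")
        print(cur.fetchone()[0])
    conn.close()
finally:
    proxy.terminate()  # shut the proxy down with the script
    proxy.wait()
```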
Hi @jackwotherspoon, thank you for all your work on the Cloud SQL connectors.

Do you have any estimates for psycopg2 support? It would be really nice to have.

By the way, the issue has P2 priority but it has the biggest number of thumbs-ups 👍
Hi @kemics, thanks for the kind words and showing your enthusiasm towards the Cloud SQL Python connector.
I currently can't provide a specific timeline for psycopg2 support because there are a lot of factors at play here, but I can provide some details for you.

To support psycopg2 connections with the connector, we need the ability to pass in a pre-existing SSL connection and to skip a couple of Postgres database-level protocols (this is the hard part, and it is required because Cloud SQL connections go to a proxy server and not directly to the database itself).

I have been in talks with the psycopg2 team about making some changes in psycopg2 to support these types of connections; the work also requires additional upstream changes in libpq.

Overall, there are a lot of moving parts, but I will continue to keep the ball rolling on this and make sure that I update progress here on the issue for more transparency.
Thanks again for the gentle nudge here and for showing interest in the Python connector, much appreciated.
Have a great day :)
Thanks @jackwotherspoon, great explanation 👍

I'm a fan of what you are doing here and have successfully tried this repo with sqlalchemy and also https://github.com/GoogleCloudPlatform/cloud-sql-go-connector. Works like a charm, and it's a huge improvement in terms of security.
@jackwotherspoon
With regards to:

> For the Unix sockets sample, just swap out part of the pg8000 code for this code snippet:

Is there any way to use my already logged-in `gcloud` IAM authentication and not have to pass in username and password as part of the code? (I'm trying to get away from `.env` files.)
@gunar
> Is there any way to use my already logged-in `gcloud` IAM authentication and not have to pass in username and password as part of the code? (I'm trying to get away from `.env` files.)
This can be done in a fairly straightforward way for service account authentication using Manual IAM Database Authentication. You just need to add your service account as an IAM database user on your Cloud SQL instance and grant it the Cloud SQL Instance User IAM role alongside the usual Cloud SQL Client role.

Once this is done, the below snippet should work in any environment that uses the service account (i.e. Cloud Run, Cloud Functions, App Engine, etc.) to connect to Cloud SQL via Public IP (Unix sockets are recommended for Public IP; see the comment below for a Private IP TCP connection snippet). See the comments in the code for running locally.
```python
import google.auth
from google.auth.transport.requests import Request
import sqlalchemy
import psycopg2

# uncomment below two lines to set application credentials to service account locally
# import os
# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/key.json"

# get application default credentials
creds, _ = google.auth.default()

# refresh credentials if not valid
if not creds.valid:
    request = Request()
    creds.refresh(request)

# IAM service account database user, postgres removes suffix
sa_iam_user = (creds.service_account_email).removesuffix(".gserviceaccount.com")

# create SQLAlchemy connection pool
pool = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL.create(
        drivername="postgresql+psycopg2",
        username=sa_iam_user,  # login to database as service account user
        password=str(creds.token),  # manual IAM database authn
        database=db_name,  # e.g. "my-database-name"
        query={
            "host": "{}/{}".format(
                "/cloudsql",  # Unix socket directory
                instance_connection_name,  # i.e. "<PROJECT-NAME>:<INSTANCE-REGION>:<INSTANCE-NAME>"
            )
        },
    ),
)

# ... use connection pool
```
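And a quick usage sketch for the pool (standard SQLAlchemy; the query is just illustrative):

```python
# verify the pool works by running a test query
with pool.connect() as conn:
    result = conn.execute(sqlalchemy.text("SELECT NOW()")).fetchone()
    print(result[0])
```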
I've been struggling to get psycopg2 working with either of the solutions posted here; I really appreciate the examples @jackwotherspoon. Wondering if you or someone else could point me in the right direction... I need to connect to a Cloud SQL instance running Postgres from a gen2 Cloud Function using an IAM service account. I tried the solution posted on 16/12/21 as well as the solution posted on 3/8/22, but I'm still not getting there.

The Cloud Function is connected to the VPC with a Serverless VPC Access connector, and the Cloud SQL instance has both a private and a public IP. I've also tried the below with the private IP instead of the 'project-name:instance-region:instance-name' for the Unix socket.

I'm getting the error:

```
psycopg2.OperationalError: connection to server on socket "/cloudsql/project-name:instance-region:instance-name/.s.PGSQL.5432" failed: No such file or directory
```

This is my code (obfuscated of course):
```python
import google.auth
from google.auth.transport.requests import Request
import sqlalchemy

# get application default credentials
creds, _ = google.auth.default()

# refresh credentials if not valid
if not creds.valid:
    request = Request()
    creds.refresh(request)

# IAM service account database user, postgres removes suffix
sa_iam_user = 'service-account@project-name.iam'

# create SQLAlchemy connection pool
engine = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL.create(
        drivername="postgresql+psycopg2",
        username=sa_iam_user,  # login to database as service account user
        password=str(creds.token),  # manual IAM database authn
        database='database-name',  # e.g. "my-database-name"
        query={
            "host": "{}/{}".format(
                "/cloudsql",
                'project-name:instance-region:instance-name',  # i.e. "<PROJECT-NAME>:<INSTANCE-REGION>:<INSTANCE-NAME>"
            )
        },
    ),
)
```
What am I doing wrong? I feel like I'm very close but obviously missing something.
@cw-optima-ovo Let me see if I can help you out! Thanks for raising your issue and sorry for the slow reply.
I should have made one thing clear in my previous comment with the code sample: the Unix sockets approach is for Public IP connections.

The recommended approach for Private IP connections such as yours is through a TCP socket, which I will showcase here (the code sample was tested on gen2 Cloud Functions with a VPC connector attached):
```python
import google.auth
from google.auth.transport.requests import Request
import sqlalchemy
import psycopg2

# get application default credentials
creds, _ = google.auth.default()

# refresh credentials if not valid
if not creds.valid:
    request = Request()
    creds.refresh(request)

# IAM service account database user, postgres removes suffix
sa_iam_user = (creds.service_account_email).removesuffix(".gserviceaccount.com")

# create SQLAlchemy connection pool
pool = sqlalchemy.create_engine(
    # Equivalent URL:
    # postgresql+psycopg2://<db_user>:<db_pass>@<db_host>:<db_port>/<db_name>
    sqlalchemy.engine.url.URL.create(
        drivername="postgresql+psycopg2",
        username=sa_iam_user,
        password=str(creds.token),
        host="YOUR_PRIVATE_IP_ADDRESS",
        port=5432,
        database="YOUR_DATABASE",
    ),
)

# ... use connection pool
```
Let me know if you have luck with this sample over the Unix sockets version 🙂 Make sure the service account that the Cloud Function uses is added as a proper Cloud SQL IAM user on the instance and has the proper roles.
Thanks @jackwotherspoon, that does indeed work! However, I ended up going down the asyncpg route with the Cloud SQL Python Connector in the end. The reason was partly that I didn't want a fixed IP in my variables, since we're deploying the project through CI/CD. While it's solvable to fetch the IP in CI/CD, it's a headache we didn't want to introduce given we already had the project:region:instance available. The performance of asyncpg and psycopg2 is about the same in our context, so no harm done.

Hoping to see psycopg2 supported with the Cloud SQL Connector in the future though; that would have been my original preference!
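For anyone taking the same route, a minimal sketch of the asyncpg path with the Python Connector; the instance connection name, IAM user, and database below are placeholders, and it assumes the runtime service account is set up as an IAM database user:

```python
import asyncio

from google.cloud.sql.connector import Connector


async def main():
    loop = asyncio.get_running_loop()
    # the Connector resolves the instance by its connection name, so no
    # fixed IP address has to live in CI/CD variables
    async with Connector(loop=loop) as connector:
        conn = await connector.connect_async(
            "project-name:instance-region:instance-name",  # placeholder
            "asyncpg",
            user="service-account@project-name.iam",  # placeholder IAM user
            db="database-name",                       # placeholder
            enable_iam_auth=True,  # automatic IAM database authn
        )
        print(await conn.fetchval("SELECT NOW()"))
        await conn.close()


asyncio.run(main())
```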
Can I add a request for psycopg3 here? Or should that go in a separate issue?
> Can I add a request for psycopg3 here? Or should that go in a separate issue?
@wwuck psycopg3 is already being tracked in a separate issue: https://github.com/GoogleCloudPlatform/cloud-sql-python-connector/issues/219. Both will most likely be supported at the same time once we can get upstream changes to libpq made.
FYI to everyone: it is possible to connect with automatic IAM authentication without the need for the Python Connector:
You'll need to ensure a few things:

* The token has only the `sql.login` scope (i.e. `https://www.googleapis.com/auth/sqlservice.login`)
* The token isn't transmitted over an unencrypted channel (Private IP recommended)

We're working on making this path easier for folks, but for now I'll share the mechanics for visibility.
Assuming you're using SQLAlchemy, you can do this:
```python
import sqlalchemy
from sqlalchemy import event
import google.auth
from google.auth.transport.requests import Request

# initialize ADC creds
creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/sqlservice.login"]
)

# Cloud SQL Instance IP address
instance_ip = "<INSTANCE_IP>"

# create SQLAlchemy connection pool
# use Cloud SQL Instance IP + native port (5432)
# for best security use client certificates + server cert for SSL
engine = sqlalchemy.create_engine(
    f"postgresql+psycopg2://service-account@project-id.iam:empty-pass@{instance_ip}:5432/dbname",
    connect_args={'sslmode': 'require'},
)

# set do_connect event listener
# This gets called before a connection is created and allows you to
# refresh the OAuth2 token here as needed
@event.listens_for(engine, 'do_connect')
def auto_iam_authn(dialect, conn_rec, cargs, cparams):
    # refresh credentials if expired
    if not creds.valid:
        request = Request()
        creds.refresh(request)
    # use OAuth2 token as password
    cparams["password"] = str(creds.token)

# interact with Cloud SQL database using connection pool
with engine.connect() as conn:
    # query database
    time = conn.execute(sqlalchemy.text("SELECT NOW()")).fetchone()
    print(time[0])
```
@jackwotherspoon is there a way to connect to the instance with IAM without using `sqlalchemy`, just with `psycopg2`?

```python
# example
with psycopg2.connect(url) as con:
    with con.cursor() as cur:
        cur.execute("SELECT * FROM test.table")
        rows = cur.fetchall()
        print(rows)
```
@bapi24
> @jackwotherspoon is there a way to connect to the instance with IAM without using `sqlalchemy`, just with `psycopg2`?
Yes, you can still use the same concept as above: you can set the password to the IAM principal's OAuth2 token. You will just have to manage refreshing the credentials yourself within your application to make sure the credentials object has not expired.

However, we normally recommend taking advantage of connection pooling for production environments, so we strongly recommend the above SQLAlchemy example for most cases.
```python
import psycopg2
import google.auth
from google.auth.transport.requests import Request

# initialize ADC creds
creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/sqlservice.login"]
)

# refresh credentials if expired (manage this code in your application)
if not creds.valid:
    request = Request()
    creds.refresh(request)

# Cloud SQL Instance IP address
instance_ip = '<INSTANCE_IP>'

# interact with Cloud SQL database using psycopg2 connection
with psycopg2.connect(f"dbname=mydb user=sa-name@project-id.iam password={str(creds.token)} host={instance_ip} sslmode=require") as con:
    with con.cursor() as cur:
        cur.execute("SELECT * FROM test.table")
        rows = cur.fetchall()
        print(rows)
```
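If you do want pooling without SQLAlchemy, psycopg2 ships a basic pool in `psycopg2.pool`. A minimal sketch, reusing `creds` and `instance_ip` from the snippet above; note that the token baked into the connection arguments expires after roughly an hour, which is exactly why the SQLAlchemy `do_connect` approach is recommended for long-running services:

```python
from psycopg2 import pool

# simple thread-safe pool; connection kwargs are fixed at creation time,
# so the OAuth2 token used as the password will eventually expire
# (fine for short-lived jobs, not for long-running services)
db_pool = pool.ThreadedConnectionPool(
    1,  # minconn
    5,  # maxconn
    dbname="mydb",
    user="sa-name@project-id.iam",
    password=str(creds.token),
    host=instance_ip,
    sslmode="require",
)

con = db_pool.getconn()
try:
    with con.cursor() as cur:
        cur.execute("SELECT NOW()")
        print(cur.fetchone()[0])
finally:
    db_pool.putconn(con)
    db_pool.closeall()
```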
@jackwotherspoon that worked, thank you! A few questions:

1. The documentation (https://cloud.google.com/sql/docs/postgres/authentication#:~:text=authentication.%20For%20more-,information,-%2C%20see%20Concepts%20related) says to add the `cloudsql.instances.login` IAM permission to a custom role. If we attach that to a service account, would the service account be able to talk to the DB the same way?
2. Regarding the way we are getting the token here, is this a workaround for doing `gcloud auth application-default login`?
3. Would `psycopg` be supported in `cloudsql-python-connector` anytime soon?

@bapi24 Glad it worked! Let me answer your questions 🙂
> The documentation says to add the `cloudsql.instances.login` IAM permission to a custom role. If we attach that to a service account, would the service account be able to talk to the DB the same way?
The above code can be used with both an IAM user and an IAM service account! So yes, you can add your service account as an IAM database user and have it log in and access your DB 🙂
> Regarding the way we are getting the token here, is this a workaround for doing `gcloud auth application-default login`?
No, it is not a workaround; the two perform slightly different functions (that work together). The way we are getting the token in code fetches the Application Default Credentials from the environment and adds the required scopes for the OAuth2 token to be able to log in to the DB. The `gcloud auth application-default login` command sets the Application Default Credentials for your local environment. This is the key distinction: one gets the ADC credentials and one sets them.

The above code will work locally and in an application hosted in the Cloud (Cloud Run, App Engine, etc.), since the ADC credentials in the Cloud are automatically set to the service account used by the application (no need to run a gcloud auth command).
> Would `psycopg` be supported in `cloudsql-python-connector` anytime soon?
It is most likely that psycopg will not be supported anytime soon in the Cloud SQL Python Connector (if ever), due to the vast overhead required to patch the psycopg library as well as the libpq library. This work, coupled with the fact that the above samples (SQLAlchemy and direct psycopg2) already work very nicely without the need for the Python Connector, makes this a low-priority effort.
Got it, thank you very much for explaining it in detail 🙂
> FYI to everyone: it is possible to connect with automatic IAM authentication without the need for the Python Connector: [...]
Thanks for providing these examples, although it would be really good if they were in the official documentation.

One thing that we still cannot make work is the psycopg2 -> Cloud SQL Auth Proxy -> Postgres connection in GKE using automatic IAM. I was under the impression that you would not need the login token there; am I wrong?
Correct -- the Proxy will handle that for you. If you'd like to open an issue in the Proxy repo, I'd be happy to help you out.
The approach described above assumes you bypass the Connector and Proxy entirely.
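For completeness, a sketch of what the psycopg2 side can look like when the Proxy sidecar handles IAM authentication; this assumes Cloud SQL Auth Proxy v2 running with its automatic IAM authentication flag enabled and listening on localhost, so the application passes no token or password:

```python
import psycopg2

# The sidecar proxy (run with automatic IAM authn enabled, e.g.
# `cloud-sql-proxy --auto-iam-authn <INSTANCE_CONNECTION_NAME>`;
# verify the flag against your proxy version) terminates TLS and
# injects the OAuth2 token itself, so the app just connects to
# localhost as the IAM database user with no password.
with psycopg2.connect(
    host="127.0.0.1",
    port=5432,
    dbname="mydb",                    # placeholder
    user="sa-name@project-id.iam",    # placeholder IAM user
) as con:
    with con.cursor() as cur:
        cur.execute("SELECT NOW()")
        print(cur.fetchone()[0])
```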
Support PostgreSQL psycopg2 driver with connector.