Closed sl2902 closed 6 days ago
GCP credentials block using the service account credentials file
What does this mean?
You can specify token= when instantiating GCSFileSystem; it can point to any gcloud JSON credentials file. It sounds like this might be what you need.
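For example, a minimal sketch (the key-file path and bucket name are placeholders, and the helper name is made up):

```python
def open_fs(key_path="/path/to/service_account.json"):
    # gcsfs imported lazily so this sketch stays importable without credentials;
    # token= may point at a gcloud/service-account JSON key file
    import gcsfs
    return gcsfs.GCSFileSystem(token=key_path)

# Usage (needs real credentials, so not run here):
# fs = open_fs()
# fs.ls("my-bucket")
```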
This is an example of what creating a Prefect block programmatically looks like (Prefect is a workflow orchestration tool):
from prefect_gcp import GcpCredentials

# replace this PLACEHOLDER dict with your own service account info
service_account_info = {
    "type": "service_account",
    "project_id": "PROJECT_ID",
    "private_key_id": "KEY_ID",
    "private_key": "-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n",
    "client_email": "SERVICE_ACCOUNT_EMAIL",
    "client_id": "CLIENT_ID",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://accounts.google.com/o/oauth2/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/SERVICE_ACCOUNT_EMAIL"
}

GcpCredentials(
    service_account_info=service_account_info
).save("BLOCK-NAME-PLACEHOLDER")
After persisting this on Prefect Cloud, I can reference the keys inside a flow.
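Loading the saved block back and handing the credentials to gcsfs might look roughly like this (a sketch; the block name is the placeholder from above, and the helper name is made up):

```python
def gcs_filesystem_from_block(block_name="BLOCK-NAME-PLACEHOLDER"):
    # imports kept inside the helper so the sketch stays importable
    from prefect_gcp import GcpCredentials
    import gcsfs

    creds = GcpCredentials.load(block_name)
    # service_account_info is stored as a SecretDict on the block;
    # gcsfs also accepts the raw dict directly as token=
    info = creds.service_account_info.get_secret_value()
    return gcsfs.GCSFileSystem(token=info, project=creds.project)
```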
You can specify token= when instantiating GCSFileSystem, which can point to any gcloud JSON file, it sounds like this might be what you need.
I tried using project=credentials, where credentials is loaded from the Prefect GCP Credentials block. Hope that makes sense.
It is not project but token that you need to set to the saved credential file's location. Ideally you should also provide project= (this can matter for some operations).
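Put together, a minimal sketch (the key-file path and project ID are placeholders, and the helper name is made up):

```python
def open_project_fs(key_path="/path/to/service_account.json",
                    project="PROJECT_ID"):
    # gcsfs imported lazily so this sketch stays importable without credentials;
    # token= points at the saved credential file, project= is the GCP project ID
    import gcsfs
    return gcsfs.GCSFileSystem(token=key_path, project=project)

# Usage (needs real credentials, so not run here):
# fs = open_project_fs()
# fs.ls("my-bucket")
```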
I can confirm this currently works on 2024.5.0 with GOOGLE_APPLICATION_CREDENTIALS pointing to a service_account.json file.
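In that setup no explicit token= should be needed; with no token given, gcsfs should fall back to google-auth's default credential discovery, which honours the environment variable. A sketch (the path is a placeholder):

```python
import os

# point google-auth's default discovery at the key file
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service_account.json"

def open_default_fs():
    import gcsfs  # imported after the env var is set
    # with no explicit token, gcsfs tries the google_default method,
    # which reads GOOGLE_APPLICATION_CREDENTIALS via google.auth.default()
    return gcsfs.GCSFileSystem()

# fs = open_default_fs()  # needs the key file to actually exist
```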
@danielgafni Thanks for the update! I will check it out
I am trying to run a Prefect deployment using Docker containers. I have created a Docker container Prefect block and a GCP credentials block using the service account credentials file, which I load inside the Prefect flow. However, when I read a parquet file (I tried with both pandas and PyArrow), I get the following error.
I don't see this issue when the script runs locally.
gcsfs version = 2023.1.0
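For reference, one way to pass the credentials through pandas is storage_options, which fsspec forwards to GCSFileSystem (a sketch; the path, key file, and project ID are placeholders, and the helper name is made up):

```python
import pandas as pd

def read_parquet_from_gcs(path, key_path, project):
    # storage_options is forwarded to gcsfs.GCSFileSystem(...)
    return pd.read_parquet(
        path,
        storage_options={"token": key_path, "project": project},
    )

# df = read_parquet_from_gcs("gs://my-bucket/data.parquet",
#                            "/path/to/service_account.json", "PROJECT_ID")
```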