coiled / feedback

A place to provide Coiled feedback

Coiled not logging on GCP #292

Closed adkinsjd closed 1 month ago

adkinsjd commented 1 month ago

Hi, I am not seeing logs from my Coiled clusters.

Instead, I see this error:

Unable to fetch logs from BigQuery. Google returned this error: 400 sheerwater:coiled_logs.syslog_* does not match any table. Location: US Job ID: 22b7ce4c-2124-4fa9-aca3-b105c2173e0f

I am running on Google Cloud, and I have confirmed that BigQuery is enabled and that the Coiled service account has access to it.

Any idea how I can get logging working?

Thank you!

dchudz commented 1 month ago

Hi!

Short version: I think you have an organization policy in place restricting which domains can be given access to the BigQuery dataset. If you'd like to chat about this and discuss options, email support@coiled.io.

Longer:

We get logs into BigQuery via an export from GCP Cloud Logging, which involves setting permissions on the BigQuery dataset similar to this. (The Cloud Logging service needs to be able to write to the dataset.)
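
To make that permission step a bit more concrete, here is a rough sketch of the kind of access-list change involved. It is not Coiled's actual code. It assumes the dataset's access list has the shape used by the `access` field of the BigQuery REST API dataset resource, and that the log sink's writer identity arrives as `serviceAccount:<email>`. The function name `add_sink_writer` and the example addresses are hypothetical.

```python
from typing import Dict, List


def add_sink_writer(
    access: List[Dict[str, str]], writer_identity: str
) -> List[Dict[str, str]]:
    """Return a copy of a dataset access list that grants WRITER to a log sink.

    `access` is assumed to follow the BigQuery REST dataset `access` shape,
    e.g. {"role": "OWNER", "userByEmail": "someone@example.com"}.
    """
    # Sink writer identities are assumed to look like "serviceAccount:<email>";
    # the access entry itself only carries the email part.
    prefix = "serviceAccount:"
    if writer_identity.startswith(prefix):
        email = writer_identity[len(prefix):]
    else:
        email = writer_identity

    entry = {"role": "WRITER", "userByEmail": email}

    # Keep the change idempotent: do not add the same entry twice.
    if entry in access:
        return list(access)
    return [*access, entry]


# Hypothetical values, for illustration only.
current = [{"role": "OWNER", "userByEmail": "admin@example.com"}]
updated = add_sink_writer(
    current, "serviceAccount:sink-writer@example.iam.gserviceaccount.com"
)
```

The updated list is what would then be sent back when patching the dataset. If an organization policy restricts which identities may appear in the dataset's policy, that update is the request that gets rejected.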

I'm seeing this in our internal logs:

google.api_core.exceptions.Forbidden: 403 PATCH https://bigquery.googleapis.com/bigquery/v2/projects/[REDACTED]prettyPrint=false: IAM setPolicy failed for Dataset [REDACTED]: One or more users named in the policy do not belong to a permitted customer.

(I redacted a couple details there since this is a public repo.)

Looking at these docs, my interpretation is that your project has an organization policy in place that prevents you or us from setting the permissions needed for your Dask clusters' logs to be written to the dataset.

I'm curious if you're seeing logs in GCP Logging? Hopefully they're going there so you can at least see them via GCP (and if they're not, that will also need to be solved), but solving the BigQuery issue is necessary to get them in the Coiled UI.

adkinsjd commented 1 month ago

Changing the organization resource sharing policy resolved the issue. Thank you!