pangeo-forge / pangeo-forge-orchestrator

Database API and GitHub App backend for Pangeo Forge Cloud.
https://api.pangeo-forge.org/docs
Apache License 2.0

Document how any admin can manually browse dataflow logs #146

Closed by cisaacstern 10 months ago

cisaacstern commented 2 years ago

Lifting this into its own issue, because it's really important.

Any admin (regardless of whether they are a member of the pangeo-forge-4967 GCP project) can query logs for Dataflow jobs using the secrets/dataflow-job-submission.json service account key contained in this repo.

This is an essential tool for supporting recipe contributors, e.g. on staged-recipes, when jobs fail.

https://github.com/pangeo-forge/pangeo-forge-orchestrator/issues/145#issuecomment-1261544227 describes how to do this.

Knowing the job_id is key here, so fixing #145 will make this easier. Once #145 is fixed, the process described in https://github.com/pangeo-forge/pangeo-forge-orchestrator/issues/145#issuecomment-1261544227 should be added to docs/README.md.
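For illustration only, here is a rough sketch of what querying by job_id might look like with the gcloud CLI. It is not a copy of the process in the linked comment. It assumes the key file is available locally as plain JSON, that Dataflow entries are selected with the `dataflow_step` resource type and its `job_id` label, and that the service account is allowed to read logs. The helper names, the placeholder job ID, and the default values are hypothetical.

```python
import shlex

PROJECT = "pangeo-forge-4967"  # GCP project named in this issue
KEY_FILE = "secrets/dataflow-job-submission.json"  # key path named in this issue


def dataflow_log_filter(job_id, min_severity=None):
    """Build a Cloud Logging filter for one Dataflow job (assumed filter shape)."""
    clauses = [
        'resource.type="dataflow_step"',
        f'resource.labels.job_id="{job_id}"',
    ]
    if min_severity:
        # Optional: keep only entries at or above this severity, e.g. "ERROR".
        clauses.append(f"severity>={min_severity}")
    return " AND ".join(clauses)


def gcloud_commands(job_id, min_severity=None, limit=50, freshness="7d"):
    """Return the two gcloud invocations as argument lists (nothing is executed)."""
    auth = ["gcloud", "auth", "activate-service-account", f"--key-file={KEY_FILE}"]
    read = [
        "gcloud", "logging", "read", dataflow_log_filter(job_id, min_severity),
        f"--project={PROJECT}",
        f"--limit={limit}",
        f"--freshness={freshness}",
        "--format=json",
    ]
    return [auth, read]


if __name__ == "__main__":
    # "HYPOTHETICAL_JOB_ID" is a placeholder; substitute the real Dataflow job_id.
    for cmd in gcloud_commands("HYPOTHETICAL_JOB_ID", min_severity="ERROR"):
        print(shlex.join(cmd))
```

The script only prints the commands so they can be reviewed before running them. `gcloud logging read` looks back one day by default, so `--freshness` likely needs to be widened for older jobs.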

cc @andersy005

cisaacstern commented 2 years ago

In addition to the gcloud logging CLI, this doc should also describe @yuvipanda's logs.py, which is linked in https://github.com/pangeo-forge/pangeo-forge-orchestrator/issues/145#issuecomment-1261535709 and explained further in https://github.com/pangeo-forge/pangeo-forge-orchestrator/issues/145#issuecomment-1261606805.

Note that I'm not certain the secrets/dataflow-job-submission.json service account creds will work for all operations supported in this logs.py. If they don't, this service account's roles could be adjusted so that Pangeo Forge Orchestrator admins can use these creds to call Yuvi's logs.py, even if they are not members of the pangeo-forge-4967 GCP project.