e-belfer opened 1 month ago
Now that Cloud Run lets you mount buckets as volumes, it's tempting to migrate back to Cloud Run. That would look like:
- use the Docker image built by `devtools/datasette/publish.py` - it includes the nginx stuff too, so that's good
- mount `/data/nightly` from `pudl.catalyst.coop` as a read-only volume at `/data`
- update the `publish.py` line in `gcp_pudl_etl.sh` to `gcloud run deploy SERVICE --image IMAGE_URL` on `main`
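The mount-and-deploy steps above could be collapsed into a single deploy command, roughly like this sketch (the service name `pudl-datasette` is a placeholder, and the `--add-volume`/`--add-volume-mount` flags are from the Cloud Run docs as I remember them, so worth double-checking):

```shell
# Deploy with the GCS bucket mounted read-only at /data.
# "pudl-datasette" is a hypothetical service name; IMAGE_URL as above.
gcloud run deploy pudl-datasette \
  --image IMAGE_URL \
  --add-volume name=data,type=cloud-storage,bucket=pudl.catalyst.coop,readonly=true \
  --add-volume-mount volume=data,mount-path=/data
```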
And to get logs:
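Cloud Run writes request and container logs to Cloud Logging, so pulling them back out might look something like this (again assuming a hypothetical `pudl-datasette` service name):

```shell
# Read recent log entries for the Cloud Run service.
gcloud logging read \
  'resource.type="cloud_run_revision" AND resource.labels.service_name="pudl-datasette"' \
  --limit 50 --format json
```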
And if we were to stick with fly.io, the plan looks like:
That looks faster - but I'm also suspicious that fly-log-shipper doesn't actually do what it purports to, or isn't as easy as it claims to be, since it doesn't look super actively maintained.
Also, the final outcome of moving back into GCP is simpler and easier to take down if we decide to move to Superset.
So, I'm going to spend 30 minutes tomorrow trying to get fly-log-shipper working. If I run into problems I'll move back into GCP.
Overview
fly.io currently doesn't retain logs for long, so we need to use fly-log-shipper to send logs to S3.
We should spend at most 10 hours on this.
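For reference, fly-log-shipper runs as its own Fly app subscribed to the org's log stream. A minimal setup sketch, assuming the app and bucket names are placeholders and the secret names are from my memory of the superfly/fly-log-shipper README (verify against the current docs before relying on them):

```shell
# Deploy the log shipper as a separate Fly app.
fly launch --image ghcr.io/superfly/fly-log-shipper:latest --name pudl-log-shipper

# Configure it to ship to an S3 bucket ("pudl-datasette-logs" is hypothetical).
fly secrets set --app pudl-log-shipper \
  ORG=personal \
  ACCESS_TOKEN="$(fly auth token)" \
  AWS_ACCESS_KEY_ID=... \
  AWS_SECRET_ACCESS_KEY=... \
  AWS_BUCKET=pudl-datasette-logs
```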
Success Criteria
How will we know that we're done?
We don't need to mirror the logs into GCS or ETL them into some structured format, because we are likely to deprecate the Datasette shortly anyway. This lets us do some baseline analysis without too much investment.