cisaacstern closed 2 years ago
😢
I was so excited about this.
FWIW, these are tiny files.
Here's an interesting idea: I just merged https://github.com/pangeo-forge/registrar/pull/48, so we could try running this on Dataflow. 🚀
Here's how Dataflow routing works right now:
- Tags containing `dev` are routed to Dataflow
- Tags containing `beta` are routed to Dataflow

So if I make a tag with the substring `beta` in it, we should get a deployment to Dataflow 🤔
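For illustration only, here is a minimal sketch of the substring rule described above. The function and constant names are hypothetical, and this is not the registrar's actual code; it only shows the idea that a tag is routed to Dataflow when it contains `dev` or `beta`.

```python
# Hypothetical sketch of the tag-routing rule described in this thread.
# The names below are illustrative and do not come from the registrar repo.
DATAFLOW_SUBSTRINGS = ("dev", "beta")


def routes_to_dataflow(tag: str) -> bool:
    """Return True if the tag contains any substring that selects Dataflow."""
    return any(substring in tag for substring in DATAFLOW_SUBSTRINGS)


if __name__ == "__main__":
    # Example tag names are made up for the demonstration.
    for tag in ("v1.0-beta", "dev-test", "v1.0"):
        print(tag, routes_to_dataflow(tag))
```

Note that this is a plain substring match, so any tag that happens to contain `dev` or `beta` anywhere in its name would match.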
Hmm, I just made a release, but it looks like that pathway doesn't work anymore following the migration from tag events to push events as the trigger for production runs. I'm going to think for a moment about what a lightweight way of testing this on Dataflow might look like.
That sounds promising. Let me know how I can help.
Superseded by #4.
@rabernat, our first production run for this feedstock failed 😞.
A closer look at the logs on the Prefect backend, as well as on our Loki service, reveals tracebacks like this
These connection timeout errors smell a lot like a Prefect scheduler killed by too many `store_chunk` tasks. This hypothesis is supported by the fact that:
- This recipe defines 9226 `store_chunk` tasks
- In the case of https://github.com/pangeo-forge/noaa-coastwatch-geopolar-sst-feedstock/issues/2#issuecomment-1108812665, `store_chunk` tasks needed to be reduced into the neighborhood of ~1500 tasks to get the production deployment to succeed.
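As a rough back-of-the-envelope check, the sketch below estimates how aggressively tasks would need to be grouped to get from 9226 down to about 1500. It assumes, purely for illustration, that combining `k` units of work into one `store_chunk` task divides the task count by `k`; the helper names are hypothetical and this is not how the recipe itself is configured.

```python
import math

# Figures taken from this thread.
CURRENT_TASKS = 9226  # store_chunk tasks defined by this recipe
TARGET_TASKS = 1500   # approximate count that worked for the other feedstock


def min_group_size(n_tasks: int, target: int) -> int:
    """Smallest integer grouping factor that brings n_tasks to at most target."""
    return math.ceil(n_tasks / target)


def tasks_after_grouping(n_tasks: int, group_size: int) -> int:
    """Number of tasks left if every group_size tasks are merged into one."""
    return math.ceil(n_tasks / group_size)


if __name__ == "__main__":
    k = min_group_size(CURRENT_TASKS, TARGET_TASKS)
    print(k, tasks_after_grouping(CURRENT_TASKS, k))  # prints: 7 1318
```

Under that assumption, a grouping factor of 7 would leave 1318 tasks, which is inside the range that succeeded for the other feedstock.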