Closed cisaacstern closed 2 years ago
I think this is fairly uncontroversial, and it falls more within the Pangeo Forge technical domain (as opposed to the cloud deployment side), so in the interest of moving forward, I'll merge now. We can always adjust later as needed.
This PR does two things. The first is a fix, which is to get the logger named "pangeo_forge_recipes", because: (a) there actually is no module and/or logger named "pangeo_forge" in the bakery image we are using; and (b) the previous approach which, IIUC, aims to log only for pangeo_forge_recipes.recipes.xarray_zarr, doesn't capture logging from other modules (e.g. storage) which are important for debugging.

The second thing this PR does is register a copy_pruned subset of the recipe object, rather than the full timeseries. This is perhaps a bit more opinionated of a choice. I'm making it because at this stage of testing, IMHO, we don't need to be moving a lot of data around, but rather just ensuring that we can get end-to-end on a single recipe execution cycle. (And this is what the copy_pruned method was designed for: to provide a smaller subset of the recipe for workflow debugging.)

xref https://github.com/pangeo-forge/pangeo-forge-gcs-bakery/issues/19#issuecomment-1028977632
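
For anyone following along, here is a minimal sketch of why the logger name matters. It uses only the standard library logging module. The ListHandler class, the capture_logger_names function, and the log messages are hypothetical and exist only for illustration; this is not the bakery's actual handler setup.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that records the name of each logger it hears from."""

    def __init__(self):
        super().__init__()
        self.names = []

    def emit(self, record):
        self.names.append(record.name)


def capture_logger_names():
    # Configure the top-level package logger. Logger names form a
    # dot-separated hierarchy, so records from child loggers such as
    # "pangeo_forge_recipes.storage" propagate up to this one.
    logger = logging.getLogger("pangeo_forge_recipes")
    logger.setLevel(logging.DEBUG)

    collector = ListHandler()
    logger.addHandler(collector)
    try:
        # Both submodule loggers are children of "pangeo_forge_recipes".
        logging.getLogger("pangeo_forge_recipes.recipes.xarray_zarr").info("from xarray_zarr")
        logging.getLogger("pangeo_forge_recipes.storage").debug("from storage")
        # "pangeo_forge" is a separate name, not a parent of
        # "pangeo_forge_recipes", so nothing logged here reaches the collector.
        logging.getLogger("pangeo_forge").info("not captured")
    finally:
        logger.removeHandler(collector)
    return collector.names


if __name__ == "__main__":
    print(capture_logger_names())
```

Assuming nothing else has changed the propagate flag on these loggers, the handler on "pangeo_forge_recipes" sees records from both xarray_zarr and storage, while a handler attached only to the xarray_zarr logger would miss the storage messages.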