pangeo-forge / staged-recipes

A place to submit pangeo-forge recipes before they become fully fledged pangeo-forge feedstocks
https://pangeo-forge.readthedocs.io/en/latest/
Apache License 2.0

AWS NOAA Optimum Interpolation SST #232

rbavery closed 1 year ago

rbavery commented 1 year ago

A recipe for AWS NOAA Optimum Interpolation Sea Surface Temperature, one of the three resources made available as part of NOAA's Oceanic Climate Data Records (see: https://registry.opendata.aws/noaa-cdr-oceanic/).

cloud-out issue: https://github.com/developmentseed/aws-asdi/issues/23

sharkinsspatial commented 1 year ago

/run aws-noaa-oisst-avhrr-only

pangeo-forge[bot] commented 1 year ago

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors. That feature is not quite ready yet, however, so please reach out on this thread to a maintainer, and they'll help you diagnose the problem.

cisaacstern commented 1 year ago

  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/patterns.py", line 219, in __getitem__
    fname = self.format_function(**format_function_kwargs)
  File "/tmp/tmphqmu8l42/recipes/aws-noaa-oisst/recipe.py", line 22, in format_function
NameError: name 'pd' is not defined [while running 'Start|scan_file|Reshuffle_000|finalize|Reshuffle_001/scan_file/Execute-ptransform-56']

This is the Dataflow serialization/scoping thing again... every name used inside the function has to be defined within the function body, because module-level names are not available when the function runs on the workers. So we'll need import pandas as pd inside format_function.
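
For anyone hitting the same NameError, here is a rough sketch of the failure mode and the fix. It is not the recipe from this PR and it does not reproduce what Dataflow does internally: it only imitates a function arriving on a worker without its module globals, by rebuilding the function around an empty namespace. The standard-library datetime module stands in for pandas so the sketch runs without extra dependencies, and the names broken_format, fixed_format, strip_globals and raises_name_error are made up for illustration.

```python
import builtins
import types
import datetime as dt  # module-level import, stand-in for `import pandas as pd`


def broken_format(day_offset):
    # Relies on the module-level name `dt`, like the failing format_function.
    day = dt.date(2020, 1, 1) + dt.timedelta(days=day_offset)
    return day.strftime("%Y%m%d")


def fixed_format(day_offset):
    # Import inside the body so the name is bound when the function runs.
    import datetime as dt

    day = dt.date(2020, 1, 1) + dt.timedelta(days=day_offset)
    return day.strftime("%Y%m%d")


def strip_globals(func):
    # Rebuild the function with a namespace that holds only the builtins.
    # Assumption: this merely imitates a worker that lacks the module globals.
    empty_namespace = {"__builtins__": builtins.__dict__}
    return types.FunctionType(func.__code__, empty_namespace, func.__name__)


def raises_name_error(func, *args):
    # Return True if calling func(*args) raises NameError.
    try:
        func(*args)
    except NameError:
        return True
    return False
```

Locally both functions work, which is why this kind of bug tends to show up only in the remote test run: raises_name_error(strip_globals(broken_format), 2) is True, while strip_globals(fixed_format)(2) still returns a formatted date string.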

cisaacstern commented 1 year ago

@rbavery, fwiw pangeo_notebook_version is actually completely ignored at this point 😬 ... I'm working on https://github.com/pangeo-forge/meta-yaml-schema/pull/1 to make it more obvious what is required as part of meta.yaml. Apologies for the confusion!

cisaacstern commented 1 year ago

Also @rbavery seems like we've ended up with two meta.yamls, as of your last commit? That's why the synchronize CI task is now failing.

rbavery commented 1 year ago

@cisaacstern whoops, my b! I removed the bad commit and moved the pandas import into the format function. Also omitted pangeo_notebook_version from the yaml.

rbavery commented 1 year ago

Oh, but I'll also need to fix one more thing: the use of pandas at global scope needs to move into function scope.

cisaacstern commented 1 year ago

/run aws-noaa-oisst-avhrr-only

cisaacstern commented 1 year ago

Just saw your additional comment. We can let the test I just triggered fail (there's no way to interrupt it), and then start a new one once you've pushed your fix.

rbavery commented 1 year ago

I think I just needed to add a pandas import at the top level in addition to within the func. I double checked the recipe and it seems to work with a pruned run.

cisaacstern commented 1 year ago

pre-commit.ci autofix

cisaacstern commented 1 year ago

pre-commit.ci autofix

Oh, this won't work because the PR is from the DevSeed org.

cisaacstern commented 1 year ago

/run aws-noaa-oisst-avhrr-only

pangeo-forge[bot] commented 1 year ago

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors. That feature is not quite ready yet, however, so please reach out on this thread to a maintainer, and they'll help you diagnose the problem.

cisaacstern commented 1 year ago

  File "/tmp/tmpde4uf3kp/recipes/aws-noaa-oisst/recipe.py", line 21, in format_function
NameError: name 'start_date' is not defined [while running 'Start|scan_file|Reshuffle_000|finalize|Reshuffle_001/scan_file/Execute-ptransform-56']

Same issue! start_date needs to be (redundantly) defined within the function body. 😄
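
Since a triggered test run can't be interrupted, it may be worth checking locally which module-level names a function still depends on before pushing. The sketch below is one possible way to do that with the standard library's inspect.getclosurevars; it is not part of pangeo-forge, the helper name global_names is made up, and the date string is a placeholder rather than the value used in the actual recipe.

```python
import inspect

start_date = "2000-01-01"  # placeholder module-level value, not the recipe's


def uses_global(offset):
    # Reads `start_date` from module scope, like the failing function.
    return f"{start_date}+{offset}"


def self_contained(offset):
    # Redundantly defines `start_date` inside the body, as suggested above.
    start_date = "2000-01-01"
    return f"{start_date}+{offset}"


def global_names(func):
    # Names the function body references that resolve to module globals.
    return sorted(inspect.getclosurevars(func).globals)
```

Here global_names(uses_global) reports start_date, while global_names(self_contained) comes back empty. This is only a heuristic: it looks at names in the compiled function body, so attribute names that happen to match a module-level name could be flagged too.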

sharkinsspatial commented 1 year ago

/run aws-noaa-oisst-avhrr-only

pangeo-forge[bot] commented 1 year ago

:tada: The test run of aws-noaa-oisst-avhrr-only at 2bb7dbf1fd82ba39c879648f426decad2e4b1a30 succeeded!

import xarray as xr

store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1429/aws-noaa-oisst-avhrr-only.zarr"
ds = xr.open_dataset(store, engine='zarr', chunks={})
ds