leap-stc / data-management

Collection of code to manually populate the persistent cloud bucket with data
https://catalog.leap.columbia.edu/
Apache License 2.0

Adding recipe for GODAS #21

Status: Open · opened by jbusecke 1 year ago

jbusecke commented 1 year ago

Well, I hadn't seen this before... @cisaacstern

[screenshot: GitHub API rate limit error]
jbusecke commented 1 year ago

I almost got the example to run, but I am getting this error:

```
ValueError: Cannot combine fragments. Expected a hypercube of shape [3] but got 6 fragments. [while running 'Create|OpenURLWithFSSpec|OpenWithXarray|Preprocess|StoreToZarr/StoreToZarr/Rechunk/MapTuple(combine_fragments)-ptransform-31']
```

which indicates to me that the individual datasets are still not being properly concatenated/merged. @cisaacstern does this look familiar to you?
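
For reference, my rough mental model of what the shape check is complaining about (a toy counting sketch; my reading, not the actual pangeo-forge-recipes internals):

```python
from collections import Counter

# Hypothetical fragment positions along the concat dim: if the per-variable
# datasets were never merged, each of the 3 concat positions arrives twice,
# so 6 fragments cannot fill a hypercube of shape [3].
fragment_positions = [0, 0, 1, 1, 2, 2]
expected_shape = [len(Counter(fragment_positions))]
print(expected_shape, len(fragment_positions))  # [3] 6 -> mismatch
```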

cisaacstern commented 1 year ago

@jbusecke apologies for missing this notification. Yes, this seems to be the same issue as https://github.com/pangeo-forge/pangeo-forge-recipes/issues/517, which I am working on a fix for in https://github.com/pangeo-forge/pangeo-forge-recipes/pull/521.

Let's remember to retry this once that fix goes in.

cisaacstern commented 1 year ago

Also, the rate limit error above is presumably due to these lines in the action:

https://github.com/pangeo-forge/deploy-recipe-action/blob/bda89250ed9ee256af1d0cac3e951c772ce7b53b/action/deploy_recipe.py#L58-L60

We could resolve this if we reworked the action so that it does not need to call back into the GitHub API, and instead relies only on injected environment variables for the necessary context. As of this moment, I'm not sure if/how we might accomplish that. I'll open an issue on the action for that.
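
For illustration, roughly what I have in mind (a sketch only; I haven't verified that the event payload carries everything the action needs):

```python
import json
import os

# GitHub Actions injects these for every workflow run, so no API call
# (and hence no rate limit) is involved:
repo = os.environ['GITHUB_REPOSITORY']  # e.g. 'leap-stc/data-management'
with open(os.environ['GITHUB_EVENT_PATH']) as f:
    event = json.load(f)

# For pull_request events, the PR context is already in the payload:
pr_number = event.get('pull_request', {}).get('number')
head_sha = event.get('pull_request', {}).get('head', {}).get('sha')
print(repo, pr_number, head_sha)
```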

jbusecke commented 1 year ago

pre-commit.ci autofix

jbusecke commented 1 year ago

@cisaacstern if you have a minute, could you look into this error:

(from the Actions log: https://github.com/leap-stc/data-management/actions/runs/5426518167/jobs/9868667430?pr=21)

```
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/bin/pangeo-forge-runner", line 5, in <module>
    from pangeo_forge_runner.cli import main
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_runner/cli.py", line 3, in <module>
    from .commands.bake import Bake
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_runner/commands/bake.py", line 15, in <module>
    from ..storage import InputCacheStorage, MetadataCacheStorage, TargetStorage
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_runner/storage.py", line 2, in <module>
    from pangeo_forge_recipes.storage import CacheFSSpecTarget, FSSpecTarget, MetadataTarget
ImportError: cannot import name 'MetadataTarget' from 'pangeo_forge_recipes.storage' (/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/storage.py)
```

Wondering if this is related to https://github.com/pangeo-forge/pangeo-forge-recipes/pull/527 and if there is an easy way for me to circumvent this? Maybe just getting pgf-recipes from the beam-refactor branch is the solution?
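
If it helps, I mean something like this in the environment setup (a sketch; exact install spec TBD):

```
pip install git+https://github.com/pangeo-forge/pangeo-forge-recipes.git@beam-refactor
```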

cisaacstern commented 1 year ago

> Wondering if this is related to https://github.com/pangeo-forge/pangeo-forge-recipes/pull/527 and if there is an easy way for me to circumvent this? Maybe just getting pgf-recipes from the beam-refactor branch is the solution?

Yes, that's correct. It also shows that we need a better plan for integration testing of recipes + runner.

I'll take a closer look at a fix this afternoon. Thanks for raising this, Julius!

cisaacstern commented 1 year ago

Opened https://github.com/pangeo-forge/pangeo-forge-runner/issues/76 to track the issue above. Turns out I won't have time to work up a PR today, but can look at it next week.

cisaacstern commented 1 year ago

@jbusecke, following merge of https://github.com/pangeo-forge/pangeo-forge-runner/pull/77, the issue above should be fixed. Can you try re-deploying the test? (I just checked if I could, but I think I don't have permissions to do so on this repo.)

jbusecke commented 1 year ago

Rerunning now. Let me also create a dev team here and give you permissions.

jbusecke commented 1 year ago

Yay! The deployment worked. Could you try rerunning the test, just to confirm that the 'maintain' role (via the new data-management-devs team) has enough permissions to do so?

cisaacstern commented 1 year ago

Yep! I can now add/remove labels! Thanks Julius!

jbusecke commented 1 year ago

So both runs succeeded!

[screenshot: both test runs passing]

Let me try to add all the variables now.
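
For reference, the full pattern should look roughly like this (a sketch with a hypothetical year range and abbreviated variable list, not the exact recipe code):

```python
from pangeo_forge_recipes.patterns import ConcatDim, FilePattern, MergeDim

def make_full_path(variable, time):
    return f'https://downloads.psl.noaa.gov/Datasets/godas/{variable}.{time}.nc'

# Hypothetical year range and abbreviated variable list for illustration;
# the real recipe enumerates the full GODAS variable set.
years = list(range(1980, 2023))
variables = ['sshg', 'vcur']

pattern = FilePattern(
    make_full_path,
    ConcatDim('time', keys=years),
    MergeDim('variable', keys=variables),
)
```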

jbusecke commented 1 year ago

pre-commit.ci autofix

jbusecke commented 1 year ago

pre-commit.ci autofix

cisaacstern commented 1 year ago

Looking promising! 😄

jbusecke commented 1 year ago

Deployment was successful, but there is an issue with combining the surface and depth-resolved datasets. Debugging that right now.

jbusecke commented 1 year ago

Hmm, this looks like some sort of bug in the recipe to me.

This particular dataset contains surface variables like this:

[screenshot: surface variable dataset, no level dimension]

and full 3D variables like this:

[screenshot: depth-resolved variable dataset with a level dimension]

When merging these, I would expect it to just work, and in the manual case it does:

```python
import fsspec
import xarray as xr

def make_full_path(variable, time):
    return f'https://downloads.psl.noaa.gov/Datasets/godas/{variable}.{time}.nc'

# Open one depth-resolved variable (vcur) and one surface variable (sshg)
# for the same year and let xarray merge them:
with fsspec.open(make_full_path('vcur', 1980)) as f:
    with fsspec.open(make_full_path('sshg', 1980)) as f_surf:
        ds = xr.open_mfdataset([f, f_surf])
ds
```

gives

[screenshot: repr of the successfully merged dataset]

but in the Dataflow logs I am getting an error like this:

```
  ... in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 1238, in apache_beam.runners.worker.operations.PGBKCVOperation.process
  File "apache_beam/runners/worker/operations.py", line 1267, in apache_beam.runners.worker.operations.PGBKCVOperation.process
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/combiners.py", line 33, in add_input
    accumulator.add_input(schema, position)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/aggregation.py", line 70, in add_input
    self.schema = _combine_xarray_schemas(self.schema, s, concat_dim=self.concat_dim)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/aggregation.py", line 82, in _combine_xarray_schemas
    dims = _combine_dims(s1["dims"], s2["dims"], concat_dim)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/aggregation.py", line 109, in _combine_dims
    raise DatasetCombineError(f"Dimensions for {dim} have different sizes: {l1}, {l2}")
pangeo_forge_recipes.aggregation.DatasetCombineError: Dimensions for level have different sizes: 40, 0
[while running 'Create|OpenURLWithFSSpec|OpenWithXarray|Preprocess|StoreToZarr/StoreToZarr/DetermineSchema/CombineGlobally(CombineXarraySchemas)/KeyWithVoid-ptransform-48']
```

This indicates to me that somewhere in the aggregation code the surface variables either get promoted to have a scalar level dimension, or the aggregation simply cannot handle this case yet.
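
Here is a minimal illustration of my suspicion (synthetic data, not the real GODAS files; note how the '40, 0' pair from the error matches a missing level dimension being treated as size 0):

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for one surface and one depth-resolved variable:
surface = xr.Dataset({'sshg': (('time', 'lat', 'lon'), np.ones((1, 2, 2)))})
full = xr.Dataset({'vcur': (('time', 'level', 'lat', 'lon'), np.ones((1, 40, 2, 2)))})

print(full.sizes['level'])              # 40
print(surface.sizes.get('level', 0))    # 0 -> the '40, 0' pair from the error
print(xr.merge([surface, full]).sizes)  # xarray itself merges this fine
```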

@cisaacstern I think we should add this sort of pattern to the PGF-recipes tests?
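
Something like this, maybe (a sketch of the invariant only, not wired into the actual pangeo-forge-recipes test harness):

```python
import numpy as np
import xarray as xr

def test_merge_surface_with_depth_resolved():
    # A surface variable with no level dim plus a 3D variable with one:
    surface = xr.Dataset({'sshg': (('time',), np.ones(3))}, coords={'time': np.arange(3)})
    full = xr.Dataset(
        {'vcur': (('time', 'level'), np.ones((3, 40)))},
        coords={'time': np.arange(3), 'level': np.arange(40)},
    )
    merged = xr.merge([surface, full])
    # The surface variable must not be promoted to have a level dimension:
    assert merged['sshg'].dims == ('time',)
    assert merged.sizes['level'] == 40
```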