ACCESS-NRI / access-nri-intake-catalog

Tools and configuration info used to manage ACCESS-NRI's intake catalogue
https://access-nri-intake-catalog.rtfd.io
Apache License 2.0

[DATA REQUEST] Add COSIMA Panantarctic / GFDL_OM4 Builder & Data #175

Open anton-seaice opened 4 months ago

anton-seaice commented 4 months ago

Description of the data product

<Please replace this text with a description of the data product to add to the ACCESS-NRI catalog. What data does it contain? What format is it in? Who is it useful for?>

Location of the data product on Gadi

# Checklist

Add a "x" between the brackets to all that apply

- [x] This data product is stable (unlikely to change substantially or move)
- [x] This data product is of use to the broader community
- [x] This data product is documented: [link](https://github.com/NOAA-GFDL/MOM6-examples/tree/dev/gfdl/ice_ocean_SIS2/OM4_025.JRA)
- [ ] This data product is licensed under
- [x] Those who want to access this data can be added to the project that houses it
anton-seaice commented 4 months ago

Following on from https://github.com/COSIMA/cosima-recipes/pull/369, I am suggesting maybe adding `OM4_025.JRA_RYF` to the intake catalog.

@dougiesquire - As this is a different model configuration, I guess this would require a new datastore "builder", so maybe it's not worth the effort? The runs are used in cosima recipes to show examples of handling MOM6 data.

@adele-morrison - Are there companion runs to `OM4_025.JRA_RYF` that should also be added? Can you help with the "Description of the data product" and "Location of the data product on Gadi" sections, and then I will edit the original post, please?

anton-seaice commented 4 months ago

I tried using the access-om3 builder, and got these errors when using builder.parser:

```
{'INVALID_ASSET': '/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output000/19000101.ice_daily.nc',
 'TRACEBACK': 'Traceback (most recent call last):
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-24.04/lib/python3.10/site-packages/access_nri_intake/source/builders.py", line 329, in parser
    raise ParserError(f"Cannot determine realm for file {file}")
access_nri_intake.source.builders.ParserError: Cannot determine realm for file /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output000/19000101.ice_daily.nc
'}

{'INVALID_ASSET': '/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output000/19000101.ocean_daily.nc',
 'TRACEBACK': 'Traceback (most recent call last):
  File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-24.04/lib/python3.10/site-packages/access_nri_intake/source/builders.py", line 329, in parser
    raise ParserError(f"Cannot determine realm for file {file}")
access_nri_intake.source.builders.ParserError: Cannot determine realm for file /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output000/19000101.ocean_daily.nc
'}
```
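The failure mode is visible from the filenames alone: GFDL-style output names such as `19000101.ice_daily.nc` carry the realm in the stream name, which the ACCESS-OM3 builder does not expect. As a purely illustrative sketch (not the library's actual parsing code, and with a hypothetical realm table), inferring the realm from such names might look like:

```python
import re
from typing import Optional

# Hypothetical mapping from filename stream prefix to realm; the real
# access_nri_intake builders define their own tables.
REALM_PREFIXES = {"ocean": "ocean", "ice": "seaIce"}

def guess_realm(filename: str) -> Optional[str]:
    """Guess a realm from a GFDL-style '<YYYYMMDD>.<stream>.nc' filename."""
    match = re.match(r"\d{8}\.(\w+?)_\w+\.nc$", filename)
    if match is None:
        return None
    return REALM_PREFIXES.get(match.group(1))
```

A dedicated builder could use something like this where the ACCESS-OM3 parser currently raises `ParserError`.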

dougiesquire commented 4 months ago

Ah, yet another permutation of file naming. It might be safest just to write a dedicated builder, which is straightforward. I guess it would be an Om4Builder?

Is this output structured in a similar way to the regional MOM6 output? If so, it may be worth thinking about writing a builder that handles both?

adele-morrison commented 4 months ago

Apologies for being slow. Yes, let's add the panan experiments to Intake. We'd still like to delete a bunch of the daily data for the 1/20th panan - is that ok to do after it's added to Intake? After that frees up space on ol01, ideally I'd also like to move the 1/10th panan from ik11 to ol01. But the current locations are as follows:

- /g/data/ol01/outputs/mom6-panan/panant-0025-zstar-ACCESSyr2/
- /g/data/ol01/outputs/mom6-panan/panant-005-zstar-ACCESSyr2/
- /g/data/ik11/outputs/mom6-panan/panant-01-zstar-ACCESSyr2/

dougiesquire commented 4 months ago

> We'd still like to delete a bunch of the daily data for the 1/20th panan, is that ok to do after it's added to Intake?

I think if we know this is going to happen then it would be better to wait until it is done. We can get a Builder set up and ready to go though.

marc-white commented 2 months ago

@anton-seaice could you please add the precise location(s) of the data on Gadi?

anton-seaice commented 2 months ago

@adele-morrison is more on top of it than I am? Noting the comments above about possibly moving it.

/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF seems to be appropriate for getting some sample data, if that's what you're after?

adele-morrison commented 2 months ago

Yes that’s the right location. Would be great to get this in the catalog so we can keep switching all the COSIMA recipes over. What do you need in terms of documentation?

adele-morrison commented 2 months ago

I don’t think there’s any plans to move OM4_025.JRA_RYF. The panan data location is still in flux. I will try to keep that moving forward.

marc-white commented 2 months ago

OK, I'll start taking a look at the current data structure and builders to see what needs to happen to get these data ingested. Stay tuned...

marc-white commented 2 months ago

The filenames all look pretty coherent, but there are a couple of things I haven't been able to work out on my own:

minghangli-uni commented 2 months ago

> What is the 'static' frequency, e.g., 19000101.ocean_static.nc

It contains fields that do not change in time, such as grid-related data. It is saved once per run.

> 19000101.ocean_annual.nc

contains annually-averaged 2D fields

> 19000101.ocean_scalar_annual.nc

contains annually-averaged 0D fields

anton-seaice commented 2 months ago

I think we want all of those files - there is a frequency = 'fx' for the static files, which exists in the OM2 and OM3 datastores (and maybe others)
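The idea can be sketched roughly as follows; the stream suffixes and frequency strings below are assumptions for illustration, not the catalogue's actual tables:

```python
# Hypothetical mapping from a GFDL-style stream suffix to a frequency label.
# 'fx' marks static (time-invariant) files such as grid geometry, written
# once per run; the other labels are illustrative guesses.
FREQUENCY_BY_STREAM_SUFFIX = {
    "static": "fx",
    "daily": "1day",
    "month": "1mon",
    "annual": "1yr",
}

def frequency_for(filename: str) -> str:
    """Map e.g. '19000101.ocean_static.nc' to a frequency label."""
    stem = filename.rsplit(".", 2)[-2]   # -> 'ocean_static'
    suffix = stem.rsplit("_", 1)[-1]     # -> 'static'
    return FREQUENCY_BY_STREAM_SUFFIX.get(suffix, "unknown")
```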

marc-white commented 2 months ago

Ah yes, I've found the fx frequency down in the utils package - I might pull that out into a named constant so it's clearer

marc-white commented 2 months ago

Dumping this here so I can find it later (for building workable test data): https://stackoverflow.com/questions/15141563/python-netcdf-making-a-copy-of-all-variables-and-attributes-but-one

marc-white commented 2 months ago

I now have what I think is a functional AccessOm4Builder that works on /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF. Are there some other data locations that I should attack as a check?

dougiesquire commented 2 months ago

@marc-white, we definitely don't want to call this AccessOm4Builder. The "OM4" data at /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF is from GFDL OM4 (I think - @adele-morrison can you confirm?), not an ACCESS model.

I'd suggest seeing if the data mentioned in this comment can use the same builder. If so, then we could possibly call the builder Mom6Builder

marc-white commented 2 months ago

> /g/data/ol01/outputs/mom6-panan/panant-0025-zstar-ACCESSyr2/ and /g/data/ol01/outputs/mom6-panan/panant-005-zstar-ACCESSyr2/ and /g/data/ik11/outputs/mom6-panan/panant-01-zstar-ACCESSyr2/

I've updated the Builder to be able to read the filenames found in these directories. However, I've come across an interesting conundrum whilst trying to test the resulting catalog: the data in those three directories are, when ingested into the catalog, pretty much identical, to the point where I can't figure out how to get only the data from, say, 0025-zstar (without resorting to the obvious solution of building a catalog only from that directory). This is causing issues when forming a Dask array, because the catalog doesn't understand how to merge the files (I think it ends up with three 'layers' of the same time series, and bombs out).

For the uninitiated like myself, what is the difference between these three runs, and how can I differentiate between them in an intake/access-nri-intake way?

dougiesquire commented 2 months ago

@marc-white, each of the experiments should be separate intake-esm datastores within the catalog.
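The fix described here - one intake-esm datastore per experiment - can be sketched as a simple grouping by experiment directory. The helper below is illustrative only, not part of access-nri-intake; it assumes the `<experiment>/outputNNN/<file>.nc` layout seen in this thread:

```python
from collections import defaultdict
from pathlib import PurePosixPath

def group_by_experiment(files):
    """Group file paths by the experiment directory that contains them,
    assuming paths end in .../<experiment>/outputNNN/<file>.nc."""
    groups = defaultdict(list)
    for f in files:
        experiment = PurePosixPath(f).parts[-3]
        groups[experiment].append(f)
    return dict(groups)
```

Building one datastore per group means identical filenames (and time axes) in different experiments never collide in a single merge.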

marc-white commented 2 months ago

Hi @anton-seaice and @adele-morrison, I'm now at the point where I'm ready to try an all-up ingest of the data. However, the metadata.yaml for OM4_025.JRA_RYF is incomplete, and doesn't exist for the mom6-panan datasets. Could you please add one for each dataset? Instructions are here: https://access-nri-intake-catalog.readthedocs.io/en/latest/management/building.html#metadata-yaml-files

adele-morrison commented 2 months ago

I've updated the metadata.yaml for OM4_025.JRA_RYF. I think @AndyHoggANU ran it, so some of the entries are currently just me guessing what the simulation is.

We're not quite ready to add the panan simulations ending in zstar-ACCESSyr2 to Intake yet (as above), because we still need to delete a bunch of that data and shift the 1/10th deg to ol01.

But we could add /g/data/ik11/outputs/mom6-panan/panant-01-zstar-v13 and panant-01-hycom1-v13 to Intake now.

adele-morrison commented 2 months ago

@AndyHoggANU any chance you want to create the metadata.yamls for panant-01-zstar-v13 and panant-01-hycom1-v13? Or @julia-neme perhaps you could do the panant-01-zstar-v13 one? That's what you used in your paper right?

adele-morrison commented 2 months ago

I've confirmed with @AndyHoggANU and metadata.yaml for OM4_025.JRA_RYF is good to go.

AndyHoggANU commented 2 months ago

OK, there are metadata.yaml files for both panant experiments now.

marc-white commented 1 month ago

I'm about ready to try testing a catalog creation - @dougiesquire what's the best practice for getting this set up?

dougiesquire commented 1 month ago

@marc-white I think you should be able to use the build_all.sh script. If you want to just build a test version with only a reduced set of products, you can modify:

marc-white commented 1 month ago

@AndyHoggANU painfully small issue with the metadata you've provided for those experiments - the keyword note needs to be notes instead. Could you please update?

AndyHoggANU commented 1 month ago

Done!

marc-white commented 1 month ago

Thanks @AndyHoggANU . Next issue: the resolution information needs to be an array (i.e., a bullet point list). So:

```yaml
nominal_resolution:
  - 0.1 degrees
```

instead of `nominal_resolution: 0.1 degrees`.

In the access_med_0.6 conda environment, there is a metadata-validate utility that can be used to do further checks on your metadata files:

```bash
module use /g/data/xp65/public/modules
module load conda/access-med-0.6
metadata-validate </path/to/metadata.yaml>
```

I'm finding this sequential fixing annoying, so while you work on that, I'm going to add a feature request (#216) to modify the validator to provide a report on all problems, rather than bailing on the first one it finds.
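Putting the fixes in this thread together, a hypothetical metadata.yaml fragment might look like the following. Only fields mentioned in this thread are shown, all values are placeholders, and the linked building docs remain the authoritative schema:

```yaml
# Hypothetical sketch only - consult the access-nri-intake-catalog docs
# for the full, authoritative metadata.yaml schema.
name: panant-01-hycom1-v13
experiment_uuid: f47ac10b-58cc-4372-a567-0e02b2c3d479  # from `uuidgen`; must be unique per experiment
notes: <free-form notes; the key is "notes", not "note">
nominal_resolution:        # must be an array, even for a single value
  - 0.1 degrees
```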

AndyHoggANU commented 1 month ago

OK, I've done the one in panant-01-hycom1-v13 - there is still a metadata-validate error, but I couldn't interpret it!! Any suggestions?

Note that I couldn't fix the panant-01-zstar-v13/metadata.yaml as it's owned by @julia-neme .

julia-neme commented 1 month ago

I've modified the resolution to be as a bullet point and note -> notes. Let me know if you need anything else!

AndyHoggANU commented 1 month ago

And it looks like all 3 are validating :-)

marc-white commented 1 month ago

They are! However, they're now failing in concert - the experiments panant-01-hycom1-v13 and panant-01-zstar-v13 are sharing an experiment_uuid. These need to be unique.

julia-neme commented 1 month ago

Mmm I thought I'd created one by running uuidgen. How can I verify that?

marc-white commented 1 month ago

@julia-neme that's the correct process, but every metadata.yaml in the access-nri-intake-catalog needs to have a unique UUID. The easiest solution would be to just use uuidgen to create a new UUID for one of the experiments.

(The UUID holds no significance other than being an (almost certainly) unique identifier. There's no need to match UUID between experiments, make sure certain values are there, etc., so long as the value is a valid UUID.)
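The uniqueness requirement can be sketched with the standard library; the check below is illustrative, and the experiment-to-UUID mapping is hypothetical:

```python
import uuid

def find_duplicate_uuids(uuids_by_experiment):
    """Return the set of UUID strings shared by more than one experiment."""
    seen, duplicates = {}, set()
    for experiment, value in uuids_by_experiment.items():
        canonical = str(uuid.UUID(value))  # also validates the format
        if canonical in seen:
            duplicates.add(canonical)
        seen[canonical] = experiment
    return duplicates

def fresh_uuid():
    """Equivalent of running `uuidgen`: a new random (version 4) UUID."""
    return str(uuid.uuid4())
```

Copying a metadata.yaml between experiments (as happened here) is the typical way a duplicate sneaks in; regenerating with `fresh_uuid()` (or `uuidgen`) for one of them resolves it.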

julia-neme commented 1 month ago

Yeah, but I thought I had created one! How can I verify whether I have created it or not? In other words, how can I check that the uuid for that experiment matches the metadata?

adele-morrison commented 1 month ago

Possibly did @AndyHoggANU just copy your uuid @julia-neme ?

marc-white commented 1 month ago

@julia-neme I don't believe the experiment has an intrinsic UUID of its own (unless I'm missing something). We just have uuidgen pump out random UUID values and attach them to experiments to give us a unique identifier for use within the catalogue.

anton-seaice commented 1 month ago

For new experiments, Payu handles creating the UUID & probably includes it in the path.

This only was implemented in the last couple of releases, see https://payu.readthedocs.io/en/stable/config.html#experiment-tracking

julia-neme commented 1 month ago

So @marc-white you want me to run uuidgen again? I don't want to mess up anything, that's why I'm asking.

marc-white commented 1 month ago

@julia-neme yes, please do. I don't see an intrinsic UUID anywhere in there.

julia-neme commented 1 month ago

Done! Let me know if it is not working now :)

marc-white commented 1 month ago

Looks like the system is happy with the new metadata files! It's now barfing trying to ingest the actual data, but that's a me problem :)

AndyHoggANU commented 1 month ago

Yeah, I reckon that was my fault in copying @julia-neme 's metadata file, sorry!

marc-white commented 1 month ago

@rbeucher I'm hitting an issue when trying to build a test catalog via Gadi using the build_all.sh script with the access-med-0.6 environment:

```
access_nri_intake.source.builders.ParserError: Parser returns no valid assets.
            Try parsing a single file with Builder.parser(file)
            Last failed asset: /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output094/19940101.ocean_static.nc
            Asset parser return: {'INVALID_ASSET': '/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output094/19940101.ocean_static.nc',
'TRACEBACK': 'Traceback (most recent call last):
  File "/home/120/mcw120/access-nri/access-nri-intake-catalog/src/access_nri_intake/source/builders.py", line 594, in parser
    ) = cls.parse_access_ncfile(file)
  File "/home/120/mcw120/access-nri/access-nri-intake-catalog/src/access_nri_intake/source/builders.py", line 299, in parse_access_ncfile
    with xr.open_dataset(
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.6/lib/python3.10/site-packages/xarray/backends/api.py", line 588, in open_dataset
    backend_ds = backend.open_dataset(
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.6/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 645, in open_dataset
    store = NetCDF4DataStore.open(
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.6/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 375, in open
    import netCDF4
  File "/g/data/xp65/public/apps/med_conda/envs/access-med-0.6/lib/python3.10/site-packages/netCDF4/__init__.py", line 3, in <module>
    from ._netCDF4 import *
ImportError: libmpi.so.40: cannot open shared object file: No such file or directory
'}
```

I don't have this problem when running `Mom6Builder.parser` on this particular file in a Jupyter notebook using the same environment. Any thoughts?
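One way to narrow a notebook-vs-batch discrepancy like this down (a generic sketch, not an access-nri-intake tool) is to run the same import in both environments and compare the error messages:

```python
import importlib

def try_import(name):
    """Return (ok, error_message) for importing `name` in this environment."""
    try:
        importlib.import_module(name)
        return True, None
    except ImportError as exc:
        return False, str(exc)
```

Running e.g. `try_import("netCDF4")` inside the PBS job would reproduce (or rule out) the shared-library failure without waiting for a full catalog build.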

rbeucher commented 1 month ago

Have you added ik11 to the PBS storage flag?

marc-white commented 1 month ago

Yes, the test build script includes access to ik11:

```bash
#!/bin/bash -l

#PBS -P iq82
#PBS -l storage=gdata/tm70+gdata/xp65+gdata/ik11+scratch/tm70+gdata/al33+gdata/rr3+gdata/fs38+gdata/oi10
#PBS -q normal
#PBS -W block=true
#PBS -l walltime=01:00:00
#PBS -l mem=96gb
#PBS -l ncpus=24
#PBS -l wd
#PBS -j oe
```

rbeucher commented 1 month ago

> ImportError: libmpi.so.40: cannot open shared object file: No such file or directory

Seems to be the problem. I had that before. It can be a temporary issue with the filesystem. I think @rhaegar325 had that recently.

marc-white commented 1 month ago

@rbeucher I wonder if this is a file permissions error. The file it complains about:

```
[mcw120@gadi-login-03 access-nri]$ ls -l /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output094/19940101.ocean_static.nc
-rw-r-----+ 1 amh157 ik11 16343713 Nov 20  2021 /g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/output094/19940101.ocean_static.nc
```

Note that only group members can read the file. I'm a group member of ik11, but what group(s) is my job considered to be in after qsub?
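The group-readability half of this question can be checked from Python (an illustrative sketch only; whether PBS actually propagates group membership is the open question here):

```python
import os
import stat

def group_can_read(path):
    """True if the file grants group read AND this process is in the
    file's group (or owns files with that gid)."""
    st = os.stat(path)
    has_group_read = bool(st.st_mode & stat.S_IRGRP)
    in_file_group = st.st_gid in os.getgroups() or st.st_gid == os.getgid()
    return has_group_read and in_file_group
```

Running this on the failing file inside the batch job, and comparing `os.getgroups()` there against `id -Gn` on a login node, would show whether the job lost the ik11 group.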

rbeucher commented 1 month ago

It's possible indeed. Are you a member of ik11? All your group memberships should be passed to PBS, but there might be something weird going on.