ACCESS-NRI / access-nri-intake-catalog

Tools and configuration info used to manage ACCESS-NRI's intake catalogue
https://access-nri-intake-catalog.rtfd.io
Apache License 2.0
8 stars 1 forks source link

Switch to a date-based catalog versioning system, and related updates #243

Closed marc-white closed 1 week ago

marc-white commented 2 weeks ago

Closes #191 .

This PR significantly alters the way the catalog generated by access-nri-intake-catalog is handled, stored, and versioned.

Catalog storage location

The live catalog will now be stored on gdata/xp65 (access_nri_intake.CATALOG_LOCATION), rather than being shipped with the package. This is because all of the catalog data lives on Gadi anyway, so there doesn't seem to be much point to being able to see the catalog definition YAML, but not be able to access any data.

However, to support power users and local developers, there is the option to place a catalog.yaml file at a defined location in the user's home directory (access_nri_intake.USER_CATALOG_LOCATION). This catalog will automatically take precedence over the 'live' catalog on Gadi, if it exists. I thought about adding utility functions that would allow users to create this directory, put a catalog.yaml in it, etc.; however, given that it's a power user move, I decided it's easier and safer to have the user do that themselves manually.

Versioning

Catalog versions will now be date-based, e.g., a catalog built today will be, by default, v2024-11-07. Attempts to set a version number that doesn't conform to this pattern will raise an exception.

catalog.yaml will now contain a min and max version. This is to cover the possibility that the catalog structure may change, and a particular version of catalog.yaml may not be compatible with certain catalog versions. I've confirmed that doing the usual < and > operations on our version strings has the expected output.

Under this versioning schema, there isn't really a need to have a symlinked latest version of the catalog (and also because we can update the live catalog on xp65 at will, without doing a code release).

Building the catalog

Because catalog.yaml will be placed on xp65 now, there is no need to make a new code release for every catalog build.

As mentioned above, a catalog built today will take today's date as the default version, although the user can override it.

The build process is now intelligent to the presence of older versions of catalog.yaml, and to directories that look like older catalogs:

Documentation

To be updated once we're happy with the core structure of this update. The updates required will be:

Testing

All of the above should have at least one unit test.

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 98.83721% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.75%. Comparing base (36b8140) to head (6cd9393). Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/access_nri_intake/cli.py 97.50% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #243 +/- ## ========================================== + Coverage 96.53% 97.75% +1.22% ========================================== Files 9 10 +1 Lines 721 803 +82 ========================================== + Hits 696 785 +89 + Misses 25 18 -7 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

rbeucher commented 2 weeks ago

OK. What's the best way for us to test this on GADI?

marc-white commented 2 weeks ago

OK. What's the best way for us to test this on GADI?

The way I've done this in the past (e.g. to test #175 updates):

  1. Check out the code onto Gadi, and do a local install of the package (non-destructive override of the environment version)
  2. Copy the build_all.sh script, and modify it to point to a dummy directory, a subset of the data sources, etc.
  3. Run the copied build process as per the documentation
  4. Scream incoherently when it fails (optional), work out why it fails, run it again
  5. Repeat 3-4

The main thing to check would be to copy the existing catalog folders (e.g. v0.1.3) to the dummt directory, create symlinks to them using the date catalog format, and make sure the build process picks those up as previous versions and sets the catalog min version correctly.

rbeucher commented 2 weeks ago

Here is my first try.

I have installed a local version in my tm70 directory

 catalog-build --build_base_path /g/data/xp65/admin/intake /g/data/xp65/admin/access-nri-intake-catalog/config/cmip6.yaml 
2024-11-07 17:00:28,933 - access_nri_intake.cli - INFO - Adding 'cmip6_fs38' to metacatalog '/g/data/xp65/admin/intake/v2024-11-07/metacatalog.csv'
2024-11-07 17:00:35,879 - access_nri_intake.cli - INFO - Adding 'cmip6_oi10' to metacatalog '/g/data/xp65/admin/intake/v2024-11-07/metacatalog.csv'
Traceback (most recent call last):
  File "/g/data/tm70/rb5533/miniforge3/envs/intake/bin/catalog-build", line 8, in <module>
    sys.exit(build())
             ^^^^^^^
  File "/g/data/tm70/rb5533/PROJECTS/access-nri-intake-catalog/src/access_nri_intake/cli.py", line 249, in build
    yaml_old["sources"]["access_nri"]["parameters"]["version"]["min"]
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
KeyError: 'min'

Is it because the cmip6 config has more than one source?

rbeucher commented 2 weeks ago

I have just tried with only one source in cmip6.yml same pb

marc-white commented 2 weeks ago

Do you have an existing catalog.yaml in the build directory? If so, it's going to look for a min value in that, and it appears to be bombing out because it can't find one. The assumption I made is that the first catalog build in the 'live' directory on xp65 (and, therefore, the first build in your dummy area) will be the first catalog.yaml to appear there, so it will inspect the present directories to compute a min and max version instead.

rbeucher commented 2 weeks ago

I have tried with an empty folder and I still get the issue...

marc-white commented 2 weeks ago

OK I'll take a look and see what's happening...

marc-white commented 2 weeks ago

Found the problem - the code is building the catalog in one place, but it's looking for the existing catalog (and, it seems, building the catalog.yaml back out to) the live location referred to by the CATALOG_LOCATION variable. I need to have a quick think about how to rectify that.

marc-white commented 2 weeks ago

@rbeucher OK, check out the latest code version and try again. You'll need to update your test version of the build_all.sh script to include the new --catalog_base_path option (see the canonical build_all.sh in the repo).

The issue was that, while changing the build_base_path option in the build script lets you stick the catalog data in a different place, the catalog yaml was always being forced into either live location on xp65, or into your home area if you have a local catalog.yaml set up. I didn't pick this up because I mocked get_catalog_fp() in most of the tests, so I never saw the effect.

I've now added an extra option to the build() function call, as well as some additional supporting changes, that let you set where the catalog.yaml lives. In production, this will be the same place as the catalog data, hence why the canonical build_all.sh script sets --build_base_path and --catalog_base_path to be the same.

charles-turner-1 commented 1 week ago

Running a build test right now - will comment if I have any issues

charles-turner-1 commented 1 week ago

@rbeucher It looks like you might have overwritten a default catalog location or similar on Friday? I'm currently getting the following error - are you able to take a look & let me know if the file in the error message is one you recognise?

>>> import intake
>>> intake.cat.access_nri
/g/data/hh5/public/apps/miniconda3/envs/analysis3-24.07/lib/python3.10/importlib/__init__.py:126](https://are.nci.org.au/g/data/hh5/public/apps/miniconda3/envs/analysis3-24.07/lib/python3.10/importlib/__init__.py#line=125): RuntimeWarning: Unable to access a default catalog location. Calling intake.cat.access_nri will not work.
  return _bootstrap._gcd_import(name[level:], package, level)
access_nri:
  args: {}
  description: ''
  driver: intake.catalog.base.Catalog
  metadata: {}

>>> from access_nri_intake.utils import get_catalog_fp
>>> get_catalog_fp()
'/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml'
>>> intake.open_catalog('/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml').access_nri
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[9], line 1
----> 1 intake.open_catalog('[/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml](https://are.nci.org.au/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml)').access_nri
...
FileNotFoundError: [Errno 2] No such file or directory: '[/g/data/tm70/rb5533/intake_tests/v0.1.4](https://are.nci.org.au/g/data/tm70/rb5533/intake_tests/v0.1.4)+13.g869f576.dirty[/metacatalog.csv](https://are.nci.org.au/metacatalog.csv)'

I've run a catalog build with just CMIP5 enabled which built without any issues - I'm just running into this slightly weird error trying to import it.

rbeucher commented 1 week ago

Yes. Sorry, my bad. It looks like I have mess up with the file

charles-turner-1 commented 1 week ago

No worries. I'm looking into it more closely now - probably something we're going to want to have some guard rails against.

charles-turner-1 commented 1 week ago

I've restored the file to default to v0.1.3 for now.

marc-white commented 1 week ago

@charles-turner-1 I think you might have to hack the code to open your existing catalog (or, better yet, put it into ~/.access_nri_intake_catalog/) - the code only knows to look for a catalog to open at either CATALOG_LOCATION = "/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml" or USER_CATALOG_LOCATION = Path.home() / ".access_nri_intake_catalog/catalog.yaml".

We should probably look at adding a path option to intake.cat.access_nri to see if we can make that more flexible.

charles-turner-1 commented 1 week ago

Sorry @marc-white, just saw your comment - the default catalog was pointing at a nonexistent file. My catalog in .access_nri_intake_catalog/catalog.yaml worked just fine.

I've approved but I think before merging we want to update docs?

marc-white commented 1 week ago

Yes, a docs update is definitely required. I can get started on that today.

marc-white commented 1 week ago

I've added the necessary documentation, take a peek...

marc-white commented 1 week ago

Other than the two minor typos, documentation changes look correct to me.

I updated cli.py::build if update: ... clause on my local machine to make it a bit more readable to check the docs - are you happy for me to push the changes to the branch?

Yep push the changes

charles-turner-1 commented 1 week ago

@marc-white Can you confirm the shorthand variable names I used in 5c80be3 aren't misleading? Specifically driver_new, etc..

Other than that I think this is good to go.

marc-white commented 1 week ago

@marc-white Can you confirm the shorthand variable names I used in 5c80be3 aren't misleading? Specifically driver_new, etc..

Looks reasonable to me!

marc-white commented 1 week ago

I'm going to get onto Gadi now and create symlinks for the existing catalog versions, so they get picked up the first time we generate a 'new-style' catalog.