COSIMA / cosima-recipes

A cookbook of recipes (i.e., examples) for analysing ocean and sea ice model output
https://cosima-recipes.readthedocs.io
Apache License 2.0

Add comment on `compat="override"` needed for sea ice variable in Intake Tutorial #460

Open adele-morrison opened 1 month ago

adele-morrison commented 1 month ago

Apparently compat="override" is needed when opening sea ice data using Intake to deal with CICE coordinates better. It would be useful to add a comment into ACCESS-NRI_Intake_Catalog.ipynb to explain this.
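For readers unfamiliar with where the keyword goes, here is a minimal sketch, assuming the catalog entry behaves like an intake-esm datastore (which exposes `search` and `to_dask`, and forwards `xarray_combine_by_coords_kwargs` to xarray). The helper name `open_sea_ice` and the constant `SEA_ICE_COMBINE_KWARGS` are hypothetical, and pairing `compat="override"` with `data_vars="minimal"` and `coords="minimal"` is an assumption, not something stated in this thread.

```python
# Keyword arguments forwarded to xarray.combine_by_coords.
# compat="override" makes xarray skip comparing non-concatenated variables
# and take them from the first file instead.
# Assumption: data_vars/coords="minimal" are companions, not named in the issue.
SEA_ICE_COMBINE_KWARGS = dict(
    compat="override",
    data_vars="minimal",
    coords="minimal",
)


def open_sea_ice(datastore, variable):
    """Hypothetical helper: search an intake-esm style datastore and open
    the matching files lazily as a single xarray dataset."""
    return datastore.search(variable=variable).to_dask(
        xarray_combine_by_coords_kwargs=SEA_ICE_COMBINE_KWARGS
    )
```

With a real catalog this would be called with the experiment's datastore and a sea ice variable name, for example `open_sea_ice(catalog["some_experiment"], "aice_m")`, where both names are placeholders.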

navidcy commented 1 month ago

@anton-seaice could you help with that?

anton-seaice commented 1 month ago

There is some coverage at

https://cosima-recipes.readthedocs.io/en/latest/Tutorials/ACCESS-NRI_Intake_Catalog.html#1.-Speeding-up-opening-your-datasets

Do we want it more explicit?

adele-morrison commented 1 month ago

Yeah I guess I was thinking under that section (or elsewhere), we could explicitly mention that this should (always?) be used for sea ice variables.

Thomas-Moore-Creative commented 1 month ago

@adele-morrison , @anton-seaice , @navidcy et al.

Collectively we're understanding the importance of problem-specific or dataset-specific xarray_kwarg settings. Can we crowdsource best practice by adding a single "kitchen sink" json file for all tweaks and settings?

Is a "good enough" next step:

Nothing fancy but could be maintained by the community?

Thoughts on this?
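To make the "kitchen sink" idea concrete, here is a rough sketch of what such a JSON file and a lookup function might look like. Everything in it is an assumption for illustration: the file layout, the realm keys, the `note` field and the function name `kwargs_for` are hypothetical and not an existing COSIMA or ACCESS-NRI convention.

```python
import json

# Hypothetical contents of a community-maintained settings file,
# keyed by realm. Only the sea ice entry reflects this issue.
SETTINGS_JSON = """
{
  "default": {},
  "seaIce": {
    "xarray_combine_by_coords_kwargs": {"compat": "override"},
    "note": "See cosima-recipes issue 460: helps with CICE coordinates."
  }
}
"""


def kwargs_for(realm, settings_text=SETTINGS_JSON):
    """Return the keyword arguments recorded for a realm.

    Falls back to the "default" entry when the realm has no settings.
    The free-text "note" is documentation only and is stripped out.
    """
    settings = json.loads(settings_text)
    entry = dict(settings.get(realm, settings["default"]))
    entry.pop("note", None)
    return entry
```

A user would then unpack the result into their own open call, for example `datastore.to_dask(**kwargs_for("seaIce"))`.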

navidcy commented 3 weeks ago

It sounds good but I don't understand tbh what it involves, how it will work, how fragile it would be, and what it would involve from the user's side.

navidcy commented 3 weeks ago

> There is some coverage at
>
> https://cosima-recipes.readthedocs.io/en/latest/Tutorials/ACCESS-NRI_Intake_Catalog.html#1.-Speeding-up-opening-your-datasets
>
> Do we want it more explicit?

@anton-seaice at the moment this is under the section "Speeding up" which doesn't sound imperative for users to do. But @adele-morrison is implying that this has to be done. If that's so, then let's write it explicitly somewhere else also?

Have I understood correctly?

(Even if adding compat="override" is not something that "has to be done" but helps 99.99% of the time, it's good to state it explicitly to users as a "rule" so that they struggle less.)

anton-seaice commented 3 weeks ago

Yes correct.

It might be good to encourage teaching of principles rather than rules, as we all look at data that isn't ACCESS-OM2. It's not totally risk free to just always use these keywords, in case the data you are loading is not well curated and the keywords end up stopping xarray from doing checks that would give a useful warning. Also, hopefully this problem will get handled better in CICE6/OM3 output.
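One way to read the "principle" here: `compat="override"` tells xarray to skip the comparison and take the variable from the first dataset, so it is only safe when the coordinates really do agree across files. The sketch below is a plain-Python stand-in for that check, using flat lists of coordinate values sampled from a few files. The function names and the tolerance are hypothetical, and in practice the comparison would be done on the arrays xarray loads.

```python
import math


def coords_agree(first, other, abs_tol=1e-6):
    """True when two flat coordinate sequences match within abs_tol."""
    return len(first) == len(other) and all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(first, other)
    )


def choose_compat(coord_samples, abs_tol=1e-6):
    """Pick a compat setting from coordinate samples of several files.

    Returns "override" only when every sample agrees with the first one;
    otherwise returns "no_conflicts" so that xarray keeps its checks.
    """
    first = coord_samples[0]
    if all(coords_agree(first, c, abs_tol) for c in coord_samples[1:]):
        return "override"
    return "no_conflicts"
```

Both return values are valid `compat` options in xarray. The point of the sketch is only that the decision to override is made after looking at the data, not by default.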

navidcy commented 3 weeks ago

True true!

Thomas-Moore-Creative commented 3 weeks ago

> It might be good to encourage teaching of principles rather than rules

Can it be both? "teaching principles" in tutorial notebooks but abstracting the details out of the way in well documented functions that take "rule" based settings from curated, community-built config files?
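As a rough illustration of "both", a wrapper could apply a curated rule while logging what it did and still letting the user override it. This is a sketch under assumptions: the `RULES` table, the realm key and the function name `combine_kwargs` are hypothetical, and in the proposal the rules would be read from a community config file rather than hard-coded.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical curated rules, keyed by realm.
RULES = {"seaIce": {"compat": "override"}}


def combine_kwargs(realm, rules=RULES, **user_overrides):
    """Return combine keyword arguments for a realm.

    The curated rule is applied first and reported through logging, so the
    user can see which checks are being relaxed. Explicit user arguments
    always take precedence over the rule.
    """
    rule = rules.get(realm, {})
    if rule:
        logger.info("Applying rule for realm %r: %s", realm, rule)
    return {**rule, **user_overrides}
```

For example, `combine_kwargs("seaIce")` returns the rule unchanged, while `combine_kwargs("seaIce", compat="no_conflicts")` lets a user who distrusts the data restore xarray's default check.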

_some_random_config.yaml_ ( that does not directly address compat="override" kwarg )

catalog_search_query_dict:
  ACCESS_ESM15:
    all_ocean:
      realm: ['ocean','ocnBgchem']
      source_id: 'ACCESS-ESM1-5'
    MY_PROJECT:
      experiment_id: ['historical','piControl','ssp126','ssp370','ssp585']
      source_id: 'ACCESS-ESM1-5'
      variable_id: ['intpp','thetao']
      realm: ['ocean','ocnBgchem']
      frequency: 'mon'
      file_type: 'l'
chunking:
  ACCESS_ESM15_2D: #{'chunks':{'member':1,'time':220,'j':300,'i':360}}
    chunks:
      member: 1
      time: 220
      i: 360
      j: 300
  ACCESS_ESM15_3D: #{'chunks':{'member':?,'time':?,'lev':-1,'j':-1,'i':-1}}
    chunks:
      member: 1
      time: 12
      lev: -1
      i: -1
      j: -1

Thomas-Moore-Creative commented 3 weeks ago

> It sounds good but I don't understand tbh what it involves, how it will work, how fragile it would be, and what it would involve from the user's side.

I'm wondering out loud here partly to have others tell me I'm pointed in the wrong direction (or not).

IMO this could involve:

Users could choose to: