ACCESS-NRI / access-nri-intake-catalog

Tools and configuration info used to manage ACCESS-NRI's intake catalogue
https://access-nri-intake-catalog.rtfd.io
Apache License 2.0

Build variable suggester tool #26

Open dougiesquire opened 1 year ago

dougiesquire commented 1 year ago

One current difficulty for new users of the ACCESS-NRI catalog is that variable names are taken directly from the model output, so the same variable can appear under different names in the catalog. The task of translating variables to a common vocabulary is probably too difficult/large and probably would not even be wanted by most users (many users will know the name of the variable they're looking for). Instead, we could build a tool that recommends synonym variables. E.g.

>>> synonym_variables("sst")
You might also be interested in variables named: tos, ...

This tool could learn from the datastores (e.g. using standard_names and long_names) in the catalog as they're added.

paolap commented 3 months ago

"The task of translating variables to a common vocabulary is probably too difficult/large and probably would not even be wanted by most users" This is exactly what ACCESS-MOPPeR does as we do need that mapping for post-processing the variables, so it can be reused when creating intake catalogues. I just added and extra "intake' option for MOPPeR so now it can create a mapping file(that is needed for the post-processing step) and an intake catalogue which lists the simulation files as a multi-variable catalogue file with extra lines for each variable that can be mapped and/or has a standard_name. Happy to show a demo. You wouldn't want the exact same behavior but I could add an intake-nri template to my tool so it could also produce an nri-style esm-catalog

dougiesquire commented 3 months ago

That sounds neat @paolap. Having the variable mappings would be very helpful for users.