ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Add a `site` option to the `get_config_user` command #1706

Open remi-kazeroni opened 2 years ago

remi-kazeroni commented 2 years ago

Is your feature request related to a problem? Please describe. New users who configure the tool on a supported machine with `esmvaltool config get_config_user` often get confused about which lines to uncomment to set the correct `rootpath` and `drs`. Would it be a good idea to add an option, say `--site`, to the command so that the correct `rootpath` and `drs` are set automatically for the chosen site? For example, `esmvaltool config get_config_user --site DKRZ` would return a config file with the `rootpath` and `drs` for DKRZ uncommented. If no option is passed, the default would be to copy the entire `config-user.yml` as before.

Would you be able to help out? Would you have the time and skills to implement the solution yourself? Yes, I can try to do that. My question is:

  1. Would it be better to have shorter versions of each config-user file containing only the rootpath and drs for one site? So when a user does esmvaltool config get_config_user --site DKRZ, this would copy ~/ESMValCore/esmvalcore/config-user-DKRZ.yml to ~/.esmvaltool/config-user.yml?
  2. Or would it be preferable to stay with one global config-user.yml that is copied into ~/.esmvaltool/config-user.yml with the DKRZ rootpath and drs if the command used is esmvaltool config get_config_user --site DKRZ?
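As a rough illustration of option 2, the sketch below enables the block for one site inside a single template. It is not existing ESMValCore code: the helper name `uncomment_site` is hypothetical, the Jasmin path is a placeholder, and the rule used to tell disabled settings (`#key:` or `#  nested:`) apart from prose comments (`# text`) is an assumption based on the template excerpt quoted later in this thread.

```python
import re

# Assumed header format, taken from the excerpt quoted later in the thread.
SITE_HEADER = re.compile(r"^# Site-specific entries: (?P<site>.+?)\s*$")
# Assumption: "#key" or "#  nested" is disabled YAML, "# text" is a prose comment.
DISABLED_SETTING = re.compile(r"^#(\S| {2,})")

TEMPLATE = """\
output_dir: ~/esmvaltool_output

# Site-specific entries: DKRZ-Levante
# Uncomment the lines below to locate data on Levante at DKRZ.
#rootpath:
#  CMIP6: /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ

# Site-specific entries: Jasmin
#rootpath:
#  CMIP6: /path/to/jasmin/CMIP6
"""


def uncomment_site(template: str, site: str) -> str:
    """Return the template with the disabled settings of one site enabled."""
    out = []
    in_site = False
    for line in template.splitlines():
        header = SITE_HEADER.match(line)
        if header:
            # Entering a site block; only the requested one gets enabled.
            in_site = header.group("site") == site
        elif not line.strip():
            # A blank line ends the current site block.
            in_site = False
        elif in_site and DISABLED_SETTING.match(line):
            # Drop only the "#", so the YAML indentation stays intact.
            line = line[1:]
        out.append(line)
    return "\n".join(out) + "\n"
```

Calling `uncomment_site(TEMPLATE, "DKRZ-Levante")` leaves the prose comments and the Jasmin block untouched and removes the leading `#` from the DKRZ settings only.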

schlunma commented 2 years ago

I really like this idea!

In my opinion option 2 is better since it avoids duplication (we would need to change many files if we decide to modify keys in the config-user file!). Would it also be an idea to have one "generic" config-user file that includes all keys and a generic entry for rootpath and drs (without all the commented site-specific information), and then multiple site-specific files (e.g., dkrz.yml) that only include the rootpath and drs entries? That way we do not duplicate anything.
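A minimal sketch of how a generic config and a site-specific file could be combined, assuming both have already been loaded into plain dictionaries. The function `merge_config` and the example values are illustrative only and not part of ESMValCore.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested mappings such as rootpath and drs key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative content of a generic config-user file.
GENERIC = {
    "output_dir": "~/esmvaltool_output",
    "rootpath": {"default": "~/climate_data"},
    "drs": {"CMIP6": "default"},
}

# Illustrative content of a site file such as dkrz.yml: only rootpath and drs.
DKRZ = {
    "rootpath": {"CMIP6": "/work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ"},
    "drs": {"CMIP6": "DKRZ"},
}

config = merge_config(GENERIC, DKRZ)
```

With this layout, a new key only has to be added to the generic file, and each site file stays limited to its own `rootpath` and `drs` entries.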

valeriupredoi commented 2 years ago

I too like this and like option numero deux better too, good call, Remi! :beer:

zklaus commented 2 years ago

I definitely like the sentiment; however, I think the real solution would be different. We should have a kind of "hierarchical" configuration system, as is common in most software packages, with an ordered list of places where configuration can live (typically something like `$CONDA_PREFIX/etc/esmvaltool`, `/etc/esmvaltool/`, `~/.config/esmvaltool`, ..., environment variables). Then site admins can place the relevant bits in the right place, we can slim down the configuration the user has to deal with, and we increase forward compatibility, because settings that are usually not maintained by the user become easier to change.

I am not sure to which degree the "experimental" config interface that @Peter9192 or @stefsmeets started some time ago already moves in that direction. We might also lean on/be inspired by the way Dask does configuration, or other packages.
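The hierarchical lookup described above might be sketched as follows. Everything here is an assumption made for illustration: later locations override earlier ones, the files are JSON only to keep the sketch within the standard library (ESMValTool itself uses YAML), and the `ESMVALTOOL_` environment prefix and `config.json` file name are hypothetical.

```python
import json
import os
from pathlib import Path

ENV_PREFIX = "ESMVALTOOL_"  # hypothetical prefix for environment overrides


def config_dirs(env) -> list:
    """Return the search locations, lowest priority first."""
    dirs = []
    if env.get("CONDA_PREFIX"):
        dirs.append(Path(env["CONDA_PREFIX"]) / "etc" / "esmvaltool")
    dirs.append(Path("/etc/esmvaltool"))
    dirs.append(Path(env.get("HOME", "~")).expanduser() / ".config" / "esmvaltool")
    return dirs


def load_config(env=None, filename: str = "config.json") -> dict:
    """Layer the config files that exist, then apply environment variables."""
    env = os.environ if env is None else env
    config = {}
    for directory in config_dirs(env):
        path = directory / filename
        if path.is_file():
            # Shallow update: a later location replaces whole top-level keys.
            config.update(json.loads(path.read_text()))
    for name, value in env.items():
        if name.startswith(ENV_PREFIX):
            # ESMVALTOOL_OUTPUT_DIR becomes the key "output_dir".
            config[name[len(ENV_PREFIX):].lower()] = value
    return config
```

In this arrangement a site admin would maintain the file under `/etc/esmvaltool` or the conda prefix, and a user would only add the few settings they want to change.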

bouweandela commented 2 years ago

See https://github.com/ESMValGroup/ESMValCore/issues/795 and the issues linked in it for some previous ideas on this topic. A related idea (maybe the next step), was to find out from the hostname what cluster someone is on and automatically use the right configuration.
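Detecting the site from the hostname could look roughly like the sketch below. The hostname patterns are placeholders chosen for illustration, not a verified list of cluster hostnames, and `detect_site` is a hypothetical helper.

```python
import socket
from fnmatch import fnmatchcase

# Hypothetical hostname patterns; real clusters may need different ones.
SITE_PATTERNS = {
    "DKRZ": ["*.dkrz.de"],
    "Jasmin": ["*.jasmin.ac.uk"],
}


def detect_site(hostname=None):
    """Return the first site whose pattern matches the hostname, else None."""
    hostname = (hostname or socket.getfqdn()).lower()
    for site, patterns in SITE_PATTERNS.items():
        if any(fnmatchcase(hostname, pattern) for pattern in patterns):
            return site
    return None
```

Returning `None` for unknown hosts would let the caller fall back to the generic configuration instead of guessing.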

remi-kazeroni commented 1 year ago

From Carsten Ehbrecht who reviewed our IS-ENES3 deliverable D9.5:

* The configuration of the site specific details (DKRZ, Jasmin, …) was confusing. We edited the yaml file … got syntax errors due to wrong spaces … the comments and disabled configuration lines were not clearly recognised. Since # is used for both plain comments and "commented code", users who uncomment the code by replacing the # with a space will get an error. Users not experienced with YAML may face problems.

rootpath:
 rootpath:                          → won’t work

Also, what a non-bd0854 Levante user should uncomment remains unclear from the description. E.g.

# Site-specific entries: DKRZ-Levante
# For bd0854 members a shared download directory is available
#offline: false
#download_dir: /work/bd0854/DATA/ESMValTool2/download
# Uncomment the lines below to locate data on Levante at DKRZ.
#auxiliary_data_dir: /work/bd0854/DATA/ESMValTool2/AUX → k204xxx has no read permissions
 rootpath:
   CMIP6: /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ → k204xxx has read permissions

* Suggestion: use a cookiecutter template to generate a site/user specific ESMValTool configuration? https://www.cookiecutter.io/
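The template idea can be illustrated without cookiecutter itself. The sketch below uses Python's standard `string.Template` as a stand-in, so it is not cookiecutter's API; the site table and the helper `render_config` are hypothetical. The point is that users pick a site and never edit comment markers or indentation by hand.

```python
from string import Template

# The YAML layout is fixed here, so users cannot break the indentation.
CONFIG_TEMPLATE = Template("""\
rootpath:
  CMIP6: $cmip6_rootpath
drs:
  CMIP6: $cmip6_drs
""")

# Illustrative per-site values; the DKRZ path is the one quoted above.
SITES = {
    "DKRZ": {
        "cmip6_rootpath": "/work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ",
        "cmip6_drs": "DKRZ",
    },
}


def render_config(site: str) -> str:
    """Render the site-specific part of a config file for a known site."""
    try:
        values = SITES[site]
    except KeyError:
        known = ", ".join(sorted(SITES))
        raise ValueError(f"Unknown site {site!r}; known sites: {known}") from None
    return CONFIG_TEMPLATE.substitute(values)
```

A real cookiecutter template would additionally prompt for user-specific values such as the project account, which this sketch leaves out.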