aiidateam / aiida-code-registry

Registry of simulation codes and computers for easy setup in AiiDA.

Ideas for making it easier to distribute computer/code environments #60

Open ltalirz opened 2 years ago

ltalirz commented 2 years ago

Distributing a set of pre-configured computers and codes is a problem that basically every group that wants to start using AiiDA needs to solve. Currently, the person in charge needs to write hand-crafted Python scripts (e.g. like this one) to automate the computer/code setup, which is tedious and adds extra code that needs to be maintained.

One way to improve the user experience could be the following:

I'll open a draft pull request against the aiida-code-registry that outlines how this could work. It's not yet touching AiiDA core (this could be done later, if others agree that this would be a welcome feature).

Pinging @unkcpz for info

unkcpz commented 2 years ago

@ltalirz Thanks for the ideas!

I really like the idea of having a subfolder in AIIDA_PATH to store computer/code config files, and of using verdi config to turn on/off the automatic detection and on-the-fly setup for load_computer/code. For this, the current yaml file is enough, and to me it is clearer to have one config file per computer/code setup. Putting multiple config items in one or a few files is indeed easier to distribute, but when it comes to distributing and reusing them, the user still needs to dump/import configs to/from somewhere. To me, it seems a bit hacky that the user needs to go to AIIDA_PATH to get the config files.
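To make the "one file per computer/code" layout concrete, here is a minimal sketch of the lookup step only. It does not call AiiDA: the function name `find_config`, the `auto_setup` flag (standing in for the on/off option) and the `<label>.yaml` naming scheme are all assumptions for illustration, not existing AiiDA features.

```python
from pathlib import Path
from typing import Optional


def find_config(config_dir: Path, label: str, auto_setup: bool) -> Optional[Path]:
    """Return the config file for `label`, or None if there is nothing to use.

    Assumes one file per computer/code, named `<label>.yaml` or `<label>.yml`.
    `auto_setup` is a hypothetical stand-in for the on/off option.
    """
    if not auto_setup:
        return None  # feature switched off: never look in the folder
    for suffix in (".yaml", ".yml"):
        candidate = config_dir / f"{label}{suffix}"
        if candidate.is_file():
            return candidate
    return None
```

With this layout, distributing a setup amounts to copying files into the folder, and the label-to-file mapping needs no index.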

The current verdi computer setup --config is not fully mutually exclusive with the interactive prompts unless --non-interactive is used. If a template is used, do you think it would be hard to implement a dynamic CLI for it, and how should the prompts work for the template fields?

Moreover, I can imagine that the duplicate label issue will be a key barrier for this implementation, correct? If a code/computer in the database has the same label as one of the codes/computers in the subfolder, there should either be a priority for choosing one of them, or a conflict error should be raised when that config option is on.

ltalirz commented 2 years ago

> I really like the idea of having a subfolder in AIIDA_PATH to store computer/code config files, and of using verdi config to turn on/off the automatic detection and on-the-fly setup for load_computer/code.

Cheers!

> The current verdi computer setup --config is not fully mutually exclusive with the interactive prompts unless --non-interactive is used. If a template is used, do you think it would be hard to implement a dynamic CLI for it, and how should the prompts work for the template fields?

I haven't looked into the details but it should be possible.

We will also need a non-interactive way of providing the template variables, however (e.g. for usage in scripts as in https://github.com/aiidateam/aiida-code-registry/pull/61).
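As a rough illustration of what non-interactive filling could look like, the sketch below uses Python's standard `string.Template` as a stand-in for whatever template engine ends up being used. The template text, the field names and the helpers `missing_fields` and `render` are hypothetical. The list of missing fields is also what an interactive CLI could prompt for.

```python
from string import Template

# Illustrative template only; the keys and placeholders are assumptions.
TEMPLATE = """\
label: ${label}
hostname: ${hostname}
prepend_text: "#SBATCH --account=${account}"
"""


def missing_fields(template_text: str, variables: dict) -> list:
    """Return placeholder names in the template that `variables` lacks."""
    names = set()
    for match in Template.pattern.finditer(template_text):
        # A placeholder is either $name ("named") or ${name} ("braced").
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return sorted(names - set(variables))


def render(template_text: str, variables: dict) -> str:
    """Fill the template without prompting; fail if anything is missing."""
    missing = missing_fields(template_text, variables)
    if missing:
        raise KeyError("missing template variables: " + ", ".join(missing))
    return Template(template_text).substitute(variables)
```

In a script, the variables would come from a dict or a small file rather than from prompts, and a missing value fails loudly instead of blocking on input.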

> For this, the current yaml file is enough, and to me it is clearer to have one config file per computer/code setup. Putting multiple config items in one or a few files is indeed easier to distribute, but when it comes to distributing and reusing them, the user still needs to dump/import configs to/from somewhere. To me, it seems a bit hacky that the user needs to go to AIIDA_PATH to get the config files.

I guess that is a matter of taste... to me being able to drop in files there is very convenient.

> Moreover, I can imagine that the duplicate label issue will be a key barrier for this implementation, correct? If a code/computer in the database has the same label as one of the codes/computers in the subfolder, there should either be a priority for choosing one of them, or a conflict error should be raised when that config option is on.

Right - when requesting to load a code for which multiple nodes already exist, or when trying to set up a code with a label that already exists, this solution should just re-raise the Multiple... exception and not do anything else.

In my experience, it is typically not useful anyway to have multiple codes/computers with the same label.
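The conflict rule described above could be sketched roughly as follows. This is plain Python, not AiiDA code: `DuplicateLabelError` is a hypothetical stand-in for the exception mentioned above, and `plan_load` / `plan_setup` are made-up names that only return the action that would be taken.

```python
class DuplicateLabelError(Exception):
    """Hypothetical stand-in for the error raised on an ambiguous label."""


def plan_load(label: str, existing_labels: list, config_labels: set) -> str:
    """Decide what loading `label` should do.

    existing_labels: labels already in the database (may contain repeats).
    config_labels: labels for which a config file is available.
    """
    count = existing_labels.count(label)
    if count > 1:
        # Ambiguous: raise and do nothing, never fall back to the folder.
        raise DuplicateLabelError(f"{count} entries share the label {label!r}")
    if count == 1:
        return "load"  # the database entry wins, no setup needed
    if label in config_labels:
        return "setup"  # not in the database yet: set up from the config file
    raise LookupError(f"no entry or config file for {label!r}")


def plan_setup(label: str, existing_labels: list) -> str:
    """Refuse to set up a label that already exists in the database."""
    if label in existing_labels:
        raise DuplicateLabelError(f"label {label!r} already exists")
    return "setup"
```

Under this rule the config folder is only consulted when the database has no entry for the label, so it can never introduce a second entry with the same label.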