Open agoscinski opened 1 month ago
For ICON we need to adapt the behavior: we would rather change the namelist once and keep it constant over the calculation, so we can move it to the task definition. We could make a calcjob out of the calcfunction https://github.com/aiida-icon/aiida-icon/blob/a982d8792006bf234fe79c18aa76fd2af7a3463f/src/aiida_icon/iconutils/masternml.py#L43-L51 that adapts the namelist, so that we can offer the user a simpler way to make arbitrary changes. We would then also use this calcjob to adapt the namelist for the inputs we can infer from the workflow (the date, and the output of the last ICON run, which will not be passed in the AiiDA way; instead this calcjob is called to update the namelist with the new file).
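As a rough illustration of what "arbitrary changes" to a namelist could mean, here is a minimal sketch that treats a namelist as a nested dictionary (group name to key/value pairs) and merges user overrides into a copy. It does not reflect the linked aiida-icon calcfunction; the function name `apply_namelist_overrides` and the group and key names are purely illustrative assumptions.

```python
from copy import deepcopy


def apply_namelist_overrides(namelist: dict, overrides: dict) -> dict:
    """Return a copy of `namelist` with `overrides` merged in, group by group.

    Hypothetical helper: both arguments map group names to {key: value} dicts.
    """
    updated = deepcopy(namelist)
    for group, values in overrides.items():
        # Create the group if it is missing, then overwrite only the given keys.
        updated.setdefault(group, {}).update(values)
    return updated


# Illustrative names only; the real namelist content is not part of this sketch.
master = {"time_control": {"start_date": "2000-01-01", "stop_date": "2000-03-01"}}
patched = apply_namelist_overrides(master, {"time_control": {"start_date": "2000-02-01"}})
```

Returning a new object instead of mutating the input keeps the original namelist intact, which is closer to how a calcjob would produce a new node rather than change an existing one.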
The `port_name`: maybe rename to `input_key` or `input_slot`.
We discussed how to deal with computer and code. For the computer definition we stick with verdi, but for codes it might be useful to just pass the filepath, since we do not want the user to have to create a new code every time ICON is recompiled. How to create a label in this case is still an open question. We could hash the binary, but for ICON this can be 200 MB, which would need to be sent over the transport plugin. One proposal was to hash the filepath and use that as the label; it remained an open question whether we should generate a new UUID to preserve provenance in cases where the code is recompiled.
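The filepath-hashing proposal could look roughly like the sketch below. The function name `code_label_from_path`, the label format, and the choice of SHA-256 are assumptions made for illustration, not something that was decided in the discussion.

```python
import hashlib
from pathlib import PurePosixPath


def code_label_from_path(filepath: str, length: int = 8) -> str:
    """Derive a short, deterministic label from the path of an executable.

    Hypothetical helper: only the path string is hashed, never the binary,
    so nothing large has to be transferred.
    """
    path = PurePosixPath(filepath)
    digest = hashlib.sha256(str(path).encode("utf-8")).hexdigest()[:length]
    # Prefix with the file name so the label stays human-readable.
    return f"{path.name}-{digest}"


label = code_label_from_path("/scratch/user/icon/bin/icon")
```

Because only the path is hashed, recompiling the binary in place yields the same label, which is exactly why the provenance question above stays open.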
Regarding the point that this calcjob will also be used to adapt the namelist for the inputs we can infer from the workflow (the date, the output of the last ICON run): for input data, we have two choices. Either we adapt the namelist with the valid absolute path to the corresponding data, or we leave a constant relative path in the namelist and symlink the actual data to that relative path in the working directory of the job.
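The two options can be contrasted with a small sketch. It works on a local filesystem only and again represents the namelist as a nested dictionary; the helper names `with_absolute_path` and `link_into_workdir` are hypothetical, and a real implementation would have to go through the AiiDA transport for remote machines.

```python
from pathlib import Path


def with_absolute_path(namelist: dict, group: str, key: str, data: Path) -> dict:
    """Option 1: write the absolute path of `data` into a copy of the namelist."""
    updated = {name: dict(values) for name, values in namelist.items()}
    updated.setdefault(group, {})[key] = str(data.resolve())
    return updated


def link_into_workdir(workdir: Path, relative_path: str, data: Path) -> Path:
    """Option 2: keep the namelist constant and symlink `data` to the fixed
    relative path inside the working directory of the job."""
    link = workdir / relative_path
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a link left over from a previous cycle
    link.symlink_to(data.resolve())
    return link
```

With option 1 the namelist changes in every cycle; with option 2 the namelist stays identical and only the link target changes.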
Idea
AiiDA plugins define the inputs and outputs of their `CalcJob`s and `WorkChain`s with specific names. For example, the arithmetic add `CalcJob` has the inputs `x` and `y` as well as the output `sum`.
We therefore need to specify these ports (as AiiDA calls them) in the YAML file to create the workgraph. Each plugin defines an entry point which we can use to load the corresponding `CalcJob` or `WorkChain` using the factories. So with these two additional pieces of information in the YAML file (the entry point and the port names) we can run almost arbitrary calculations from AiiDA plugins (including aiida-icon).
The reason we did not need the port names for aiida-shell is that `ShellJob` dynamically creates its output ports from the outputs that are provided as inputs, so we took advantage of this and use the names specified in the YAML file as output port names. For the input ports we also simplify the actual ports, which would be `nodes` and `arguments` (see code). The gist is that we treat aiida-shell differently, and we should continue to do so, because otherwise it becomes cumbersome to use.
YAML syntax
Here is [an example to run arithmetic add](https://github.com/C2SM/ETHIOPIA/blob/plugins/tests/files/configs/test_config_small.yml). A snippet of it shows how it is used to define a workflow.
Because the same data object can be used for different ports, we need the port information in the cycles.
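To make the role of the port names concrete, the sketch below checks the port names given for one task against the ports a plugin declares. The table `DECLARED_PORTS` is written by hand for illustration (in practice the ports would come from the plugin itself), `core.arithmetic.add` is assumed to be the entry point name of the arithmetic add `CalcJob`, and the task layout is hypothetical rather than the actual YAML schema.

```python
# Hand-written stand-in for what a plugin declares; a simplified assumption.
DECLARED_PORTS = {
    "core.arithmetic.add": {"inputs": {"x", "y"}, "outputs": {"sum"}},
}


def check_ports(entry_point: str, inputs: dict, outputs: list) -> list:
    """Return a list of error messages for port names the plugin does not declare.

    `inputs` maps port name -> data name, so one data object may feed several ports.
    """
    declared = DECLARED_PORTS[entry_point]
    errors = [f"unknown input port '{p}'" for p in sorted(set(inputs) - declared["inputs"])]
    errors += [f"unknown output port '{p}'" for p in sorted(set(outputs) - declared["outputs"])]
    return errors


# The same data object ("data_a") is attached to two different ports.
ok = check_ports("core.arithmetic.add", {"x": "data_a", "y": "data_a"}, ["sum"])
bad = check_ports("core.arithmetic.add", {"x": "data_a", "z": "data_b"}, ["total"])
```

Keying the inputs by port name rather than by data name is what allows one data object to appear under several ports of the same task.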
Definition of computer and code
We follow the AiiDA logic more closely for defining computer and code information: the YAML file just specifies the label that was given when the computer or code was defined. This has the strong advantage that we do not have to write our own logic to parse all the computer information and can rely on the well-maintained `verdi` CLI from AiiDA, with which the user creates them beforehand. This change is in this PR because it was required for testing, but it should be separated out into a different PR.
Current state of the code
Currently the code in `workgraph.py` uses different functions to create plugins that are not `ShellJob`s, and I am not sure whether this is smart or not. It is a tradeoff between code duplication and flexibility, and requires a bit more thought and a decision on how we proceed.