aiidateam / team-compass

A repository for storing the AiiDA team roadmap
https://team-compass.readthedocs.io
MIT License

Usability: Reduce required setup to run codes through AiiDA to a minimum #5

Open sphuber opened 1 year ago

sphuber commented 1 year ago

Motivation

Currently, when one wants to run a code through AiiDA, quite a bit of setup is required. Even if a CalcJob plugin already exists, a Computer and a Code must first be created and configured before it can be used. If a plugin doesn't exist, the problem gets worse: one first has to develop a CalcJob and a Parser plugin, and even create a Python package to install them with entry points. These requirements are a huge barrier for early adopters, especially for those from fields where few or no plugins exist.

Desired Outcome

It should be as easy as possible for a new user to start running their code with AiiDA with minimal to no required setup.

Impact

If this is successful, it will significantly lower the barrier to adoption of AiiDA for new users, especially users from scientific domains that are not yet represented in the AiiDA community.

Complexity

A solution has already been implemented in the form of a plugin and so the complexity is minimal as it does not require any changes to aiida-core.

Progress

An AEP has been created and the proposed solution has already been implemented in the plugin package aiida-shell. Essentially, this package provides a CalcJob and Parser implementation that allows running any executable. Most importantly, it provides a simple utility function that makes launching such a CalcJob trivial, as it automatically creates the required Computer and Code on the fly.

The most basic example is running a bash command, e.g., date, which looks something like:

from aiida_shell import launch_shell_job
results, node = launch_shell_job(
    'date',
    arguments=['--iso-8601']
)
print(results['stdout'].get_content())

Note that the only setup required is an AiiDA installation with a configured profile. It is no longer necessary to configure a Computer or Code, as that is done automatically. By default the command is run on the localhost, but any Computer can be specified through the inputs, in which case the command is run on that remote computer just like any other calculation job.

As a concrete example of how this would lower the adoption barrier for new users, let's consider an example from biochemistry using the package pdb-tools, which allows manipulating protein structures. On the command line, a typical workflow would look like:

pdb_fetch 1brs | pdb_selchain -A,D | pdb_delhetatm | pdb_tidy > 1brs_AD_noHET.pdb

With aiida-shell this can be run through AiiDA as follows:

#!/usr/bin/env runaiida
"""Simple ``aiida-shell`` script to manipulate a protein defined by a .pdb file.

In this example, we show how the following shell pipeline:

    pdb_fetch 1brs | pdb_selchain -A,D | pdb_delhetatm | pdb_tidy > 1brs_AD_noHET.pdb

can be represented using ``aiida-shell`` by chaining a number of ``launch_shell_job`` calls.
All that is required for this to work is a configured AiiDA profile and that ``pdb-tools`` is installed.
"""
from aiida_shell import launch_shell_job

results, node = launch_shell_job(
    'pdb_fetch',
    arguments=['1brs'],
)

results, node = launch_shell_job(
    'pdb_selchain',
    arguments=['-A,D', '{pdb}'],
    nodes={'pdb': results['stdout']}
)

results, node = launch_shell_job(
    'pdb_delhetatm',
    arguments=['{pdb}'],
    nodes={'pdb': results['stdout']}
)

results, node = launch_shell_job(
    'pdb_tidy',
    arguments=['{pdb}'],
    nodes={'pdb': results['stdout']}
)

print(f'Final pdb: {node}')
print(f'Show the content using `verdi node repo cat {node.pk} pdb`')
print(f'Generate the provenance graph with `verdi node graph generate {node.pk}`')

Note that this script is complete and no additional setup is required other than a functional AiiDA installation and the pdb-tools package installed.
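
To make the chaining in the script above more concrete, here is a minimal sketch of the idea behind the `{pdb}` placeholders, written with the standard library only. It is not the aiida-shell implementation and stores no provenance. The helper `run_with_placeholders` is a hypothetical name introduced for illustration, and it assumes each input is written to a file named after its key in a scratch directory, with the placeholder replaced by that file name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_placeholders(command, arguments, nodes=None):
    """Hypothetical stand-in for the placeholder mechanism; not aiida-shell code.

    Each entry of ``nodes`` is written to a file named after its key in a
    scratch directory, and every ``{key}`` placeholder in ``arguments`` is
    replaced by that file name before the command is run there.
    """
    nodes = nodes or {}
    with tempfile.TemporaryDirectory() as workdir:
        for key, content in nodes.items():
            Path(workdir, key).write_text(content)
        # '{data}' becomes 'data'; arguments without placeholders are unchanged.
        names = {key: key for key in nodes}
        resolved = [argument.format(**names) for argument in arguments]
        completed = subprocess.run(
            [command, *resolved],
            cwd=workdir,
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout


# Two chained steps: the stdout of the first becomes the input file of the second.
fetched = run_with_placeholders(sys.executable, ['-c', 'print("atom 1")'])
upper = run_with_placeholders(
    sys.executable,
    ['-c', 'import sys; print(open(sys.argv[1]).read().upper(), end="")', '{data}'],
    nodes={'data': fetched},
)
print(upper, end='')
```

The Python interpreter is used as the command only so that the sketch runs anywhere. With aiida-shell, each step would in addition be recorded as a calculation job with its inputs and outputs stored in the provenance graph.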

Now imagine what a user would have to do without aiida-shell. Requiring users to create calculation and parser plugins for each command is simply untenable and unreasonable.

chrisjsewell commented 1 year ago

Thanks @sphuber

Usability: Reduce required setup to run codes (scripts, executables, shell commands, ...) through AiiDA

As mentioned in https://github.com/aiidateam/team-compass/issues/4#issuecomment-1424256451, I think maybe the title should be something like:

Usability: Allow users to run "plugin-less" codes, with minimal setup

then move the explanation of what you mean by a code (script, executable, shell command, ...) to the motivation section