interTwin-eu / itwinai

Advanced AI workflows for digital twins applications in science.
https://itwinai.readthedocs.io
MIT License

Containerized workflows on HPC #53

Open matbun opened 1 year ago

matbun commented 1 year ago

Workflow steps currently run in plain Python environments. To integrate with the infrastructure, they must be converted into containers, orchestrated by, e.g., Apache Airflow.

Goal: execute DT workflows on a Kubernetes cluster where each step is deployed as an independent container. Orchestration can be achieved either with a dedicated orchestrator (e.g., Airflow) or by executing a DT workflow step by step, much as run-workflow.py does now; the only difference is that each command runs in a container rather than in a Python virtual environment.
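As a rough illustration of the orchestrator option, the sketch below shows how the two existing steps (preprocess and ai) could be chained as independent containers in an Airflow DAG using the KubernetesPodOperator. This is not part of the repo: the image names, namespace, commands, and config path are hypothetical placeholders, and the import path / `schedule` argument depend on the Airflow and cncf-kubernetes provider versions.

```python
# Sketch: two DT workflow steps, each in its own container, orchestrated by Airflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="dt_workflow",
    start_date=datetime(2023, 1, 1),
    schedule=None,   # triggered manually
    catchup=False,
) as dag:
    # Step 1: data preprocessing, running in its own container.
    preprocess = KubernetesPodOperator(
        task_id="preprocess",
        name="preprocess",
        namespace="itwinai",                                       # hypothetical
        image="registry.example.org/itwinai/preprocess:latest",    # hypothetical
        cmds=["python"],
        arguments=["preprocess.py", "--config", "pipeline.yaml"],  # hypothetical
        get_logs=True,
    )

    # Step 2: AI training, in a separate container, started after preprocessing.
    ai = KubernetesPodOperator(
        task_id="ai",
        name="ai",
        namespace="itwinai",
        image="registry.example.org/itwinai/ai:latest",            # hypothetical
        cmds=["python"],
        arguments=["train.py", "--config", "pipeline.yaml"],       # hypothetical
        get_logs=True,
    )

    preprocess >> ai
```

The step-by-step alternative would simply replace each `python ...` invocation in run-workflow.py with the equivalent container run command.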

MrGweep commented 1 year ago

After discussion with @garciagenrique we agreed on the following steps:

1. Migrate to containers by creating a container for each step in the workflow (currently only two steps are implemented: preprocess and ai).
2. Run the containers via SLURM on HPC (a sketch of this step follows below).
3. Run the containers with CWL in toil, using the SLURM backend.
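For step 2, a minimal sketch of submitting a containerized step to SLURM is shown below, assuming Singularity/Apptainer is used to run the image on the HPC system (as is common there). The image file name, script names, and sbatch flags are illustrative assumptions, not taken from the repository.

```python
# Sketch: build an sbatch script that runs one workflow step in a container,
# then submit it to SLURM. Paths and image names are hypothetical.
import subprocess
import textwrap

SBATCH_SCRIPT = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=preprocess
    #SBATCH --nodes=1
    #SBATCH --time=01:00:00

    # Run the 'preprocess' step inside a Singularity/Apptainer image
    # (itwinai-preprocess.sif is a hypothetical image file).
    singularity exec itwinai-preprocess.sif \\
        python preprocess.py --config pipeline.yaml
    """)


def submit(script: str) -> str:
    """Write the batch script to disk and submit it with sbatch."""
    with open("job.sh", "w") as f:
        f.write(script)
    result = subprocess.run(
        ["sbatch", "job.sh"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(submit(SBATCH_SCRIPT))
```

Step 3 would then move this per-step submission logic into a CWL workflow executed by toil with its SLURM backend, so the step ordering is handled by the workflow engine instead of ad-hoc scripts.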

MrGweep commented 1 year ago

UPDATE: