vsoch closed this 1 year ago
The capability is needed by a use case from the AHA MoleS workflow that currently runs under Flux. The workflow is composed of MPI-based docking + PyTorch-based fusion. I expect we will see many more workflows that consist of multiple containers per pod in the future.
I don't have an example manifest for this workflow yet, but I plan to start porting it once the basic capability is in the Flux Operator. (I'm also planning a hackathon with the workflow developers.) Here's a basic example of running two containers in a pod with the MPI Operator (untested):
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  # Kubernetes object names may not contain "+"
  name: lammps-amg
spec:
  slotsPerWorker: 1
  runPolicy:
    cleanPodPolicy: Running
  sshAuthMountPath: /root/.ssh
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
          # Launcher container 1: runs LAMMPS through mpirun
          - image: milroy1/kf-testing:lammps-focal-openmpi-4.1.2-flux
            imagePullPolicy: Always
            # Container names must be unique within a pod
            name: mpi-launcher-lammps
            command:
            - bash
            - -cx
            - ". /etc/profile && mpirun --allow-run-as-root --mca orte_launch_agent /opt/view/bin/orted --mca plm_rsh_agent rsh -x PATH -x LD_LIBRARY_PATH -np 2 --map-by socket lmp -v x 4 -v y 2 -v z 2 -in in.reaxc.hns -nocite"
            resources:
              limits:
                cpu: 1
                memory: 2Gi
              requests:
                cpu: 1
                memory: 2Gi
          # Launcher container 2: runs AMG through mpirun
          - image: milroy1/kf-testing:amg-focal-openmpi-4.1.2-amd
            imagePullPolicy: Always
            name: mpi-launcher-amg
            command:
            - bash
            - -cx
            - ". /etc/profile && time -p mpirun --allow-run-as-root --mca orte_launch_agent /opt/view/bin/orted --mca plm_rsh_agent rsh -x PATH -x LD_LIBRARY_PATH -np 2 --map-by numa --rank-by core --bind-to core amg -n 4 4 4 -P 2 1 1"
            resources:
              limits:
                cpu: 1
                memory: 2Gi
              requests:
                cpu: 1
                memory: 2Gi
          tolerations:
          - key: "launcher"
            operator: "Exists"
            effect: "NoSchedule"
    Worker:
      replicas: 2
      template:
        metadata:
          labels:
            app: lammps-amg
        spec:
          containers:
          # Worker container 1: runs sshd so the launcher can reach it
          - image: milroy1/kf-testing:lammps-focal-openmpi-4.1.2-flux
            imagePullPolicy: Always
            name: worker-lammps
            lifecycle:
              postStart:
                exec:
                  # Block until sshd accepts connections on port 22
                  command:
                  - bash
                  - -c
                  - "while ! bash -c \"</dev/tcp/localhost/22\" >/dev/null 2>&1; do sleep 0.1; done"
            command:
            - /usr/sbin/sshd
            args:
            - -De
            resources:
              limits:
                cpu: 1
                memory: 2Gi
              requests:
                cpu: 1
                memory: 2Gi
          # Worker container 2: containers in a pod share a network namespace,
          # so this sshd competes with the one above for port 22
          - image: milroy1/kf-testing:amg-focal-openmpi-4.1.2-amd
            imagePullPolicy: Always
            name: worker-amg
            lifecycle:
              postStart:
                exec:
                  command:
                  - bash
                  - -c
                  - "while ! bash -c \"</dev/tcp/localhost/22\" >/dev/null 2>&1; do sleep 0.1; done"
            command:
            - /usr/sbin/sshd
            args:
            - -De
            resources:
              limits:
                cpu: 1
                memory: 2Gi
              requests:
                cpu: 1
                memory: 2Gi
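Since the manifest above is untested, one cheap pre-flight check for a multi-container pod is that every container name is unique, which Kubernetes requires within a pod. The following is a minimal sketch, assuming the pod spec has already been loaded into a plain Python dict; the helper name `duplicate_container_names` and the `launcher_spec` dict are hypothetical and only mirror the image and name fields of a two-container launcher.

```python
from collections import Counter


def duplicate_container_names(pod_spec):
    """Return the container names that occur more than once in a pod spec dict."""
    names = [c["name"] for c in pod_spec.get("containers", [])]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical pod spec with two containers that share one name.
launcher_spec = {
    "containers": [
        {"name": "mpi-launcher",
         "image": "milroy1/kf-testing:lammps-focal-openmpi-4.1.2-flux"},
        {"name": "mpi-launcher",
         "image": "milroy1/kf-testing:amg-focal-openmpi-4.1.2-amd"},
    ]
}

print(duplicate_container_names(launcher_spec))  # ['mpi-launcher']
```

An empty list means the names are unique; anything else would be rejected when the pod is created.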
Can you give me a high-level understanding of how the two containers should interact? It looks like amg is starting an ssh server, and I'm guessing something from the flux container is supposed to be able to interact with it. How do I test that?
Update: I see that there is a set of "Worker" containers and a set of "Launcher" containers, and they appear to use the same images. I'm fairly far along on adding this to the Flux Operator, but I'd like to know the relationship between the "Worker" and "Launcher" sets. If this is specific to the MPI Operator, I'm wondering whether the Flux Operator just needs one set of the containers in a pod, still with the postStart lifecycle hook?
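One rough way to test this is to probe the ssh port from the other container, which is the same check the postStart hook in the manifest performs with `/dev/tcp/localhost/22`. Below is a minimal sketch in Python; the function name `wait_for_port` and its parameters are hypothetical, and it only confirms that something accepts TCP connections on the port, not that ssh authentication or MPI launch actually works.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.1):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a server (e.g. sshd) is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (run inside a container in the pod):
#   wait_for_port("localhost", 22, timeout=5.0)
```

Because containers in one pod share a network namespace, `localhost` reaches an sshd started by a sibling container; probing from a different pod would need that pod's hostname or IP instead.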
This is technically done. We don't have good examples for the actual containers yet, but I'll work on this soon.
Double rainbow :rainbow: or double container? :thinking:
In today's meeting @milroy mentioned a different kind of job with MPI and PyTorch (I think), and we want to be able to support this case. Once we have a link to the files / containers / details I can work on this (and I'm looking forward to it!)