Closed bdevcich closed 10 months ago
The ports for all the rabbit containers should be the same for a workflow, so the compute nodes only need to know how to contact their local rabbit. One potential solution is to have site admins add a DNS entry on each compute node for rabbit.local that resolves to the connected rabbit. Then rabbit-local:PORT would be a simple way for a compute node application to communicate with the container running on the rabbit.
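To illustrate what this would look like from the compute node's side, here is a minimal sketch of a client that connects to the local rabbit by name. It assumes the hostname `rabbit-local` resolves on the compute node and that the workflow's container listens on TCP port 2000. Both values, and the `send_to_rabbit` helper itself, are hypothetical and only for illustration.

```python
import socket

# Assumptions for illustration only: the hostname the site admins configured
# and the port agreed on for the workflow.
RABBIT_HOST = "rabbit-local"
WORKFLOW_PORT = 2000


def send_to_rabbit(payload: bytes,
                   host: str = RABBIT_HOST,
                   port: int = WORKFLOW_PORT,
                   timeout: float = 5.0) -> bytes:
    """Send payload to the container on the local rabbit and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        # Signal that we are done sending so the server can reply and close.
        conn.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Because the application only refers to the hostname and port, the same code would run unchanged on every compute node, regardless of which rabbit it is attached to.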
Ports can be opened manually by adding a ports entry to the worker PodSpec in a container profile, using the same hard-coded value there and on the compute node:
containers:
  - name: example-mpi-webserver
    ports:
      - containerPort: 2000
        hostPort: 2000
Since these ports are opened on the host (i.e. the NNF node), a given port can be bound by only one container at a time. With the manual approach, this means that only one container workflow can be active per container profile.
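The underlying constraint can be shown with plain sockets. The sketch below is an analogy only, not Kubernetes itself: it binds a listening socket to a host port and then tries to bind a second socket to the same port, which the operating system refuses. The helper names `try_bind` and `demo_conflict` are hypothetical.

```python
import socket


def try_bind(port: int, host: str = "127.0.0.1"):
    """Return a listening socket bound to (host, port), or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        # Typically "address already in use": another socket holds the port.
        sock.close()
        return None


def demo_conflict():
    """Bind one port twice and report whether each attempt succeeded."""
    first = try_bind(0)  # port 0 lets the OS pick any free port
    port = first.getsockname()[1]
    second = try_bind(port)  # same port again while the first is still open
    result = (first is not None, second is not None)
    first.close()
    if second is not None:
        second.close()
    return result
```

A second workflow using the same hard-coded host port on the same NNF node would run into the same kind of conflict.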
Once the container is running on the NNF node, the compute nodes can reach the port at <NNF_NODE_IP>:<PORT>.
The IP of the NNF node can be obtained by running kubectl get nodes -o wide. For the full solution, it is recommended that each compute node have a local rabbit host entry in /etc/hosts to make this easy for applications running on the compute node.
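As a rough sketch of how such a hosts entry could be derived, the snippet below reads the node list in JSON form (the output of `kubectl get nodes -o json`, used here instead of `-o wide` because it is easier to parse) and formats an `/etc/hosts` line for a given NNF node. The node name, the `rabbit-local` hostname, and the helper names are assumptions for illustration.

```python
import json


def internal_ip(nodes_json: str, node_name: str) -> str:
    """Return the InternalIP of node_name from `kubectl get nodes -o json` output."""
    for node in json.loads(nodes_json)["items"]:
        if node["metadata"]["name"] != node_name:
            continue
        for addr in node["status"]["addresses"]:
            if addr["type"] == "InternalIP":
                return addr["address"]
    raise LookupError(f"no InternalIP found for node {node_name!r}")


def hosts_entry(ip: str, hostname: str = "rabbit-local") -> str:
    """Format one /etc/hosts line mapping hostname to ip."""
    return f"{ip}\t{hostname}"
```

A site admin could then append the result of `hosts_entry(internal_ip(output, "<node name>"))` to `/etc/hosts` on each compute node attached to that rabbit.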
From https://github.com/NearNodeFlash/NearNodeFlash.github.io/tree/containers-communication-update/docs/rfcs/0002#compute-to-rabbit-communication: