kubeflow / mpi-operator

Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
https://www.kubeflow.org/docs/components/training/mpi/
Apache License 2.0
430 stars, 216 forks

How can an MPIJob worker in mpi-operator get the hostname of the launcher? #641

Closed (Oneal65 closed 4 months ago)

Oneal65 commented 4 months ago

I want to add a plug-in to the worker. How can a worker in an MPIJob created by mpi-operator get the hostname of the launcher?

For example, the launcher has /etc/mpi/hostfile:

    pi-worker-0.pi.default.svc slots=1
    pi-worker-1.pi.default.svc slots=1
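For reference, a hostfile in this `slots=N` form can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_hostfile` and the `SAMPLE` constant are hypothetical, it handles just the format quoted in this comment, and it assumes one slot when a line gives none.

```python
from typing import List, Tuple


def parse_hostfile(text: str) -> List[Tuple[str, int]]:
    """Parse 'host slots=N' lines into (host, slots) pairs."""
    entries = []
    for line in text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        host, slots = fields[0], 1  # assumption: default to one slot
        for field in fields[1:]:
            if field.startswith("slots="):
                slots = int(field[len("slots="):])
        entries.append((host, slots))
    return entries


# The two entries quoted in this issue.
SAMPLE = """pi-worker-0.pi.default.svc slots=1
pi-worker-1.pi.default.svc slots=1
"""
```

Calling `parse_hostfile(SAMPLE)` yields the two worker hostnames with one slot each. Note that this file lists workers only, so it does not by itself reveal the launcher's hostname.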

alculquicondor commented 4 months ago

what kind of plugin are you referring to?

For now, only Intel and MPICH know the hostname of the launcher, because they need it. But in OpenMPI, workers don't generally need to know who the launcher is.

Oneal65 commented 4 months ago

> what kind of plugin are you referring to?
>
> For now, only Intel and MPICH know the hostname of the launcher, because they need it. But in OpenMPI, workers don't generally need to know who the launcher is.

Thanks for your reply. The plug-in in the worker sends messages to the launcher over gRPC or HTTP, not MPI. It serves other applications, not the MPI program.
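One way such a plug-in could work is sketched below, under the assumption that the launcher's address is passed to each worker explicitly. The environment variables `LAUNCHER_HOST` and `LAUNCHER_PORT`, the `/notify` path, and the default port 8080 are all hypothetical names chosen for illustration. They are not something mpi-operator is known to set, so they would have to be added to the worker pod template by hand.

```python
import json
import urllib.request
from typing import Mapping, Optional


def launcher_url(env: Mapping[str, str], path: str = "/notify") -> Optional[str]:
    """Build the launcher URL from hypothetical, user-provided variables."""
    host = env.get("LAUNCHER_HOST")  # assumption: set by the user, not the operator
    if not host:
        return None
    port = env.get("LAUNCHER_PORT", "8080")  # assumption: arbitrary default port
    return f"http://{host}:{port}{path}"


def notify_launcher(url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST a JSON payload to the launcher and return the HTTP status code."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In a worker, `launcher_url(os.environ)` would return `None` when the variables are absent, so the plug-in can skip the call instead of failing. The launcher would also need to run an HTTP server on the chosen port, which this sketch does not cover.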