Unable to execute mpiexec because name resolution fails for the worker hostnames. Kindly advise what is wrong and what needs to be checked.
kubectl -n mpi get po
NAME            READY   STATUS    RESTARTS   AGE
mpic-master     2/2     Running   0          14m
mpic-worker-0   1/1     Running   0          14m
mpic-worker-1   1/1     Running   0          14m
mpic-worker-2   1/1     Running   0          14m
kubectl -n $KUBE_NAMESPACE exec -it $MPI_CLUSTER_NAME-master /bin/bash
root@mpic-master:/# cat /kube-openmpi/generated/hostfile
mpic-master.mpic
mpic-worker-0
mpic-worker-1
mpic-worker-2
mpirun --allow-run-as-root --display-map -n 1 -npernode 1 --hostfile /kube-openmpi/generated/hostfile -- hostname
ssh: Could not resolve hostname mpic-worker-0: Temporary failure in name resolution
ORTE was unable to reliably start one or more daemons. This usually is caused by:
not finding the required libraries and/or binaries on one or more nodes. Please check your PATH and LD_LIBRARY_PATH settings, or configure OMPI with --enable-orterun-prefix-by-default
lack of authority to execute on one or more specified nodes. Please verify your allocation and authorities.
the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base). Please check with your sys admin to determine the correct location to use.
compilation of the orted with dynamic libraries when static are required (e.g., on Cray). Please check your configure cmd line and consider using one of the contrib/platform definitions for your system type.
an inability to create a connection back to mpirun due to a lack of common network interfaces and/or no route found between them. Please check network connectivity (including firewalls and network routing requirements).
ssh: Could not resolve hostname mpic-master.mpic: Temporary failure in name resolution
ssh: Could not resolve hostname mpic-worker-2: Temporary failure in name resolution
ssh: Could not resolve hostname mpic-worker-1: Temporary failure in name resolution
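
Every ssh error in the output is a name-resolution failure for an entry in the hostfile, so one thing that could be checked first is whether those names resolve at all from inside the master pod, independently of mpirun and ssh. The following is a minimal sketch, assuming `getent` is available in the container image. `check_hostfile` is a hypothetical helper name used only for illustration; it is not part of kube-openmpi or Open MPI.

```shell
#!/bin/sh
# Hypothetical helper: report which hostfile entries resolve from this container.
# Assumption: getent is installed in the image (it ships with glibc-based images).
check_hostfile() {
    hostfile="$1"
    failed=0
    # The first whitespace-separated field of each line is the hostname;
    # anything after it (for example slots=N) is ignored.
    # The "|| [ -n "$host" ]" part handles a last line without a trailing newline.
    while read -r host rest || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done < "$hostfile"
    # Exit status is 0 only if every entry resolved.
    return "$failed"
}

# Usage, inside the master pod:
#   check_hostfile /kube-openmpi/generated/hostfile
```

If every entry is reported as FAIL, the problem is likely in the pod's DNS setup (for example the contents of /etc/resolv.conf or the service the names are expected to come from) rather than in Open MPI itself. If only some entries fail, comparing the failing names with the ones that resolve may show what is missing.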