volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0
4.21k stars, 964 forks

plugin ssh and mpi for HPC calculation for engine on earthquake #1246

Closed vot4anto closed 3 years ago

vot4anto commented 3 years ago

/kind feature

Environment:

I want to use Volcano as the scheduler for our earthquake calculation engine. When the engine runs on VMs or bare-metal hosts, the cluster nodes communicate over ssh.

I see that there is an mpi plugin and also an ssh plugin, but unfortunately I can't find any docs on how to use these plugins in a deployment YAML. What I need is to understand how the plugin works for communication from master to worker. Look at the following example:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: lm-mpi-job
spec:
  minAvailable: 3
  schedulerName: volcano
  plugins:
    ssh: []
    svc: []
  tasks:
    - replicas: 1
      name: mpimaster
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  sleep 10;
                  cat /etc/volcano/mpiworker.host | tr "\n" ","
                  MPI_HOST=`cat /etc/volcano/mpiworker.host | tr "\n" ","`;
                  mkdir -p /var/run/sshd; /usr/sbin/sshd;
                  mpiexec --allow-run-as-root --host ${MPI_HOST} -np 2 mpi_hello_world ;
                  sleep 100;
              image: volcanosh/example-mpi:0.0.1
              name: mpimaster
              ports:
                - containerPort: 22
                  name: mpijob-port
              workingDir: /home
          restartPolicy: OnFailure
    - replicas: 2
      name: mpiworker
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  mkdir -p /var/run/sshd; /usr/sbin/sshd -D;
              image: volcanosh/example-mpi:0.0.1
              name: mpiworker
              ports:
                - containerPort: 22
                  name: mpijob-port
              workingDir: /home
          restartPolicy: OnFailure

In this example the user is root, but is it possible to use a different user with the ssh plugin, so that the master can ssh to the workers? Our container image does not run as root, but we still need an ssh connection from master to worker, as with Open MPI. Does the mpi plugin work in the same way? I found only a PR, but no documentation on volcano.sh or on GitHub.

Thanks

k82cn commented 3 years ago

In this example the user is root, but is it possible to use a different user with the ssh plugin, so that the master can ssh to the workers?

If it's not the root user, the related files (e.g. the public/private keys) will be created in a different directory; it's OK to copy them to the user's home.

vot4anto commented 3 years ago

In which folder are the related files created? Not in the .ssh folder in the user's home? I created a user test and set USER test in the Docker image, but user test cannot ssh to the worker, while root can.

shinytang6 commented 3 years ago

In which folder are the related files created? Not in the .ssh folder in the user's home? I created a user test and set USER test in the Docker image, but user test cannot ssh to the worker, while root can.

Maybe you can configure the ssh plugin like this?

plugins:
    ssh: ["--ssh-key-file-path=/home/user/.ssh"]
    svc: []

vot4anto commented 3 years ago

I will try it immediately, thanks for the hint. But is there documentation for the ssh plugin's options, so that I don't have to bother you by opening an issue?

shinytang6 commented 3 years ago

I will try it immediately, thanks for the hint. But is there documentation for the ssh plugin's options, so that I don't have to bother you by opening an issue?

As far as I know, there are few examples of job plugins at present, though I may have missed something.

I can help add some examples of job plugins later.

vot4anto commented 3 years ago

In which folder are the related files created? Not in the .ssh folder in the user's home? I created a user test and set USER test in the Docker image, but user test cannot ssh to the worker, while root can.

Maybe you can configure the ssh plugin like this?

plugins:
    ssh: ["--ssh-key-file-path=/home/user/.ssh"]
    svc: []

Your suggestion works like a charm for my case. But is there a way to get the IP of the hosts instead of the hostname? Or do I have to resolve the hostnames to IPs at container startup?

k82cn commented 3 years ago

But is there a way to get the IP of the hosts instead of the hostname? Or do I have to resolve the hostnames to IPs at container startup?

Hmm... it's hard to know the IP address before the pod starts, so the hostname is the better solution for now. Is there any case where the IP is required?

vot4anto commented 3 years ago

Because the master and workers of our HPC infrastructure use zmq to communicate with each other, and zmq has some issues with TCP connect that can easily be solved by using an IP instead of a hostname in the configuration files, for example: https://stackoverflow.com/questions/21169031/zmq-socket-connect-timeout

I can install host or dig in the container to resolve the hostnames; I have to see which is the better solution for the size of the container.
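One possible way to do that resolution at container start without installing extra packages is sketched below. It is only an illustration under several assumptions: that the image ships getent (usual on glibc-based distributions), that the svc plugin has written the worker names to /etc/volcano/mpiworker.host as in the example job above (the file name follows the task name), and that /tmp/workers.ip is a hypothetical output path chosen for this sketch.

```yaml
# Sketch of a task's container entry; not taken from the Volcano docs.
containers:
  - name: mpimaster
    image: volcanosh/example-mpi:0.0.1
    command:
      - /bin/sh
      - -c
      - |
        # Resolve every worker name listed by the svc plugin to an IP.
        # /tmp/workers.ip is a hypothetical output path.
        : > /tmp/workers.ip
        for h in $(cat /etc/volcano/mpiworker.host); do
          # Retry until the name resolves (the worker pod may not be up yet).
          until ip=$(getent hosts "$h" | awk 'NR==1 {print $1}') && [ -n "$ip" ]; do
            sleep 2
          done
          echo "$ip" >> /tmp/workers.ip
        done
        cat /tmp/workers.ip
```

The resulting file could then be used to fill in the zmq configuration before the engine starts. Pod IPs can change when a pod is restarted, so the lookup would have to be repeated in that case.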

k82cn commented 3 years ago

Because the master and workers of our HPC infrastructure use zmq to communicate with each other, and zmq has some issues with TCP connect that can easily be solved by using an IP instead of a hostname in the configuration files, for example: https://stackoverflow.com/questions/21169031/zmq-socket-connect-timeout

That's interesting!

@wpeng102 , @Thor-wl , please help to investigate this scenario :)

vot4anto commented 3 years ago

It would be fantastic to have help investigating the use of Volcano with our engine. Please contact me in any way you want. Do you attend the European weekly meeting?

wpeng102 commented 3 years ago

Volcano does the following:

  1. create the job in the apiserver (create the pod hosts file)
  2. create the pods in the apiserver
  3. schedule the pods to nodes

Then the kubelet starts the pod on the node (and an IP is assigned to the pod). It is hard to know the pod IP while Volcano is scheduling the pods. Maybe you can add an init container to the master and worker pods that does something like exchanging the pod IPs with each other.

Thor-wl commented 3 years ago

It would be fantastic to have help investigating the use of Volcano with our engine. Please contact me in any way you want. Do you attend the European weekly meeting?

Yes, the Volcano European weekly meeting will be started, and @k82cn or @william-wang will host the meeting.

Thor-wl commented 3 years ago

/assign @wpeng102 @Thor-wl

vot4anto commented 3 years ago

For volcano, it will do the following things:

  1. create job in apiserver(create pod hosts file)
  2. create pod in apiserver
  3. schedule pod to node

Then the kubelet starts the pod on the node (and an IP is assigned to the pod). It is hard to know the pod IP while Volcano is scheduling the pods. Maybe you can add an init container to the master and worker pods that does something like exchanging the pod IPs with each other.

Yes, I will do that. Is it also possible to use env variables, something like this, to set the IP on the master? env:
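For reference, one standard Kubernetes mechanism for this is the Downward API, which can expose a pod's own IP as an environment variable. The fragment below is a minimal sketch rather than anything confirmed in this thread; the variable name POD_IP is a hypothetical choice. It only gives each pod its own address, so the master would still need DNS or a shared file to learn the workers' IPs.

```yaml
# Fragment of a task's pod template; POD_IP is a hypothetical variable name.
template:
  spec:
    containers:
      - name: master
        image: openquake/engine:exp
        env:
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP   # resolved when the container is started
```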

vot4anto commented 3 years ago

It would be fantastic to have help investigating the use of Volcano with our engine. Please contact me in any way you want. Do you attend the European weekly meeting?

Yes, the Volcano European weekly meeting will be started, and @k82cn or @william-wang will host the meeting.

Great, I will attend with pleasure

vot4anto commented 3 years ago

I discovered that there is a misconfiguration on the network side of the pods that are created. In the /etc/hosts file there is an entry that differs from the hostname set for the pods. Here is an example:

/k8s$ kubectl get pods
NAME               READY   STATUS    RESTARTS   AGE
master             0/1     Error     0          15h
oqjob-oqmaster-0   1/1     Running   0          57s

/k8s$ kubectl exec -it oqjob-oqmaster-0 -- bash
openquake@oqjob-oqmaster-0:~$ hostname
oqjob-oqmaster-0

openquake@oqjob-oqmaster-0:~$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.244.1.72 oqjob-oqmaster-0.oqjob.default.svc.cluster.local    oqjob-oqmaster-0

openquake@oqjob-oqmaster-0:~$ host oqjob-oqmaster-0
Host oqjob-oqmaster-0 not found: 2(SERVFAIL)

openquake@oqjob-oqmaster-0:~$ cat /etc/volcano/oqmaster.host
oqjob-oqmaster-0.oqjob

openquake@oqjob-oqmaster-0:~$ host oqjob-oqmaster-0.oqjob
oqjob-oqmaster-0.oqjob.default.svc.cluster.local has address 10.244.1.72

openquake@oqjob-oqmaster-0:~$ host oqjob-oqmaster-0   
Host oqjob-oqmaster-0 not found: 2(SERVFAIL)

As you can see, the hostname of the pod is not unique and does not match the value in /etc/volcano/oqmaster.host, so DNS resolution does not work as expected. Finally, here is the YAML of the job:

metadata:
  name: oqjob
spec:
  minAvailable: 3
  schedulerName: volcano
  plugins:
    ssh: ["--ssh-key-file-path=/home/openquake/.ssh"]
    svc: []
    env: []
  tasks:
    - replicas: 1
      name: oqmaster
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  sudo mkdir -p /var/run/sshd; sudo /usr/sbin/sshd ;
              image: openquake/engine:exp
              imagePullPolicy: Always
              name: master
              #resources:
              #  limits:
              #    memory: "8Gi"
              #    cpu: "8"
              #  requests:
              #    memory: "4Gi"
              #    cpu: "4"
              ports:
              workingDir: /home/openquake
          restartPolicy: OnFailure
    - replicas: 2
      name: oqworker
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  sudo mkdir -p /var/run/sshd; sudo /usr/sbin/sshd -D;
              image: openquake/engine:exp
              imagePullPolicy: Always
              name: worker
              workingDir: /home/openquake
          restartPolicy: OnFailure

vot4anto commented 3 years ago

Sorry, do you have any news about the hostname issue?

Thor-wl commented 3 years ago

Sorry, do you have any news about the hostname issue?

Well, nothing special, it just follows the common rules.

vot4anto commented 3 years ago

And how can I tell Volcano to set the correct hostname on the pod? Can I pass extra args?

On Thu, 21 Jan 2021 at 05:01, WuLei notifications@github.com wrote:

Sorry, do you have any news about the hostname issue?

Well, nothing special, it just follows the common rules.


shinytang6 commented 3 years ago

And how can I tell Volcano to set the correct hostname on the pod? Can I pass extra args? On Thu, 21 Jan 2021 at 05:01, WuLei notifications@github.com wrote:

Volcano sets pod.Spec.Hostname to podName and pod.Spec.Subdomain to jobName by default, so the address of the pod should be its FQDN (the output of hostname -f).

Tip: You can also explicitly specify pod.Spec.Hostname and pod.Spec.Subdomain.
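As an illustration of where these two fields live: in Kubernetes, hostname and subdomain belong to the pod spec (template.spec), next to containers, not inside a container entry. The fragment below is a plain Kubernetes sketch, not Volcano-specific documentation. The values are placeholders, a fixed hostname only makes sense for a task with a single replica, and the subdomain normally has to match the name of a headless Service in the same namespace for the FQDN to resolve.

```yaml
# Fragment of a task's pod template; values are illustrative placeholders.
template:
  spec:
    hostname: master        # pod.Spec.Hostname (pod level, not container level)
    subdomain: oqjob        # pod.Spec.Subdomain; should match a headless Service name
    containers:
      - name: master
        image: openquake/engine:exp
# Resulting FQDN under standard Kubernetes DNS rules, assuming namespace
# "default" and cluster domain "cluster.local":
#   master.oqjob.default.svc.cluster.local
```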

vot4anto commented 3 years ago

I will try setting the names, so I can see whether the entries in the hosts file are then the right ones.

On Sat, 23 Jan 2021 at 11:04, shinytang6 notifications@github.com wrote:

And how can I tell Volcano to set the correct hostname on the pod? Can I pass extra args? On Thu, 21 Jan 2021 at 05:01, WuLei notifications@github.com wrote:

Volcano sets pod.Spec.Hostname to podName and pod.Spec.Subdomain to jobName by default, so the address of the pod should be its FQDN (the output of hostname -f).

You can also explicitly specify pod.Spec.Hostname and pod.Spec.Subdomain.


vot4anto commented 3 years ago

I did as you suggested, but the hosts file is still wrong, and the hostname and subdomain are also not as described. Here is the result:

oq@oqjob-master-0:/etc/volcano$ more workers.host 
oqjob-workers-0.oqjob
oqjob-workers-1.oqjob
oq@oqjob-master-0:/etc/volcano$ more master.host  
oqjob-master-0.oqjob
oq@oqjob-master-0:/etc/volcano$ host oqjob-master-0.oqjob
oqjob-master-0.oqjob.default.svc.cluster.local has address 10.244.1.90
oq@oqjob-master-0:/etc/volcano$ cat /etc/hosts
#Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.244.1.90     oqjob-master-0.oqjob.default.svc.cluster.local  oqjob-master-0

As you can see here, the short name is oqjob-master-0 and not oqjob-master-0.oqjob.

Here is the definition of the job:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: oqjob
spec:
  minAvailable: 3
  schedulerName: volcano
  plugins:
    ssh: ["--ssh-key-file-path=/home/openquake/.ssh"]
    svc: []
    env: []
  tasks:
    - replicas: 1
      name: master
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  sleep 3000;
              image: openquake/engine:exp
              imagePullPolicy: Always
              name: master
              hostname: master
              subdomain: cluster.local
              #resources:
              #  limits:
              #    memory: "8Gi"
              #    cpu: "8"
              #  requests:
              #    memory: "4Gi"
              #    cpu: "4"
              ports:
                - containerPort: 8800
                  name: oqjob-port
              workingDir: /home/openquake
          restartPolicy: OnFailure
    - replicas: 2
      name: workers
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  sudo mkdir -p /var/run/sshd; sudo /usr/sbin/sshd -D;
              image: openquake/engine:exp
              imagePullPolicy: Always
              name: worker
              subdomain: cluster.local
              #resources:
              #  limits:
              #    memory: "8Gi"
              #    cpu: "8"
              #  requests:
              #    memory: "4Gi"
              #    cpu: "4"
              ports:
                - containerPort: 8800
                  name: oqjob-port
              workingDir: /home/openquake
          restartPolicy: OnFailure

huone1 commented 3 years ago

@vot4anto a headless service is created when a job with the svc plugin is applied; inside a container, we can get all the pods' IPs and domain names by running nslookup on the service domain: "nslookup jobname.default.svc.cluster.local" image
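For readers unfamiliar with the term, a headless Service is an ordinary Kubernetes Service with clusterIP set to None, so DNS returns the pod IPs directly instead of a single virtual IP. The manifest below is only a hand-written sketch of what such a Service might look like for a job named oqjob. The selector label is an assumption about how Volcano labels job pods, and the object the svc plugin actually creates may differ.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: oqjob               # same as the job name, which is also the pod subdomain
  namespace: default
spec:
  clusterIP: None           # "headless": DNS answers with the pod IPs directly
  selector:
    volcano.sh/job-name: oqjob   # assumed label; check with kubectl get pods --show-labels
```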

vot4anto commented 3 years ago

Can you also check the /etc/hosts file? In my case the entries there don't reflect the DNS.

On Fri, 29 Jan 2021 at 03:48, huone1 notifications@github.com wrote:

@vot4anto https://github.com/vot4anto a headless service is created when a job with the svc plugin is applied; inside a container, we can get all the pods' IPs and domain names by running nslookup on the service domain: "nslookup .default.svc.cluster.local" [image: image] https://user-images.githubusercontent.com/71266853/106225100-67358100-621f-11eb-8e75-dbce6d3342c6.png


wpeng102 commented 3 years ago

In my case, using nslookup to resolve the hostname from /etc/hosts works fine. Hmm... maybe it is a k8s network module issue?

image

As I understand it, if you use nslookup to collect all the workers' IPs, could that work for zmq?

vot4anto commented 3 years ago

For testing I use a kind installation of Kubernetes, with kindest/node:v1.19.4. Should I try a different version?

On Fri, 29 Jan 2021 at 08:54, Peng Wang notifications@github.com wrote:

In my case, using nslookup to resolve the hostname from /etc/hosts works fine. Hmm... maybe it is a k8s network module issue?

[image: image] https://user-images.githubusercontent.com/10152842/106245870-3158c280-6248-11eb-9d80-b2a33589c1fd.png

As I understand it, if you use nslookup to collect all the workers' IPs, could that work for zmq?


vot4anto commented 3 years ago

Here are my values:

openquake@oqjob-master-0:~$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.244.1.92 oqjob-master-0.oqjob.default.svc.cluster.local oqjob-master-0

openquake@oqjob-master-0:~$ nslookup oqjob-master-0
Server: 10.96.0.10
Address: 10.96.0.10#53

** server can't find oqjob-master-0: SERVFAIL

openquake@oqjob-master-0:~$ nslookup oqjob.default.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: oqjob.default.svc.cluster.local
Address: 10.244.1.94
Name: oqjob.default.svc.cluster.local
Address: 10.244.1.92
Name: oqjob.default.svc.cluster.local
Address: 10.244.1.93

openquake@oqjob-master-0:~$ nslookup oqjob.default.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: oqjob.default.svc.cluster.local
Address: 10.244.1.93
Name: oqjob.default.svc.cluster.local
Address: 10.244.1.94
Name: oqjob.default.svc.cluster.local
Address: 10.244.1.92

On Fri, 29 Jan 2021 at 09:48, Antonio Ettorre vot4anto@gmail.com wrote:

For testing I use a kind installation of Kubernetes, with kindest/node:v1.19.4. Should I try a different version?

On Fri, 29 Jan 2021 at 08:54, Peng Wang notifications@github.com wrote:

In my case, using nslookup to resolve the hostname from /etc/hosts works fine. Hmm... maybe it is a k8s network module issue?

[image: image] https://user-images.githubusercontent.com/10152842/106245870-3158c280-6248-11eb-9d80-b2a33589c1fd.png

As I understand it, if you use nslookup to collect all the workers' IPs, could that work for zmq?


wpeng102 commented 3 years ago

We use kubeadm to install the k8s cluster.

https://kubernetes.io/docs/tasks/tools/

vot4anto commented 3 years ago

Which release of k8s do you use?

Thanks

On Fri, 29 Jan 2021 at 10:49, Peng Wang notifications@github.com wrote:

We use kubeadm to install the k8s cluster.


wpeng102 commented 3 years ago

Which release of k8s do you use?

kubelet --version
Kubernetes v1.18.2

stale[bot] commented 3 years ago

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] commented 3 years ago

Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗