Mirantis / virtlet

Kubernetes CRI implementation for running VM workloads

VM not accessible from other VM #230

Closed · miha-plesko closed this issue 7 years ago

miha-plesko commented 7 years ago

I'm trying to run two VMs on a DIND cluster: a Cirros VM and a MySQL VM.

I'm having a hard time connecting the two. Is this supposed to work?

Debugging

Cirros VM

Inside the Cirros VM I can see that DNS resolves correctly:

$ nslookup medo-db-service.default.svc.cluster.local
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      medo-db-service.default.svc.cluster.local
Address 1: 10.99.40.59 medo-db-service.default.svc.cluster.local

But then I'm unable to reach 10.99.40.59 (the MySQL VM) for some reason:

$ curl 10.99.40.59:8000/dashboard/
curl: (7) couldn't connect to host

$ curl medo-db-service.default.svc.cluster.local:8000/dashboard/
curl: (7) couldn't connect to host
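
I haven't run these yet, but a few more checks from inside the Cirros VM should show whether the problem is the service VIP or pod-to-pod traffic in general. These are suggested commands rather than captured output; ifconfig, route and curl all ship with Cirros, and 10.192.2.10 is the pod endpoint reported by kubectl describe service below:

$ ifconfig eth0                      # does the VM hold the expected pod IP?
$ route -n                           # is the default route in place?
$ curl 10.192.2.10:8000/dashboard/   # pod IP directly, bypassing the service VIP
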
Ubuntu Container

Everything works in Ubuntu Container:

$ nslookup medo-db-service.default.svc.cluster.local
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   medo-db-service.default.svc.cluster.local
Address: 10.99.40.59
$ curl 10.99.40.59:8000/dashboard/
<!DOCTYPE html>
<html lang="en">
<head>
...

$ curl medo-db-service.default.svc.cluster.local:8000/dashboard/
<!DOCTYPE html>
<html lang="en">
<head>
...
Host machine

I can access the MySQL VM from my host machine as well, as long as I use the pod IP directly.

$ kubectl describe service medo-db-service
Name:           medo-db-service
Namespace:      default
Labels:         <none>
Selector:       case=label-medo-db
Type:           ClusterIP
IP:         10.99.40.59
Port:           tcp-3306    3306/TCP
Endpoints:      10.192.2.10:3306
Port:           tcp-8000    8000/TCP
Endpoints:      10.192.2.10:8000
Session Affinity:   None
No events.

$ curl 10.192.2.10:8000/dashboard/
<!DOCTYPE html>
<html lang="en">
<head>
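
For completeness, the endpoint mapping can also be listed directly (a standard kubectl check, not captured output here). Note that a ClusterIP such as 10.99.40.59 is normally reachable only from inside the cluster, which is why I use the pod IP from the host:

$ kubectl get endpoints medo-db-service   # expect 10.192.2.10:3306,10.192.2.10:8000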

If you have a hint about what setting I'm missing, please let me know!

jellonek commented 7 years ago

Can you provide URLs to the VM images you are using for both roles?

miha-plesko commented 7 years ago

I'm using OSv unikernels* (built with the capstan tool), so the disk images themselves won't help you much, but anyway:

http://x.k00.fr/virtlet

I also put my two Kubernetes YAMLs there.

* I did try an Ubuntu 16.04 Server image instead of the unikernels, but couldn't get it running on Kubernetes. No error; I was just unable to see logs or ssh in.

miha-plesko commented 7 years ago

This problem is easily reproducible with two Cirros images: you cannot ssh from one to the other. But you can ssh from a Docker container to a Cirros VM, either directly or through the service.

In other words, if you deploy these two cirros VMs to the cluster:

cirros-vm.yaml

apiVersion: v1
kind: Pod
metadata:
  name: cirros-vm
  annotations:
    kubernetes.io/target-runtime: virtlet
    scheduler.alpha.kubernetes.io/affinity: >
      {
        "nodeAffinity": {
          "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
              {
                "matchExpressions": [
                  {
                    "key": "extraRuntime",
                    "operator": "In",
                    "values": ["virtlet"]
                  }
                ]
              }
            ]
          }
        }
      }
spec:
  containers:
    - name: cirros-vm
      image: virtlet/image-service.kube-system/cirros

cirros-vm2.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cirros2-deployment
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      name: cirros2-pod
      labels:
        case: label-cirros2
      annotations:
        kubernetes.io/target-runtime: virtlet
        scheduler.alpha.kubernetes.io/affinity: >
          {
            "nodeAffinity": {
              "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                  {
                    "matchExpressions": [
                      {
                        "key": "extraRuntime",
                        "operator": "In",
                        "values": ["virtlet"]
                      }
                    ]
                  }
                ]
              }
            }
          }
    spec:
      containers:
        - name: cirros2
          image: virtlet/image-service.kube-system/cirros
          ports:
          - name: ssh
            containerPort: 22
---
apiVersion: v1
kind: Service
metadata:
  name: cirros2-service
  namespace: default
spec:
  type: ClusterIP
  selector:
    case: label-cirros2
  ports:
  - name: tcp-22
    protocol: TCP
    port: 22
    targetPort: ssh
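
Aside: the scheduler.alpha.kubernetes.io/affinity annotation in both manifests is the legacy alpha syntax; from Kubernetes 1.6 onward the same node constraint can be written as a first-class field in the pod spec. This is a sketch for reference only, not what I deployed:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: extraRuntime
            operator: In
            values: ["virtlet"]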

Then you cannot ssh from one VM to the other:

$ kubectl exec -it virtlet-9l1fz -n kube-system -- /bin/bash -c "virsh list"
 Id    Name                           State
----------------------------------------------------
 1     aa65799e-59f4-4cd6-6943-672f1b02c5fa-cirros-vm running
 11    cefc0eda-f400-4c45-5599-7abdb748e122-miha-ubuntu-server running
 14    59e8d605-677d-4725-73a6-b8ff19a42f48-cirros2 running

$ kubectl exec -it virtlet-9l1fz -n kube-system -- /bin/bash -c "virsh console 1"
$ ssh cirros@10.192.2.9
ssh: Exited: Error connecting: No route to host
$ ssh cirros@cirros2-service.default.svc.cluster.local
ssh: Exited: Error connecting: Connection timed out
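
I haven't captured these, but since "No route to host" on a flat pod network usually means ARP for the peer never resolved, the natural next checks from the Cirros console would be (all busybox tools shipped with Cirros):

$ route -n              # is there a route covering 10.192.2.0/24?
$ ping -c 3 10.192.2.9  # does plain ICMP fail the same way?
$ arp -an               # after the failed ssh: is the entry for 10.192.2.9 incomplete?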

The very same commands work normally from within a container:

$ kubectl run ubuntu-cont --image=ubuntu:16.04 -i --tty
/# ssh cirros@10.192.2.9
The authenticity of host '10.192.2.9 (10.192.2.9)' can't be established.
RSA key fingerprint is SHA256:dNMNiGFj0tZBqoD1PotcN8kFlKbFNgxRL33+QJgQX8k.
Are you sure you want to continue connecting (yes/no)?

/# ssh cirros@cirros2-service.default.svc.cluster.local
The authenticity of host 'cirros2-service.default.svc.cluster.local (10.102.87.89)' can't be established.
RSA key fingerprint is SHA256:dNMNiGFj0tZBqoD1PotcN8kFlKbFNgxRL33+QJgQX8k.
Are you sure you want to continue connecting (yes/no)?

@jellonek can you please confirm that this is a bug in virtlet networking and not in my environment? Btw, I really like what you guys at Mirantis do 👍 and would very much like to see this problem solved. My little goal is to get a set of unikernels running on Kubernetes and communicating with each other. If you have any spare time, perhaps we could have a video call on this topic?

/cc @gberginc

ivan4th commented 7 years ago

It was an RNG problem :) The fix will be merged shortly.