fabiolb / fabio

Consul Load-Balancing made simple
https://fabiolb.net
MIT License

Fabio running using Nomad system scheduler breaks Docker. #192

Closed michaelmcguinness closed 7 years ago

michaelmcguinness commented 7 years ago

I realise how unlikely the title of this issue sounds, but if there is an obvious error in my setup I can't spot it. I want to run Fabio as a Nomad-managed service using the Nomad system scheduler (type = "system"). When I do, any subsequent pull from our private Docker registry fails with the error

failed to register layer: open /dev/mapper/docker-202:32-786433-35e363b33db58a87d6a55b19f3297715b9978052e70edec86f03b51af3e44455: no such file or directory

From that point on I am not able to recover Docker.

Some details about our setup: Ubuntu 14.04, kernel 3.13.0-53-generic, Docker 1.12.2, Nomad 0.5.0, Fabio 1.3.4.

I have 3 servers and 2 clients. I am trying to run Fabio using the exec driver and the system scheduler. I am running Nomad as the root user, which I believe is required for the exec driver.

I do not see the issue if I run Fabio using the service scheduler, if I run a Docker container using the system scheduler, if I run another job (a sleep binary) using the system scheduler, or if I run Fabio using the system scheduler but with the raw_exec driver (a sketch of that variant follows the Fabio job spec below).

Docker is using the LVM storage option but I see the same issue if I drop back to the devicemapper storage option.

Below is a repeatable test case. After that are copies of the job specs used in the test case.

  1. Switch to the nomad user

    ubuntu@ip-10-75-70-27:~$ sudo su - nomad

  2. Software versions

    $ uname -a
    Linux ip-10-75-70-27 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
    $ docker --version
    Docker version 1.12.2, build bb80604
    $ nomad version
    Nomad v0.5.0
  3. Nomad running as root with no running jobs

    $ ps -ef | grep nomad
    root     17416     1  0 12:57 ?        00:00:00 /usr/local/bin/nomad agent -config /etc/nomad.d/config.json -rejoin -node=nomad_client_poc1
    consul   17540     1  0 12:57 ?        00:00:00 /usr/local/bin/consul agent -config-file /etc/consul.d/config.json -rejoin -node nomad_client_poc1
    root     17617  2602  0 12:58 pts/1    00:00:00 sudo su - nomad
  4. Demonstrate Docker pull

    $ docker pull dockerregistry.adm.myprivatecloud.net/bti1003:latest
    latest: Pulling from bti1003
    5a132a7e7af1: Pull complete
    fd2731e4c50c: Pull complete
    28a2f68d1120: Pull complete
    a3ed95caeb02: Pull complete
    87f9029820c8: Pull complete
    7582f6d126ab: Pull complete
    Digest: sha256:6d7379af49cc17cc8a0055e06c4cb8374e5be73fe42ce2e8f1abca013c50a62a
    Status: Downloaded newer image for dockerregistry.adm.myprivatecloud.net/bti1003:latest
  5. Remove pulled image

    $ docker rmi dockerregistry.adm.myprivatecloud.net/bti1003:latest
    Untagged: dockerregistry.adm.myprivatecloud.net/bti1003:latest
    Untagged: dockerregistry.adm.myprivatecloud.net/bti1003@sha256:6d7379af49cc17cc8a0055e06c4cb8374e5be73fe42ce2e8f1abca013c50a62a
    Deleted: sha256:3fee2600d434e469b6d4ac0e468bd988ebc105536886d6624dc9566577fcafbe
    Deleted: sha256:e5fcd939dd4a2a9b9543dea61ca90d2def7c92cd983108916895a39f239799b8
    Deleted: sha256:57bd7c9432ae86d63f2342e442eebd0f4dfc340ca61c6a4c7d702b17a315865f
    Deleted: sha256:0aaccda2aadfc70ab2248437568fd17f4e8860cf612cc4b7e154b97222dccf91
    Deleted: sha256:9dcfe19e941956c63860afee1bec2e2318f6fbd336bc523094ed609a9c437a01
    Deleted: sha256:6ff1ee6fc8a0358aeb92f947fb7125cd9e3d68c05be45f5375cb59b98c850b4d
    Deleted: sha256:56abdd66ba312859b30b5629268c30d44a6bbef6e2f0ebe923655092855106e8
  6. Run 'sleep' test job

    $ ps -ef | grep nomad.*executor
    root     17897 17416  6 13:00 ?        00:00:02 /usr/local/bin/nomad executor /var/nomad/alloc/87a48081-5d8e-b1b1-1538-a1e79a3f4152/sleep-task/sleep-task-executor.out
    nomad    18299 17619  0 13:07 pts/1    00:00:00 grep nomad.*executor
  7. Pull Docker image

    $ docker pull dockerregistry.adm.myprivatecloud.net/bti1003:latest
    latest: Pulling from bti1003
    5a132a7e7af1: Pull complete
    fd2731e4c50c: Pull complete
    28a2f68d1120: Pull complete
    a3ed95caeb02: Pull complete
    87f9029820c8: Pull complete
    7582f6d126ab: Pull complete
    Digest: sha256:6d7379af49cc17cc8a0055e06c4cb8374e5be73fe42ce2e8f1abca013c50a62a
    Status: Downloaded newer image for dockerregistry.adm.myprivatecloud.net/bti1003:latest
  8. Remove pulled image

    $ docker rmi dockerregistry.adm.myprivatecloud.net/bti1003:latest
    Untagged: dockerregistry.adm.myprivatecloud.net/bti1003:latest
    Untagged: dockerregistry.adm.myprivatecloud.net/bti1003@sha256:6d7379af49cc17cc8a0055e06c4cb8374e5be73fe42ce2e8f1abca013c50a62a
    Deleted: sha256:3fee2600d434e469b6d4ac0e468bd988ebc105536886d6624dc9566577fcafbe
    Deleted: sha256:e5fcd939dd4a2a9b9543dea61ca90d2def7c92cd983108916895a39f239799b8
    Deleted: sha256:57bd7c9432ae86d63f2342e442eebd0f4dfc340ca61c6a4c7d702b17a315865f
    Deleted: sha256:0aaccda2aadfc70ab2248437568fd17f4e8860cf612cc4b7e154b97222dccf91
    Deleted: sha256:9dcfe19e941956c63860afee1bec2e2318f6fbd336bc523094ed609a9c437a01
    Deleted: sha256:6ff1ee6fc8a0358aeb92f947fb7125cd9e3d68c05be45f5375cb59b98c850b4d
    Deleted: sha256:56abdd66ba312859b30b5629268c30d44a6bbef6e2f0ebe923655092855106e8
  9. Stop 'sleep' job

    $ ps -ef | grep nomad.*executor
    nomad    18239 17619  0 13:06 pts/1    00:00:00 grep nomad.*executor
  10. Start Fabio job

    $ ps -ef | grep nomad.*executor
    root     18262 17416 33 13:07 ?        00:00:04 /usr/local/bin/nomad executor /var/nomad/alloc/5729f45b-185c-fa7b-7b05-866a774b8c73/fabio-task/fabio-task-executor.out
    nomad    18299 17619  0 13:07 pts/1    00:00:00 grep nomad.*executor
  11. Pull Docker image

    $ docker pull dockerregistry.adm.myprivatecloud.net/bti1003:latest
    latest: Pulling from bti1003
    5a132a7e7af1: Extracting [==================================================>] 65.69 MB/65.69 MB
    fd2731e4c50c: Download complete
    28a2f68d1120: Download complete
    a3ed95caeb02: Download complete
    87f9029820c8: Download complete
    7582f6d126ab: Download complete
    failed to register layer: open /dev/mapper/docker-202:32-786433-35e363b33db58a87d6a55b19f3297715b9978052e70edec86f03b51af3e44455: no such file or directory
  12. Fabio job dies (10 minutes later); from syslog:

    Nov 23 13:17:56 ip-10-75-70-27 nomad[17416]: driver.exec: error destroying executor: 1 error(s) occurred:#012#012* 1 error(s) occurred:#012#012* failed to unmount shared alloc dir "/var/nomad/alloc/5729f45b-185c-fa7b-7b05-866a774b8c73/fabio-task/alloc": invalid argument
    Nov 23 13:17:57 ip-10-75-70-27 nomad[17416]: client: failed to destroy context for alloc '5729f45b-185c-fa7b-7b05-866a774b8c73': 2 error(s) occurred:#012#012* 1 error(s) occurred:#012#012* failed to remove the secret dir "/var/nomad/alloc/5729f45b-185c-fa7b-7b05-866a774b8c73/fabio-task/secrets": unmount: invalid argument#012* remove /var/nomad/alloc/5729f45b-185c-fa7b-7b05-866a774b8c73/fabio-task: directory not empty

From the Docker log:

time="2016-11-23T13:07:59.287783575Z" level=error msg="Error trying v2 registry: failed to register layer: open /dev/mapper/docker-202:32-786433-35e363b33db58a87d6a55b19f3297715b9978052e70edec86f03b51af3e44455: no such file or directory"
time="2016-11-23T13:07:59.287830271Z" level=error msg="Attempting next endpoint for pull after error: failed to register layer: open /dev/mapper/docker-202:32-786433-35e363b33db58a87d6a55b19f3297715b9978052e70edec86f03b51af3e44455: no such file or directory"

Fabio Job Spec

job "fabio-job" {
  region = "eu"
  datacenters = ["vpc-poc"]
  type = "system"
  update {
    stagger = "5s"
    max_parallel = 1
  }

  group "fabio-group" {
    ephemeral_disk {
      size = "500"
    }

    task "fabio-task" {
      driver = "exec"
      config {
        command = "fabio-1.3.4-go1.7.3-linux_amd64"
      }

      artifact {
        source = "https://github.com/eBay/fabio/releases/download/v1.3.4/fabio-1.3.4-go1.7.3-linux_amd64"
      }

      logs {
        max_files = 2
        max_file_size = 5
      }
      resources {
        cpu = 500
        memory = 64
        network {
          mbits = 1

          port "http" {
            static = 9999
          }
          port "ui" {
            static = 9998
          }
        }
      }
    }
  }
}
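
For completeness, the raw_exec variant mentioned above (which does not show the problem) differs only in the driver line. A minimal sketch, assuming raw_exec has been enabled on the clients:

job "fabio-raw-job" {
  region = "eu"
  datacenters = ["vpc-poc"]
  type = "system"

  group "fabio-group" {
    task "fabio-task" {
      # the only change from the exec version above
      driver = "raw_exec"
      config {
        command = "fabio-1.3.4-go1.7.3-linux_amd64"
      }

      artifact {
        source = "https://github.com/eBay/fabio/releases/download/v1.3.4/fabio-1.3.4-go1.7.3-linux_amd64"
      }

      resources {
        cpu = 500
        memory = 64
        network {
          mbits = 1
          port "http" {
            static = 9999
          }
          port "ui" {
            static = 9998
          }
        }
      }
    }
  }
}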

Sleep Job Spec

job "sleep-job" {
  region = "eu"
  datacenters = ["vpc-poc"]
  type = "system"
  update {
    stagger = "5s"
    max_parallel = 1
  }

  group "sleep-group" {
    ephemeral_disk {
      size = "500"
    }

    task "sleep-task" {
      driver = "exec"
      config {
        command = "/bin/sleep"
        args = ["1000"]
      }

      logs {
        max_files = 2
        max_file_size = 5
      }
      resources {
        cpu = 500
        memory = 64
        network {
          mbits = 1

        }
      }
    }
  }
}
magiconair commented 7 years ago

Thanks for the detailed description. My suspicion is that this isn't Fabio-related, but you have clearly done testing around this. I'm currently on vacation and will start picking up Fabio issues again next week.

magiconair commented 7 years ago

@michaelmcguinness Did you make any progress with this? What do the fabio logs say?

michaelmcguinness commented 7 years ago

No. I'm running it with raw_exec to work around the issue. Would I be right in saying that you mean the logs exposed by

$ nomad logs -verbose -stderr <allocid>

That being the case there is nothing interesting there...

2016/12/17 15:52:58 [INFO] Version 1.3.4 starting
2016/12/17 15:52:58 [INFO] Go runtime is go1.7.3
2016/12/17 15:52:58 [INFO] Using routing strategy "rnd"
2016/12/17 15:52:58 [INFO] Using routing matching "prefix"
2016/12/17 15:52:58 [INFO] Setting GOGC=800
2016/12/17 15:52:58 [INFO] Setting GOMAXPROCS=1
2016/12/17 15:52:58 [INFO] Metrics disabled
2016/12/17 15:52:58 [INFO] consul: Connecting to "localhost:8500" in datacenter "vpc-poc"
2016/12/17 15:52:58 [INFO] Admin server listening on ":9998"
2016/12/17 15:52:58 [INFO] HTTP proxy listening on :9999
2016/12/17 15:52:58 [INFO] consul: Using dynamic routes
2016/12/17 15:52:58 [INFO] consul: Using tag prefix "urlprefix-"
2016/12/17 15:52:58 [INFO] consul: Watching KV path "/fabio/config"
2016/12/17 15:52:58 [INFO] consul: Health changed to #2443304
2016/12/17 15:52:58 [INFO] consul: Skipping service "_prometheus-node-exporter-http" since agent on node "dns_slave" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_prometheus-node-exporter-process" since agent on node "dns_slave" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-serf" since agent on node "nomad_server1" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-rpc" since agent on node "nomad_server1" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-serf" since agent on node "nomad_server2" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-rpc" since agent on node "nomad_server2" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-http" since agent on node "nomad_server2" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_nomad-server-nomad-serf" since agent on node "nomad_server3" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_prometheus-node-exporter-http" since agent on node "nomad_server3" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Skipping service "_prometheus-node-exporter-process" since agent on node "nomad_server3" is down: Agent not live or unreachable
2016/12/17 15:52:58 [INFO] consul: Manual config changed to #2409046
2016/12/17 15:52:58 [INFO] Updated config to
2016/12/17 15:52:58 [INFO] consul: Registered fabio with id "fabio-ip-10-75-70-74-9998"
2016/12/17 15:52:58 [INFO] consul: Registered fabio with address "10.75.70.74"
2016/12/17 15:52:58 [INFO] consul: Registered fabio with tags ""
2016/12/17 15:52:58 [INFO] consul: Registered fabio with health check to "http://[10.75.70.74]:9998/health"
2016/12/17 15:52:58 [INFO] consul: Health changed to #2443305

and then just repeated messages about Consul.

magiconair commented 7 years ago

If there is nothing in the fabio logs, then that supports my suspicion that this is a docker and/or nomad issue. Maybe the way fabio interacts with docker, or the way it shuts down, triggers this. However, since you're running this with a single listener, you could try to simulate it with a simple Go program that runs a plain web server, then a reverse proxy, and then a reverse proxy that makes long-polling HTTP requests.

Below is a simple reverse proxy for testing. Store it in ~/gopath/src/fabiotest/main.go and then build with go build in that directory. Make sure you have set export GOPATH=~/gopath. When running you can test both endpoints with curl localhost:9998 and curl localhost:9999. Except for the long-polling outgoing connection to consul this is in essence the core of fabio. :)

You can use the following Dockerfile:

FROM scratch
ADD fabiotest /fabiotest
EXPOSE 9998 9999
CMD ["/fabiotest"]

See how far you get with this.

package main

import (
    "flag"
    "fmt"
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    var proxyAddr, uiAddr, proxyURL string
    flag.StringVar(&proxyAddr, "proxy", ":9999", "host:port of the proxy")
    flag.StringVar(&uiAddr, "ui", ":9998", "host:port of the ui")
    flag.StringVar(&proxyURL, "proxyURL", "https://www.google.com/", "proxy url")
    flag.Parse()

    log.Println("fabiotest starting")

    go func() {
        u, err := url.Parse(proxyURL)
        if err != nil {
            log.Fatal("proxyURL:", err)
        }
        log.Println("proxy listening on", proxyAddr, "proxying", u)
        rp := httputil.NewSingleHostReverseProxy(u)
        if err := http.ListenAndServe(proxyAddr, rp); err != nil {
            log.Fatal("proxy:", err)
        }
    }()

    go func() {
        log.Println("UI listening on", uiAddr)
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "UI is OK")
        })
        if err := http.ListenAndServe(uiAddr, nil); err != nil {
            log.Fatal("ui:", err)
        }
    }()

    log.Println("Press CTRL-C to stop")
    select {}
}
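
If it helps, here is a minimal sketch of a Nomad job for running this test binary under the exec driver, mirroring the fabio job spec above. The artifact source is a placeholder for wherever you host the built binary:

job "fabiotest-job" {
  region = "eu"
  datacenters = ["vpc-poc"]
  type = "system"

  group "fabiotest-group" {
    task "fabiotest-task" {
      driver = "exec"
      config {
        command = "fabiotest"
      }

      artifact {
        # placeholder URL: point this at the location of the built binary
        source = "https://example.internal/builds/fabiotest"
      }

      resources {
        cpu = 500
        memory = 64
        network {
          mbits = 1
          port "http" {
            static = 9999
          }
          port "ui" {
            static = 9998
          }
        }
      }
    }
  }
}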
michaelmcguinness commented 7 years ago

Thanks, I'll give this a go and let you know the result. I also upgraded to Nomad 0.5.1 just in case, but no change.

michaelmcguinness commented 7 years ago

The issue remains with this test code, so as you suspected it is a more generic problem. I also tested it with the Consul binary and got the same result. Thank you for your time on this. I'll head over to the Nomad issues page with it.

As a side note, what should I have been expecting from 'curl localhost:9999'? I am getting a 404. I also noticed that you posted a Dockerfile; since the issue is specific to the exec driver, I was not sure what you wanted me to do with it.

michaelmcguinness commented 7 years ago

You may (or may not :)) be interested to know that this issue has something to do with the kernel version and the LVM storage driver implementation. I haven't quite figured it out, but switching to AUFS makes the issue go away.

Thanks again for your attention and a great utility.

magiconair commented 7 years ago

I am interested and I'm glad you've figured it out. If you have a reference issue for nomad feel free to link it.

Thanks and Merry Christmas. Enjoy the holidays.

michaelmcguinness commented 7 years ago

I raised this with Kelsey Hightower, as it was his demo that made me look at it; not for a fix, just for some info about his environment. I'm not sure I am going to burn time raising it with Nomad, as we are way behind here on our kernel revision (one of the many things on my to-do list). It seems to me that LVM may not be the strategic storage option for Docker, so I think fixing forward by upgrading is the way to go.