docker-archive / classicswarm

Swarm Classic: a container clustering system. Not to be confused with Docker Swarm which is at https://github.com/docker/swarmkit
Apache License 2.0

Cannot mount volumes into containers launched on Mesos backed Swarm #2386

Closed jhart99 closed 6 years ago

jhart99 commented 8 years ago

If a container is launched with Swarm using the Mesos backend, host volumes (-v) do not mount.

The setup is as follows:

- CoreOS 1010.5.0 using Docker 1.10.3
- Mesos 0.28.2 with mesos and docker containerizers
- Zookeeper 3.5.1-alpha backed data store
- Swarm 1.2.3

Everything communicates fine and I am able to launch containers onto Mesos slaves.

$ docker -H swarm.sky.vogt.local:4000 run --cpu-shares 1 -m 1G ubuntu /bin/bash -c "echo Hello World"
Hello World

However, if I try to mount a host volume (which is present on every host in the cluster), I get this:

$ docker -H swarm.sky.vogt.local:4000 run --cpu-shares 1 -m 1G -v /mnt:/mnt ubuntu /bin/bash -c "ls /mnt"
mesos

While on the host I get this:

mesos
store
work

Looking at the container via docker inspect, it looks like something in the Mesos driver is overriding the mount:

        "Mounts": [
            {
                "Source": "/var/lib/mesos/slaves/c7434bf0-1b8c-43eb-b626-fcd5689a97c1-S0/frameworks/c7434bf0-1b8c-43eb-b626-fcd5689a97c1-0025/executors/a69a9cdffcc9/runs/f7d333b8-f0bb-4940-8199-d70db9407735",
                "Destination": "/mnt/mesos/sandbox",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
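
For anyone reproducing this, a minimal Go sketch of the same check: decode the output of docker inspect and look for a mount at the expected destination. The names `Mount`, `findMount`, and `sampleInspect` are hypothetical helpers, not part of Swarm or Docker; the sample JSON is the fragment above wrapped in the array-of-objects shape that docker inspect prints.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Mount mirrors the fields shown in the "Mounts" fragment above.
type Mount struct {
	Source      string
	Destination string
	Mode        string
	RW          bool
	Propagation string
}

// container keeps only the part of the inspect output this check needs.
type container struct {
	Mounts []Mount
}

// sampleInspect is the fragment from this report, wrapped so it is valid JSON.
const sampleInspect = `[{"Mounts": [{
  "Source": "/var/lib/mesos/slaves/c7434bf0-1b8c-43eb-b626-fcd5689a97c1-S0/frameworks/c7434bf0-1b8c-43eb-b626-fcd5689a97c1-0025/executors/a69a9cdffcc9/runs/f7d333b8-f0bb-4940-8199-d70db9407735",
  "Destination": "/mnt/mesos/sandbox",
  "Mode": "",
  "RW": true,
  "Propagation": "rprivate"
}]}]`

// findMount returns the first mount whose Destination equals dest,
// or nil if no container in the inspect output has such a mount.
func findMount(inspectJSON []byte, dest string) (*Mount, error) {
	var containers []container
	if err := json.Unmarshal(inspectJSON, &containers); err != nil {
		return nil, err
	}
	for _, c := range containers {
		for i := range c.Mounts {
			if c.Mounts[i].Destination == dest {
				return &c.Mounts[i], nil
			}
		}
	}
	return nil, nil
}

func main() {
	for _, dest := range []string{"/mnt", "/mnt/mesos/sandbox"} {
		m, err := findMount([]byte(sampleInspect), dest)
		if err != nil {
			fmt.Println("decode error:", err)
			return
		}
		fmt.Printf("%s mounted: %v\n", dest, m != nil)
	}
}
```

With the fragment from this report, the requested /mnt bind is absent and only the Mesos sandbox mount is present, which matches the ls output above.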

I apologize if this is a duplicate, but I couldn't find a reference to this behavior anywhere.

Spritekin commented 7 years ago

@jhart99 Hi Jonathan, did you find a solution for this problem? I have exactly the same problem right now, and the alternative of running as privileged is not a viable option (and Swarm won't allow it anyway).

Luck!

jhart99 commented 7 years ago

In the end, I gave up on trying to get this to work. I got volume mounts working several other ways, but the particular combination of Docker Swarm with a Mesos backend never worked for me. I think the issue is this: Mesos wants everything to run in a sandbox, and external mounts are not allowed. For this to work, I think Swarm would need to communicate a persistent volume resource (with a mount or path) to Mesos.

Here is how I got around this. Marathon or Chronos on Mesos can run Docker images with external mounts just fine. Eventually I just migrated to Kubernetes, and that is what we are using today.

Spritekin commented 7 years ago

@jhart99 I did a quick hack to Swarm yesterday. If interested, add this to cluster/mesos/task/task.go in the Build function:

    // Handle mounting volumes (requires the "strings" import).
    for _, volumebind := range t.config.HostConfig.Binds {
        // Values come as hostpath:containerpath[:mode],
        // for example /host/path:/container/path:rw
        bindinfo := strings.Split(volumebind, ":")
        if len(bindinfo) < 2 {
            // No container path given; skip rather than index out of range.
            continue
        }
        hostpath := bindinfo[0]
        containerpath := bindinfo[1]
        // Default to read-write unless the mode is exactly "ro".
        bindmode := mesosproto.Volume_RW.Enum()
        if len(bindinfo) > 2 && bindinfo[2] == "ro" {
            bindmode = mesosproto.Volume_RO.Enum()
        }

        t.Container.Volumes = append(t.Container.Volumes, &mesosproto.Volume{
            Mode:          bindmode,
            HostPath:      &hostpath,
            ContainerPath: &containerpath,
        })
    }

It allows mounting a folder. In my case the folder is an NFS mount, so any data I write is persisted. The folder is mounted on all the nodes in the cluster, so it doesn't matter where the job is started; it will always save correctly.
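
The bind-parsing step in the snippet above can be tried in isolation. This is a minimal standalone sketch, not the Swarm code: `Bind` and `parseBind` are hypothetical names, and it deliberately leaves out the mesosproto types so it runs without the Mesos bindings. It assumes the same hostpath:containerpath[:mode] format; treating the mode as a comma-separated list where any "ro" entry means read-only is an added assumption beyond the original hack, which only matches a mode of exactly "ro".

```go
package main

import (
	"fmt"
	"strings"
)

// Bind is a hypothetical holder for one parsed -v argument.
type Bind struct {
	HostPath      string
	ContainerPath string
	ReadOnly      bool
}

// parseBind splits "hostpath:containerpath[:mode]" and rejects input
// that does not have both a host path and a container path.
func parseBind(spec string) (Bind, error) {
	parts := strings.Split(spec, ":")
	if len(parts) < 2 || len(parts) > 3 {
		return Bind{}, fmt.Errorf("invalid bind %q: want hostpath:containerpath[:mode]", spec)
	}
	if parts[0] == "" || parts[1] == "" {
		return Bind{}, fmt.Errorf("invalid bind %q: empty path", spec)
	}
	b := Bind{HostPath: parts[0], ContainerPath: parts[1]}
	if len(parts) == 3 {
		// Assumption: mode may hold several comma-separated options;
		// only "ro" changes the result here, everything else is ignored.
		for _, opt := range strings.Split(parts[2], ",") {
			if opt == "ro" {
				b.ReadOnly = true
			}
		}
	}
	return b, nil
}

func main() {
	for _, spec := range []string{"/mnt:/mnt", "/host/path:/container/path:ro", "/mnt"} {
		b, err := parseBind(spec)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", b)
	}
}
```

Returning an error for a malformed spec, instead of indexing blindly, keeps a single bad -v argument from crashing the process that builds the task.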

I had plenty of problems using Swarm on Mesos, but I have hacked my way through most of them. My Swarm allows privileged containers, fractional CPUs, job persistence (i.e. not losing jobs if the Mesos leaders or the Swarm master crash) and now volume mounting. All my fixes are posted as issues or PRs, but they never got accepted for some reason.

The only big problem I have with Swarm on Mesos right now is that when it crashes under heavy load (more than 100 slaves and 1000 jobs), it can take a long time to restart: it flaps for an hour until it stabilises. But other than that it works OK.

nishanttotla commented 6 years ago

Closing due to no activity and https://github.com/docker/swarm/pull/2853