bcgsc / orca

:whale: Genomics Research Container Architecture
http://www.bcgsc.ca/services/orca
GNU General Public License v3.0

Migrate containers #33

Open tmozgach opened 6 years ago

tmozgach commented 6 years ago

Problem: how do we migrate a running container from one worker to another? Could Flocker help? https://github.com/ClusterHQ/flocker

sjackman commented 6 years ago

Note that the container does not have to be running. We can stop the container when the user logs out. It's possible, for example, to use docker commit to save a stopped container to an image, transfer that image to a different host, and fire up a new container on the new host from the saved image. I'd like to know if there's a tool to automate that process. Does Flocker do that? See https://stackoverflow.com/questions/28734086/how-to-move-docker-containers-between-different-hosts
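For example, a minimal sketch of that process done by hand (container and host names here are hypothetical):

docker stop mycontainer                                # ensure the container is stopped
docker commit mycontainer mycontainer:snapshot         # save the stopped container as an image
docker save mycontainer:snapshot | ssh new_host docker load    # transfer the image to the new host
ssh -t new_host docker run -it --name mycontainer mycontainer:snapshot   # fire up a new container there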

sjackman commented 6 years ago

Ah, you can also use

docker export CONTAINER | ssh new_host docker import -

https://stackoverflow.com/a/28751780/423153

tmozgach commented 6 years ago

@sjackman What exactly is the issue here? Why do we need to migrate a container? When a server crashes? Do we need that for HackSeq?

sjackman commented 6 years ago

That's one possible use case, but not the one that I'm thinking of. The use case is load balancing: when a user sshes to ORCA, we'd like to be able to move their container to whichever machine is currently the least busy. Currently the container has to run on whichever machine they were using previously.

tmozgach commented 6 years ago

@sjackman Found an interesting feature of Docker: https://medium.com/@tigranbs/container-is-live-ok-lets-move-it-1022abcb6250 We can try the above method if migrating a running container becomes an issue. =)

sjackman commented 6 years ago

That's quite cool! Keep in mind though that we don't need to migrate a running container, only a stopped one. That should be much simpler, and should be possible using docker export and docker import.

tmozgach commented 6 years ago

@sjackman,

`docker export` does not export the contents of volumes associated with the container.
In order to move a container with its associated data volume you can use `Flocker`: https://clusterhq.com/flocker/introduction/

So we would need to migrate the data manually too; given that, the option above may be easier.

sjackman commented 6 years ago

The data volumes will be mounted on a network file system, so don't need to be migrated between machines.

sjackman commented 6 years ago

Plan to migrate a stopped container: `docker export` serializes a stopped container. `docker import` imports a serialized container. As an experiment, can you try…

1. `docker run` to create a stopped container
2. `docker export` to serialize it to disk
3. `docker rm` to remove the stopped container
4. Copy the serialized Docker container to another system
5. `docker import` to import the serialized container
6. `docker start` to start the stopped container

It'd be great if there were a tool that automates that process.
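A rough sketch of that experiment (names are hypothetical). One caveat: docker import produces an image rather than a container, so the final step is a docker run from the imported image rather than a docker start:

docker run --name test ubuntu true            # create a container; it runs `true` and stops
docker export -o test.tar test                # serialize the stopped container to disk
docker rm test                                # remove the stopped container
scp test.tar other_host:                      # copy the serialized container to another system
ssh other_host docker import test.tar test:imported          # import it as an image
ssh other_host docker run --name test test:imported true     # start a new container from the image

Note that docker import discards the image metadata (CMD/ENTRYPOINT), so the command to run must be given explicitly.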

suujia commented 6 years ago

Option 1 -- save and load

1. Save the Docker image into an archive:
docker save image_name > image_name.tar

2. Copy it to another machine.

3. On the other Docker machine, load the image as follows:
cat image_name.tar | docker load

*** The save command saves the whole image with history and metadata, while the export command exports only the file structure (without history or metadata).
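If the two hosts can reach each other over ssh, the three steps above can be combined into a single pipeline (host name hypothetical):

docker save image_name | ssh other_host docker load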

Option 2 -- flocker (popular volume plugin)

*** Volumes follow the containers when they move between different hosts in the cluster.

Flocker can be integrated with Kubernetes: a Flocker volume is mounted into a pod and can be reattached to whichever node the pod is scheduled on. This also offers an option to share files between certain containers, and Kubernetes volumes have an explicit lifetime.

Option 3 -- export and import

1. You can export a container by running
docker export -o container.tar <container-name>

2. You can then copy the tar onto a different machine and import it by running
docker import container.tar <image-name>

*** The export does not include mounted volumes as part of the tarball.

Option 4 -- commit as image and run

1. Copy the container data directory (/var/lib/mysql, in this case) to the new path (let's say ./db/data).
2. Remove the previous container and its volumes.
3. Create a new container mounting the local data volume (-v $(pwd)/db/data:/var/lib/mysql).

http://blog.diovani.com/technology/2017/06/24/moving-docker-containers-data.html

You can commit the changes in your container to an image with docker commit, move the image onto a new host, and then start a new container with docker run. This will preserve any data that your application has created inside the container.
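A sketch of that workflow for a hypothetical MySQL container named db, assuming its data directory has already been copied to ./db/data on the new host:

docker commit db db:migrated                        # snapshot the stopped container as an image
docker save db:migrated | ssh new_host docker load  # move the image to the new host
ssh new_host 'docker run -d --name db -v "$PWD/db/data:/var/lib/mysql" db:migrated'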

Option 5 -- multi-host persistence (shared, distributed storage) & consistent namespace

Shared filesystems such as Ceph, GlusterFS, and Network File System (NFS) can be used to configure a distributed filesystem on each host running Docker containers. By creating a consistent naming convention and a unified namespace, all running containers have access to the underlying durable storage backend, irrespective of the host from which they are deployed.

We'll see which one is best / most feasible: https://thenewstack.io/methods-dealing-container-storage/
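With this option no container data needs to move at all. If, say, every host mounts the same NFS export at /nfs, then any host can start the user's container with the same bind mount (paths and image name hypothetical):

docker run -it -v /nfs/home/alice:/home/alice orca_image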

sjackman commented 6 years ago

If Option 5 multi-host persistence works over NFS, that may be the simplest solution. I'd like to know whether there's a tool to automate Option 3 export and import. At first glance, it seems like the purpose of Flocker is to migrate volumes, so it may not be relevant to us, since we're keeping data volumes on NFS. Our containers, on the other hand, are stored on a local file system. Thanks for this research, Susie!

suujia commented 6 years ago
1. Hmm, there doesn't seem to be a tool specifically for importing and exporting containers; perhaps we can write such a script ourselves.
2. If we take the multi-host persistence over NFS approach, we can revive a container with the stored user data on the least occupied server every time the user logs in. Perhaps we can even use `ssh server1 uptime; ssh server2 uptime` and operate using a bash script (see the sketch after this list). If the container is stopped, remove it and start a new one, binding the data. I found some tools that may be viable:
https://github.com/ContainX/docker-volume-netshare
"Mount NFS v3/4, AWS EFS or CIFS inside your docker containers. This is a docker plugin which enables these volume types to be directly mounted within a container."
- This may be required to share data over NFS (or use alternative volume plugins).
https://pkolano.github.io/projects/ballast.html
- A cool ssh load-balancing tool by NASA, for reference (very advanced though, intended for a high volume of users; it makes predictions and can even perform user-specific balancing).
https://github.com/rancher/convoy
"Before volume plugin, the only way to reuse the volume is using host bind mount feature of Docker, as docker run -v /host_path:/container_path, then maintain the content of the volume at /host_path. You can also use --volume-from but that would require original container still exists on the same host.

features: snapshot/backup/restore. So user would able to migrate the volumes between the hosts, share the same volume across the hosts, make scheduled snapshots of as well as recover to previous version of volume."

- it's a volume plugin that supports NFS. Similar to flocker (except flocker does not support NFS)
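Here's a sketch of the placement idea from point 2 above, choosing the less loaded of two hypothetical servers by 1-minute load average (read from /proc/loadavg rather than parsed out of uptime):

#!/bin/bash
# Print the 1-minute load average of a remote host.
load() { ssh "$1" "cut -d' ' -f1 /proc/loadavg"; }

l1=$(load server1)
l2=$(load server2)
# Compare the two floating-point load averages with awk.
target=$(awk -v a="$l1" -v b="$l2" 'BEGIN { print (a < b) ? "server1" : "server2" }')
echo "Starting the user's container on $target"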

sjackman commented 6 years ago

Here's a few interesting posts on this topic: https://stackoverflow.com/questions/28734086/how-to-move-docker-containers-between-different-hosts https://github.com/rancher/rancher/issues/438 https://circleci.com/blog/checkpoint-and-restore-docker-container-with-criu/

Note that checkpoint/restore and CRIU are needed to migrate running containers, but we're primarily interested in migrating stopped containers (at least for now). For stopped containers we can use export/import.

Could you try implementing the script for Option 3 -- export and import?
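For reference, a minimal sketch of such a script, assuming the container is already stopped, passwordless ssh between the hosts, and hypothetical names throughout:

#!/bin/bash
# migrate_container.sh CONTAINER DEST_HOST
# Move a stopped container from this host to DEST_HOST via export/import.
set -euo pipefail
container=$1
dest=$2

# Serialize the stopped container and import it on the destination as an image.
docker export "$container" | ssh "$dest" docker import - "$container:migrated"

# Remove the old container from this host.
docker rm "$container"

# docker import discards CMD/ENTRYPOINT metadata, so give a command explicitly.
ssh "$dest" docker run -dit --name "$container" "$container:migrated" bash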