bcgsc / orca

:whale: Genomics Research Container Architecture
http://www.bcgsc.ca/services/orca
GNU General Public License v3.0

Initial migrate container file #55

Closed suujia closed 6 years ago

suujia commented 6 years ago

https://github.com/bcgsc/orca/issues/33

Walkthrough (a sketch of this flow follows below):

  1. User logs in.
  2. If the user doesn't have an account: create a container on the less loaded server (or use a default start location).
  3. If the user has an account: check which server is less loaded and migrate their container there (call the orca-migrate-container script).
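
A minimal sketch of that flow, assuming two hosts (orca01/orca02), a container-count load metric, and simple interfaces for the helper scripts (all assumptions, not the actual orca-migrate-container logic):

#!/bin/bash
# Hypothetical sketch of the walkthrough above; host names, the load
# metric, and the helper-script interfaces are all assumptions.
USER_NAME=$1

# Crude load metric: number of running containers on a host.
load() { ssh "$1" docker ps -q | wc -l; }

if [ "$(load orca01)" -le "$(load orca02)" ]; then
  TARGET=orca01
else
  TARGET=orca02
fi

if ssh "$TARGET" docker ps -a --format '{{.Names}}' | grep -qx "$USER_NAME"; then
  :  # container already lives on the less loaded host; nothing to do
elif orca-container-host "$USER_NAME" >/dev/null 2>&1; then  # hypothetical lookup script
  orca-migrate-container "$USER_NAME" "$TARGET"              # existing account: migrate
else
  ssh "$TARGET" docker run -dit --name "$USER_NAME" bcgsc/orca  # new account: fresh container
fi
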
sjackman commented 6 years ago

Good start!

suujia commented 6 years ago

Thanks for your input @sjackman ! :)

suujia commented 6 years ago

Testing both of these commands out, it seems to take a lot longer to complete this:

$ docker export $CONTAINER | gzip > $NAME.tar.gz
$ zcat $NAME.tar.gz | docker import - $NAME

So the uncompressed version seems to be the better solution comparatively; however, it takes quite some time as well.

docker export --output="$NAME.tar" $CONTAINER
docker import $NAME.tar $NAME

Then, for either option, run docker run -ti $NAME to start up the container again on the other host. In summary, both options take some time to process. I will look into whether there are any alternatives.

Oh, also: export and import only work on running and paused containers, so I have included that change as well (a state check is sketched below).
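
A minimal guard implementing that note, assuming $CONTAINER and $NAME are set as above ({{.State.Status}} is a real docker inspect field; the rest is a sketch):

# Export only if the container is in a state the note above says export accepts.
STATE=$(docker inspect -f '{{.State.Status}}' "$CONTAINER")
case "$STATE" in
  running|paused) docker export --output="$NAME.tar" "$CONTAINER" ;;
  *) echo "container $CONTAINER is $STATE; skipping export" >&2 ;;
esac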

suujia commented 6 years ago

Apparently you can also move the /var/lib/docker directory over to the new host, because it contains everything: containers, images, current state, networks, and volumes (you need to stop all the containers first).

service docker stop
tar -C /var/lib -czf /tmp/dockerlib.tgz docker

# move dockerlib.tgz to the new server, then on that host:

tar -C /var/lib -xzf /tmp/dockerlib.tgz
service docker restart
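
The "move" step in the middle could be as simple as (destination host assumed):

scp /tmp/dockerlib.tgz orca02:/tmp/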

OR, with Kubernetes, you could use a Service and put labels on the hosts.

===

However, if we are planning to switch over to Singularity anytime soon, these methods are not entirely compatible.

The image grows and shrinks in real time as you install or delete files within the container. 
If you want to copy a container, you copy the image.

https://www.melbournebioinformatics.org.au/documentation/running_jobs/singularity/

1. Export the Docker container into a tar file. (You must have Singularity installed on your local computer: the 'create' and 'import' operations of Singularity require root.)
2. Create an empty Singularity image, and then import the exported Docker container into it.
3. Upload the Singularity image to the server (e.g. scp file.img user@orca). These steps are sketched below.
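
A sketch of those three steps using Singularity 2.x-era commands (file names and the image size are assumptions; create and import require root):

docker export "$CONTAINER" > container.tar                 # 1. export the container
sudo singularity create --size 24576 container.img         # 2. create an empty image (size in MiB)
cat container.tar | sudo singularity import container.img  #    import the tar into it
scp container.img user@orca:                               # 3. upload to the server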

This seems to be all that users are doing at Melbourne Bioinformatics:

$ ssh yourusername@snowy.melbournebioinformatics.org.au
$ module load singularity
$ singularity exec ubuntu.img /bin/bash
$ cat /etc/issue
Ubuntu 16.04.1 LTS \n \l

Also, since a Singularity image is just one file that each user keeps, I wonder if we can save that image (with any changes they have made) on the NFS (as a tar) and start it up on either host whenever they log in? I noticed we have a Singularity file in this repo already; if you wanted something that might work right away, Singularity supports importing a Docker image from Docker Hub. There's also an article on Singularity in scientific computing (long).

sjackman commented 6 years ago

Yes, we've started looking into using ORCA with Singularity. Systems would prefer Singularity, because Docker's engine must be run as root. You could also start looking into Singularity, and we ought to meet to discuss it. In the meantime, I'd like to continue exploring docker export/import.

sjackman commented 6 years ago

Regarding docker export --output="$NAME.tar" $CONTAINER vs docker export $CONTAINER | gzip > $NAME.tar.gz: since we plan on deleting the container tar file as soon as it's transferred, there's really no need to compress it (unless compressing it first made the transfer much faster). Note that gzip is single-threaded; pigz (parallel gzip) is much faster because it's multithreaded.
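
If compression were wanted anyway, a sketch of the pigz variant (same pipeline as above, just swapping the compressor):

docker export "$CONTAINER" | pigz > "$NAME.tar.gz"
pigz -dc "$NAME.tar.gz" | docker import - "$NAME"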

suujia commented 6 years ago

I manually tested just this portion and it completed the move successfully from orca01 to orca02. However, the total time it took was ~15 minutes.

$ docker inspect -f '{{.State.Paused}}' e07deee55709
false
$ docker inspect -f '{{.State.Dead}}' e07deee55709
false
$ docker stop e07deee55709
$ docker export e07deee55709 > schen_migrate.tar
$ docker rm e07deee55709

Then I hopped onto the other server:

$ docker import 1c026c1421c8
$ docker attach 1c026c1421c8

$ docker images
REPOSITORY      TAG     IMAGE ID       CREATED      SIZE
schen_migrate   latest  9b5ef631a435   2 days ago   21.45 GB
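
Note that docker import produces an image rather than a container, so a docker run step is needed before attaching; a hedged reconstruction of that sequence (container name assumed):

$ docker import schen_migrate.tar schen_migrate         # tar -> image
$ docker run -dit --name schen schen_migrate /bin/bash  # image -> container
$ docker attach schen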

sjackman commented 6 years ago

You're writing schen_migrate.tar to NFS, which avoids needing to use scp to transfer the container tar file. Of the 15 minutes total time, how long did docker export take and how long did docker import take? I'm curious which is the bottleneck. How big is the container tar file? If you use docker export and write instead to /dev/null or a local disk like /var/tmp/, how long does docker export take?

15 minutes is too long to do it just in time when the user sshes into the server, but it could be done overnight by a cron job to rebalance the servers.
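
For example, a hypothetical crontab entry (the --rebalance flag is an assumption about the script's eventual interface):

# rebalance containers across hosts at 03:00 nightly
0 3 * * * /usr/local/bin/orca-migrate-container --rebalance >> /var/log/orca-migrate.log 2>&1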

suujia commented 6 years ago

Oh, actually export took 4 minutes today and import took 8 minutes. I just kept the same schen.tar file and overwrote it. (We could also delete it every time; tar doesn't do efficient overwrites.)

When I export it to /var/tmp,

systemd-private-d7cf693a7dfa42b....

shows up rather than the actual tar file, and I don't have permission to import it. (I guess we can scp?) Anyway, it's not faster at all. And I was not able to access /dev/null.

The estimated file size of schen.tar is a really big number. The following command counts the bytes in the tar stream:

$ tar -cf - schen.tar | wc -c
21666539520
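
That count is in bytes; numfmt (GNU coreutils) renders it human-readable:

$ numfmt --to=iec 21666539520
21G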

I will look into making it a cron job.

sjackman commented 6 years ago

For the timing, you can try these three commands.

time docker export e07deee55709 >~/schen_migrate.tar
time docker export e07deee55709 >/var/tmp/schen_migrate.tar
time docker export e07deee55709 >/dev/null

To check file size, try

du -h ~/schen_migrate.tar

21.7 GB is larger than I expected for a container file. It must include the image as well. I wonder whether we can export just the container without also exporting the image. Since both machines already have a copy of the linuxbrew/linuxbrew image, it's not necessary to transfer it. Could you please look into that?

suujia commented 6 years ago

$ time docker export 66bafa0f65c5 >/var/tmp/schen_migrate.tar

real    5m33.956s
user    0m9.493s
sys 0m45.335s

$ time docker export 2570b8435d3c >/dev/null

real    3m18.498s
user    0m17.618s
sys 0m22.163s

$ time docker export 2570b8435d3c > schen.tar

real    3m22.749s
user    0m6.095s
sys 0m32.490s

$ du -h ~/schen.tar
21G /home/schen/schen.tar

sjackman commented 6 years ago

Cool. Thanks! So in these runs, writing to the local disk (/var/tmp) actually took about two minutes longer than writing to NFS (though note the first test exported a different container).

suujia commented 6 years ago

There is currently an open issue requesting the feature of exporting certain layers, and it seems to have been rejected in another issue:

Thanks for the contribution, however we are working on phasing out the concept of
layers from the UX and provide higher abstractions for our image definition format. This
feature you add is very dependent on the notion of layer, so we're -1 on its design, sorry.

So I guess if we were sticking to the export/import operation, we'd have to do it in a cron job. For reference, here are the layers of the bcgsc/orca image:

$ docker history bcgsc/orca
IMAGE               CREATED             CREATED BY                                      SIZE
397f5861fb5a        3 months ago        /bin/sh -c #(nop)  LABEL maintainer=Shaun Jac   0 B
<missing>           3 months ago        /bin/sh -c brew install quast quest quorum ra   1.524 GB
<missing>           3 months ago        /bin/sh -c brew install macse mafft makedepen   4.659 GB
<missing>           3 months ago        /bin/sh -c cpanm Bit::Vector DBD::SQLite DBI    54.35 MB
...
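
One possible workaround (untested here): docker diff lists the files a container has added (A), changed (C), or deleted (D) relative to its image, so only the user's changes could be archived. A sketch, which needs the container running:

docker diff e07deee55709 | awk '$1 != "D" {print $2}' > changed.txt
docker exec -i e07deee55709 tar -cf - -T - < changed.txt > schen_changes.tar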

===

I was also reading into Singularity, and it seems that most commonly people just add to their images on their local machines (where they have sudo access), then compress the image and move it onto a server to run things. Individuals do not have root access to change their images once they are on the server.

sjackman commented 6 years ago

We plan for the linuxbrew Singularity image to be read only. Each user will have a read-write overlay on top of that read-only image, that allows them to make changes to their personal container. See http://singularity.lbl.gov/docs-overlay and these related issues #43 and #44