Closed LijieTu closed 9 years ago
Please note I am not a docker expert, but some of the folks on the cluster may have other ideas besides what I say here.
We would recommend for item number 1 you take a look at data volumes and host path access perhaps:
https://docs.docker.com/userguide/dockervolumes/
For item number 2 I'm a bit confused about what you are asking. You can specify a path to the container and store the docker image out on GPFS. But please note that by default the docker runtime area (the -g argument) is set to /scratch/docker just to prevent accidents. You may wish to reset that flag for your particular runs.
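To make the data-volume idea concrete, here is a rough sketch of running the cuda-torch container with a GPFS-backed directory mounted at /data. The GPFS path is an assumption (substitute your own directory), so the script only echoes the command; drop the leading "echo" to run it for real:

```shell
#!/bin/sh
# Sketch only: GPFS_DATA is a hypothetical persistent data directory.
# Adjust the path for your site, then remove the "echo" to actually run.
GPFS_DATA="/gpfs/home/$USER/data"

echo docker run -it \
    -v "$GPFS_DATA:/data" \
    --device /dev/nvidiactl:/dev/nvidiactl \
    --device /dev/nvidia-uvm:/dev/nvidia-uvm \
    --device /dev/nvidia0:/dev/nvidia0 \
    kaixhin/cuda-torch
```

Anything written under /data inside the container then persists on GPFS, while the image itself stays in the node-local runtime area.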
Sorry for the confusion. Let me try describing the situation this time.
In question 2, I mean: if I request a GPU node, say gpu-2-6, for the first time, then I run the command in the link:
docker run -it --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm --device /dev/nvidia0:/dev/nvidia0 kaixhin/cuda-torch
It shows in the terminal:
Unable to find image 'kaixhin/cuda-torch' locally
Pulling repository kaixhin/cuda-torch
and begins to download. After the download, the prompt shows ~/torch#, which means the container is ready.
The next time I request the same gpu-2-6 and run docker with the same command, there is no need to download again and I can jump straight to the ~/torch# prompt.
However, if I request a different GPU node, say gpu-2-11, then I have to go through the download again before the container is ready. My question is: is there any way I could avoid downloading the repo when I use a different node than last time?
Hope I made it clear this time. Thanks!
I believe you need to save the image to GPFS and use the docker arguments to reference that image location instead of the repo. There are some examples of this in some past docker Git requests, or I can Google around, but I'm not really available today due to some other matters. Let's see if the other docker users chime in, or I'll take a further look next week.
Sure, I will see if I can get it done somehow. Thanks.
I do see that on a number of nodes the kaixhin/cuda-torch image already exists in the default location.
Are you considering it a "bad thing" to just load it when needed on all nodes and then attach a GPFS-based data volume for your persistent data?
There is plenty of room in the node /scratch areas for docker images... aka, why are you trying to avoid downloading the image? Time?
I was thinking that if I want to get something else in the future, maybe I would not have to download it every time, just to save time.
Yeah, I guess in theory the docker images repo could be shared across all the nodes and live on GPFS, but I've never tried that and we'd have to communicate with the other docker users before doing it.
I don't know enough about docker repos to tell you what's going to happen if I set the repo to some shared GPFS dir, however.
So for now I would just batch up a "docker pull kaixhin/cuda-torch" on the needed nodes, and perhaps another method will become evident.
It seems to only take a few minutes. It's certainly not a disk space issue.
If you want, I can just execute that for you on all nodes.
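A minimal sketch of batching that pull across nodes. The node names are placeholders from this thread, and the loop only echoes the ssh commands so nothing runs by accident; remove the "echo" to actually execute them:

```shell
#!/bin/sh
# Placeholder node list -- replace with the cluster's actual GPU nodes.
NODES="gpu-2-6 gpu-2-11"
IMAGE="kaixhin/cuda-torch"

for node in $NODES; do
    # Dry run: prints the command. Drop "echo" to pull for real over ssh.
    echo ssh "$node" "docker pull $IMAGE"
done
```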
Looks like you could also use the import/export features to skip the download, but I've only glanced at a writeup on it. YMMV:
http://tuhrig.de/difference-between-save-and-export-in-docker/
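For what it's worth, the distinction that article draws, sketched as dry-run commands (the tarball names and container name below are hypothetical, and the commands are echoed rather than executed):

```shell
#!/bin/sh
# "save" works on an IMAGE and keeps its layers and history;
# "export" works on a CONTAINER and flattens its filesystem.
# Both lines are echoed only; remove "echo" to run them.
echo docker save -o torch-image.tar kaixhin/cuda-torch
echo docker export -o torch-rootfs.tar my_running_container  # hypothetical container name
```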
I just performed the save/load method from that second URL example and it seemed to go OK. It's a big file, but perhaps quicker than the repo download. Feel free to time it ;)
Many thanks!! I guess I need more time to get myself familiar with the file management system of docker.
Here's what I did. It seemed a bit faster, and if you batched it up as a job it would probably be even quicker. On a node that already has the image via a fetch:
docker save -o dockerimages/torch.tar kaixhin/cuda-torch
For all nodes where the image is not visible via docker images (script left to the reader):
docker load -i dockerimages/torch.tar
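A rough version of that left-to-the-reader script might look like the following. The node names are placeholders and the tarball path is the one from the save step above; the loop only prints the ssh commands, so drop the "echo" to run it for real:

```shell
#!/bin/sh
# Sketch: load the saved tarball on each node that lacks the image.
NODES="gpu-2-6 gpu-2-11"          # placeholder node list
IMAGE="kaixhin/cuda-torch"
TARBALL="dockerimages/torch.tar"   # produced by the docker save step

for node in $NODES; do
    # Skip nodes that already show the image in "docker images".
    echo ssh "$node" "docker images | grep -q $IMAGE || docker load -i $TARBALL"
done
```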
Also note I will look into shared filesystem docker repos but not today.
I am going to open a separate issue about docker image storage and generic docker use. I believe you have what you need for the current configuration. If not, please re-open.
Hello all,
I'm trying to run a Lua file in the docker container but got confused about how it functions.
Thank you!