Closed ijiraq closed 3 years ago
Building the VM and running `docker run` on it should work as a short-term hack, but not as a long-term solution. Here are some quick thoughts on what to check:
- the container image has to be on the VM image (not on a separate volume)
Could the VM pull the container from Docker at run time? It might be quick enough?
Hi all, I'm just playing around in Batch and noticed I have a year-old VM in there (not sure how it got there). @dbohlender does this belong to you? I'm just checking before I delete it.
Yes, go ahead and delete it. About once a year I try to use a VM to generate some non-LTE model atmospheres and spectrum synthesis. And then forget how to do it again...
Perfect, thank you!! (Ha, sounds like a job for a container...........)
Yes, exactly!!!
> - the container image has to be on the VM image (not on a separate volume)
>
> Could the VM pull the container from Docker at run time? It might be quick enough?
A script on the VM could pull the image at run time but:
- the container registry is not on the same network, and the batch system would pull hundreds of images simultaneously
- we would have to make sure there is only one image per VM while there may be several jobs on the VM.
@sfabbro thanks for your help. Could you elaborate on why there should only be one image per VM? Is that just to limit unnecessary downloads of the container image (say, from DockerHub)?
Also to make sure I'm not just going down a rabbit hole, @ijiraq @dbohlender I'm following the instructions here for setting up Batch jobs. Seem like a good place to start?
Sorry, I was preparing for and then attending a management meeting. Those were the instructions that I followed, but only up to the batch processing part. I've not yet done any batch work.
> @sfabbro thanks for your help. Could you elaborate on why there should only be one image per VM? Is that just to limit unnecessary downloads of the container image (say, from DockerHub)?
It is not an absolute necessity, but it will save you space, given that you only have 20G and docker images are large, especially if you use anaconda. The root file system fills up quickly, even with the shared-layers niceness of docker. If you want to run several simultaneous jobs per VM, you will also have to make sure that one job pulling a docker image does not conflict with another job downloading the same docker image on the same VM. Anyway, if you really want to do docker run yourself on the VM, have the image pulled before snapshotting the VM to avoid unnecessary downloads.
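For what it's worth, here is a rough sketch of how a per-job wrapper could avoid two jobs on the same VM pulling the same image at once. It is only an illustration under assumptions: the function name `ensure_image`, the lock file path, and the image name in the usage note are hypothetical, and it relies on a plain file lock rather than anything the batch system provides.

```python
import fcntl
import subprocess


def ensure_image(image, lock_path="/tmp/docker-pull.lock", run=subprocess.run):
    """Pull `image` only if it is missing from the local Docker cache.

    An exclusive file lock serializes the check-and-pull across jobs that
    share the VM. Returns True if a pull was performed, False otherwise.
    `lock_path` is a hypothetical location; `run` is injectable for testing.
    """
    with open(lock_path, "w") as lock:
        # Blocks until any other job holding the lock has finished.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            # `docker image inspect` exits non-zero when the image is absent.
            inspect = run(
                ["docker", "image", "inspect", image],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if inspect.returncode == 0:
                return False  # already cached, e.g. pulled before the snapshot
            run(["docker", "pull", image], check=True)
            return True
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

A job script would call something like `ensure_image("example/processing:latest")` before its docker run step. If the image was pulled before snapshotting the VM, the call returns without touching the network.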
@sfabbro: good insights into docker/containers/images. I guess with 'docker run' only the first image that gets loaded onto a VM will actually be pulled from docker-hub; the others will come from the cache. To keep the pull from docker-hub minimal, a good root image will help, with just the minimal nifty bits coming in at launch time.
But there are things to think about to make sure this is not an abuse of the network.
> Also to make sure I'm not just going down a rabbit hole, @ijiraq @dbohlender I'm following the instructions here for setting up Batch jobs. Seem like a good place to start?
Please do follow them, and if you find wrong stuff in there, take notes so we can update the manual.
First run of the container in batch mode has happened; now time for tweaking. Added rough notes on my work so far in d5eebcfd33c715.
We need to run a few hundred jobs with this processing. To do this we can build a VM that has docker on it and then run these jobs as `docker run` jobs on that VM.
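As a rough illustration of what "a few hundred docker run jobs" might look like, here is a small sketch that builds one docker run command line per input. Everything specific in it is assumed for the example: the image name `example/processing:latest`, the `/scratch` and `/data` directories, and the input file naming are hypothetical, not the actual processing setup.

```python
import shlex


def docker_run_command(image, job_args, mounts=()):
    """Build the argv list for one `docker run` job.

    `mounts` is a sequence of (host_dir, container_dir) pairs passed as -v
    bind mounts; --rm removes the container once the job exits.
    """
    cmd = ["docker", "run", "--rm"]
    for host_dir, container_dir in mounts:
        cmd += ["-v", f"{host_dir}:{container_dir}"]
    cmd.append(image)
    cmd += list(job_args)
    return cmd


if __name__ == "__main__":
    # Hypothetical image, mount point, and input names, for illustration only.
    for i in range(3):
        cmd = docker_run_command(
            "example/processing:latest",
            [f"input_{i:03d}.fits"],
            mounts=[("/scratch", "/data")],
        )
        print(shlex.join(cmd))
```

Each printed line could then go into its own batch job script, so the batch system schedules the jobs and the VM only needs docker and the image.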