Closed victorsndvg closed 6 years ago
For Docker, it's the SINGULARITY_CACHEDIR environment variable. For others, it's a temporary directory (tempfile.mkdtemp()).
@victorsndvg do you mean where the temporary stuff is pulled to while we are building (not the cache with the final images on the host)? We could create an environment variable for that (which defaults to tempfile.mkdtemp if not set). Would that address your need? E.g., you would export SREGISTRY_TMPDIR or something like that.
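Roughly, the fallback could look like the sketch below. This is only an illustration, not the sregistry implementation: the helper name resolve_tmpdir is hypothetical, and SREGISTRY_TMPDIR is just the variable name proposed above.

```python
import os
import tempfile


def resolve_tmpdir(env_var="SREGISTRY_TMPDIR"):
    """Return the directory to use for in-progress downloads.

    Hypothetical helper: honors the environment variable when it is set,
    otherwise falls back to a fresh temporary directory.
    """
    configured = os.environ.get(env_var)
    if configured:
        # The user picked a location: make sure it exists, then use it.
        os.makedirs(configured, exist_ok=True)
        return configured
    # Nothing set: same default as today.
    return tempfile.mkdtemp()
```

With this shape, exporting the variable to a per-user directory on a large filesystem would keep big downloads out of a shared /tmp.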
Hi @vsoch, yes, this is exactly what I mean.
As we are working with big images and the /tmp directory is usually shared, we can run into conflicts where an image download fails with a "no space left" error. In fact, we hit this error yesterday. I cannot give a diagnosis, but I can confirm that the destination has enough space to store the image.
I think being able to enable/disable the TMPDIR and/or to change it to a local per-user directory is more flexible and can help to avoid this issue.
I think if we disable the TMPDIR we can follow the singularity strategy. I mean, download the file to the destination directory but with a random suffix while downloading. After a successful download, rename the file.
If the TMPDIR is enabled, use it as an intermediate path to download the image. If the download is successful, then move the image to the right path. By default it can be tempfile.mkdtemp()
What do you think?
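To make the two modes concrete, here is a minimal sketch of what I have in mind. It is not taken from sregistry or singularity; the function name save_with_partial_file and its tmpdir parameter are made up for illustration.

```python
import os
import shutil
import tempfile


def save_with_partial_file(chunks, final_path, tmpdir=None):
    """Write chunks to a partial file, then move it to final_path on success.

    tmpdir=None stands for "TMPDIR disabled": the partial file is created
    next to final_path with a random suffix and renamed in place.
    Otherwise the partial file lives in tmpdir until the download completes.
    """
    final_dir = os.path.dirname(os.path.abspath(final_path))
    workdir = tmpdir or final_dir
    prefix = os.path.basename(final_path) + "."
    fd, partial = tempfile.mkstemp(prefix=prefix, dir=workdir)
    try:
        with os.fdopen(fd, "wb") as filey:
            for chunk in chunks:
                filey.write(chunk)
        # Same filesystem: a plain rename. Different filesystem: copy, then delete.
        shutil.move(partial, final_path)
    except BaseException:
        # Failed or interrupted download: do not leave the partial file behind.
        if os.path.exists(partial):
            os.remove(partial)
        raise
    return final_path
```

With the TMPDIR disabled, the only space needed is on the destination device, so a full /tmp would no longer matter.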
Definitely important! I’ll work on this after waking up today :)
Just a quick question before I implement this - are you sure the error isn't coming from the first download to the Singularity cache, whether it's shub images or docker? By default, the image layers (that would be used to build) go to SINGULARITY_CACHEDIR, which is at $HOME/.singularity. For our users, if they don't change this (and ultimately we do it for them when they load the Singularity module) they get the "no space left on device" error. On the other hand, it could very well be the temporary directory if you are just pulling containers from the registry, and don't have an issue putting them in the registry's final spot (e.g., $SINGULARITY_CACHEDIR/shub).
Hi @vsoch,
I did some tests monitoring the files and the storage limit in order to reproduce the error.
First test: with /tmp/ limited to only 1GB of free space, start a 4GB image download with sregistry. A file named $IMAGE_NAME.random_suffix is created at /tmp. While downloading, this file grows until the free storage reaches 0%. Then it fails with the following error:
Traceback (most recent call last):
File "/usr/local/bin/sregistry", line 11, in <module>
load_entry_point('sregistry', 'console_scripts', 'sregistry')()
File "/usr/local/lib/sregistry-cli/sregistry/client/__init__.py", line 379, in main
subparser=subparsers[args.command])
File "/usr/local/lib/sregistry-cli/sregistry/client/pull.py", line 51, in main
save=do_save)
File "/usr/local/lib/sregistry-cli/sregistry/main/registry/pull.py", line 104, in pull
show_progress=not self.quiet)
File "/usr/local/lib/sregistry-cli/sregistry/main/base/http.py", line 194, in download
show_progress=show_progress)
File "/usr/local/lib/sregistry-cli/sregistry/main/base/http.py", line 252, in stream
show_progress=show_progress)
File "/usr/local/lib/sregistry-cli/sregistry/main/base/http.py", line 290, in stream_response
filey.write(chunk)
OSError: [Errno 28] No space left on device
Second test: enough free space in /tmp and also on the device that SREGISTRY_STORAGE points to, but SREGISTRY_STORAGE has less free quota than 2xSize_of_the_image. Start a 4GB image download with sregistry. A file named $IMAGE_NAME.random_suffix is created at /tmp. The download is successful, but two copies of $IMAGE_NAME are created, one at $PWD and the other at $SREGISTRY_STORAGE!!! You can see it here:
Every 2.0s: du -sh mso4sc-feelpp-mor-v106.simg;du -... Tue Oct 16 12:35:35 2018
4.2G mso4sc-feelpp-mor-v106.simg
2.4G .singularity/shub/mso4sc-feelpp-mor-v106.simg
As I do not have enough quota to store both copies, I get the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/shutil.py", line 557, in move
os.rename(src, real_dst)
OSError: [Errno 18] Invalid cross-device link: 'mso4sc-feelpp-mor-v106.simg' -> '/home/cesga/vsande/.singularity/shub/mso4sc-feelpp-mor-v106.simg'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/sregistry", line 11, in
Taking the previous post as starting point, I conclude:
SREGISTRY_TMPDIR
$PWD
What do you think?
Ok great! This confirms the issue is with tmp. I’ll have this for you later today.
Great! thanks! :clap:
And what about the duplicated copies? Is this something we can avoid?
Trying to stress sregistry-cli, I did an extra test :laughing: haha. If I'm located at $SREGISTRY_STORAGE and try to pull an image, it fails because it's trying to write twice to the same path:
$ singularity run -B /mnt shub://sregistry.srv.cesga.es/mso4sc/sregistry:latest pull mso4sc/globus:latest
Progress |===================================| 100.0%
[client|registry] [database|sqlite:////home/cesga/vsande/.singularity/sregistry.db]
Progress |===================================| 100.0%
Traceback (most recent call last):
File "/usr/local/lib/python3.7/shutil.py", line 557, in move
os.rename(src, real_dst)
OSError: [Errno 18] Invalid cross-device link: 'mso4sc-globus-latest.simg' -> '/home/cesga/vsande/.singularity/shub/mso4sc-globus-latest.simg'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/sregistry", line 11, in <module>
load_entry_point('sregistry', 'console_scripts', 'sregistry')()
File "/usr/local/lib/sregistry-cli/sregistry/client/__init__.py", line 379, in main
subparser=subparsers[args.command])
File "/usr/local/lib/sregistry-cli/sregistry/client/pull.py", line 51, in main
save=do_save)
File "/usr/local/lib/sregistry-cli/sregistry/main/registry/pull.py", line 114, in pull
url = manifest['image'])
File "/usr/local/lib/sregistry-cli/sregistry/database/sqlite.py", line 415, in add
shutil.move(image_path, image_name)
File "/usr/local/lib/python3.7/shutil.py", line 571, in move
copy_function(src, real_dst)
File "/usr/local/lib/python3.7/shutil.py", line 257, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/usr/local/lib/python3.7/shutil.py", line 104, in copyfile
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: 'mso4sc-globus-latest.simg' and '/home/cesga/vsande/.singularity/shub/mso4sc-globus-latest.simg' are the same file
we should avoid this I think
agreed.
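One way to guard against this, as a rough sketch rather than the actual fix: check whether source and destination already name the same file before moving. The helper name move_unless_same is hypothetical.

```python
import os
import shutil


def move_unless_same(src, dst):
    """Move src to dst, doing nothing if both already name the same file."""
    if os.path.exists(dst) and os.path.samefile(src, dst):
        # Pulling from inside the storage folder: the image is already in place.
        return dst
    return shutil.move(src, dst)
```

os.path.samefile compares the underlying device and inode, so it should also catch the case above where the relative and absolute paths point at the same image.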
@victorsndvg if you want to disable caching (meaning you just pull to where you are) you can export SREGISTRY_DISABLE_CACHE, or to disable the entire database use, just do SREGISTRY_DISABLE. The latter will also not require the sqlalchemy dependency, and then sregistry will just work as a client to interact with things (no local database). I don't think I have a good table in the docs to describe these various settings, so I'll add this one to this bit of work.
it sounds that it's perfect for my needs!
hey @victorsndvg here you go! https://github.com/singularityhub/sregistry-cli/pull/158 There are a lot of changes here so please do your (usual) thorough testing. Specifically, we want to know if the environment variable is honored in all use cases (e.g., pull from different clients) and that everything still works A-OK. Thanks in advance!
Hi @vsoch,
As the title says, where are the images stored while downloading? Is there any tmp dir?
Is it possible to customize the tmp dir?
Is it possible to disable the storing in a tmp dir?
Thanks in advance!