hilbert / hilbert-cli

Backend management tools: CLI

Support large docker images #49

Closed porst17 closed 6 years ago

porst17 commented 6 years ago

Some data-intensive applications require large docker images. Depending on the underlying storage driver used by docker, there is a size limit for containers and images. Currently, this limit is 10GB for the devicemapper storage driver used by default on Fedora. On Ubuntu, docker uses the aufs or overlayfs(2) storage drivers, which do not have a size limit if the underlying filesystem is ext4.

Images larger than 10GB cannot be built, run, or even pulled with docker v17.12 and earlier. The download succeeds, but docker quits with an error message when it tries to unpack the filesystem into the layer tree (no space left on device).

It is possible to raise the filesystem's base size by editing the file /etc/docker/daemon.json:

{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.basesize=50G"
  ]
}

It is similar for other storage drivers.

However, this requires running systemctl stop docker && rm -rf /var/lib/docker && systemctl start docker, i.e. all known images and containers have to be recreated or pulled again.

There is also docker run --storage-opt size=50G ..., but this only affects the file system inside the container, not that of the base image (probably docker commit won't work, but I didn't test this).
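The edit to daemon.json can be scripted so that settings already present in the file survive. The following is only a minimal sketch: the helper name set_basesize is made up for illustration, and it only rewrites the file. It does not restart docker or clear /var/lib/docker, so the reset described above is still needed afterwards.

```python
import json
from pathlib import Path


def set_basesize(daemon_json: Path, size: str) -> dict:
    """Set dm.basesize in a daemon.json file, keeping unrelated settings.

    Hypothetical helper for illustration; not part of hilbert-cli.
    """
    # Start from the existing configuration, if the file exists and is non-empty.
    text = daemon_json.read_text() if daemon_json.exists() else ""
    config = json.loads(text) if text.strip() else {}

    config["storage-driver"] = "devicemapper"

    # Drop any previous dm.basesize entry but keep all other storage options.
    opts = [opt for opt in config.get("storage-opts", [])
            if not opt.startswith("dm.basesize=")]
    opts.append(f"dm.basesize={size}")
    config["storage-opts"] = opts

    daemon_json.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling set_basesize(Path("/etc/docker/daemon.json"), "50G") as root would produce the configuration shown above on a machine without other docker settings.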

Possible solutions or workarounds to this problem should be discussed here.

I am in favor of increasing the dm.basesize storage option globally. But maybe this method has drawbacks that I don't see.

vga101 commented 6 years ago

Drawback I see with increasing dm.basesize is that images will only work within a hilbert system. Other people (on Fedora) will not be able to use the image. So depends on how portable you want the images to be.

porst17 commented 6 years ago

You need the hilbert system anyway to run those images. Docker only has built-in abstraction for file systems and network. Support for audio, hardware accelerated video etc. is added by the hilbert system.

If there are no further drawbacks, I would propose to raise dm.basesize.

The alternative of using data images and --volumes-from seems to be rather slow and a waste of disk space due to the copying of the image contents to the shared volume.

The other alternative of creating a volume on the host that is populated with the data on first run reduces portability and breaks our idea of keeping everything inside docker containers. AFAIK, you would leave data behind if you remove the docker image from the host, because there is nothing like an uninstall script.

malex984 commented 6 years ago

Changing a basic docker engine setting (basesize) will have to be done on ALL Hilbert-related stations and servers AND wherever one wants to re-build/update images, correct?

porst17 commented 6 years ago

You need to adjust the basesize on all the stations you want to pull and run the large image on, and on the machine you are building the image on. I am not sure if it is necessary for the docker registry, because it is basically just storing compressed image layers. But I didn't test it.

I only see the three options from above:

  1. increase basesize
  2. use multiple smaller data images and --volumes-from
  3. create a special volume on the client machines and pull application data on first run

All of them have pros and cons, but nevertheless we need to choose one and we need to do it soon.

porst17 commented 6 years ago

Ok, it seems there is another option: Changing the storage driver to overlay(2). Since the underlying file system is ext4, there will be no size limitation. But this is a docker engine setting as well. Also, overlay2 is still considered a little experimental on Fedora (even though it is used as a default on Ubuntu since 14.04.4/16.04).

porst17 commented 6 years ago

I also noticed that the default storage driver on Fedora 24 is devicemapper with loop-lvm. According to docker info: "WARNING: Usage of loopback devices is strongly discouraged for production use." It is strongly recommended to use direct-lvm mode for performance reasons.
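To find out which stations are affected, the loop-lvm check can be automated. This is a rough sketch that assumes the JSON form of docker info exposes a Driver field and a DriverStatus list of [key, value] pairs, and that loop-lvm setups report a "Data loop file" entry; both are from memory and should be verified against the docker version in use. The function name uses_loop_lvm is made up for illustration.

```python
def uses_loop_lvm(info: dict) -> bool:
    """Return True if a parsed `docker info` dict looks like devicemapper on loop-lvm.

    Assumption: `info` has a "Driver" string and a "DriverStatus" list of
    [key, value] pairs, and loop-lvm shows up as a "Data loop file" entry.
    """
    if info.get("Driver") != "devicemapper":
        return False
    status = dict(info.get("DriverStatus") or [])
    return "Data loop file" in status


# Possible usage on a docker host (not executed here):
#   import json, subprocess
#   raw = subprocess.check_output(["docker", "info", "--format", "{{json .}}"])
#   print(uses_loop_lvm(json.loads(raw)))
```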

vga101 commented 6 years ago

On loop-lvm: yes, may have performance issues, but recommending to set up direct-lvm is easily said: is quite a bit more troublesome (manual config) and not done with default docker install. I have that set up and running on the ES server (makes sense there since we want good performance, for running and building; you can check it there if you like, works nicely). But IMHO doesn't make sense for clients: no heavy load or I/O so that loopback is a bottleneck, more install complexity, and needs reinstall since LVM partition needed (well, actually only remote repartitioning).

Other storage driver (such as overlay/overlay2/aufs) may be an alternative, but I can't judge their behaviour on Fedora and they probably also require a full docker install reset just as basesize change does. But maybe it's worth a try if full reset required anyway?

porst17 commented 6 years ago

> Other storage driver (such as overlay/overlay2/aufs) may be an alternative, but I can't judge their behaviour on Fedora and they probably also require a full docker install reset just as basesize change does. But maybe it's worth a try if full reset required anyway?

overlay2 needs kernel support AFAIK. docker is using it on Fedora 26 by default. Not sure if support in Fedora 24 is mature enough.

malex984 commented 6 years ago

I would also suggest trying overlay2, both because it is a better storage driver and because of the older "Driver devicemapper failed to remove root filesystem" issue (https://github.com/hilbert/hilbert-cli/issues/45).

porst17 commented 6 years ago

How well is overlay2 supported on Fedora 24?

malex984 commented 6 years ago

It seems that somebody can use overlay2 on Fedora 24 (Workstation Edition) with linux kernel 4.9.0: https://github.com/moby/moby/issues/31038

PS: Fedora 25 Atomic: https://www.projectatomic.io/blog/2017/05/migrate-fedora-atomic-host-to-overlay2/
Fedora 26: https://fedoraproject.org/wiki/Changes/DockerOverlay2

porst17 commented 6 years ago

Default kernel on Fedora 24 is 4.5 AFAIK. This would need further testing.

vga101 commented 6 years ago

Not sure what kernel version we'll have...

Test machine bigfoot80:

[bigfoot80 ~]$ uname -a
Linux bigfoot80 4.7.2-201.fc24.x86_64 #1 SMP Fri Aug 26 15:58:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Test tower at exhibition (should be most recent install profile):

[kiosk023298 ~]$ uname -a
Linux kiosk023298 4.11.12-100.fc24.x86_64 #1 SMP Fri Jul 21 17:35:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Don't know why they're so different...
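For comparing the two kernels against a minimum version, a small sketch may help. It assumes the 4.0 minimum that Docker's documentation gives for overlay2 on mainline kernels; meeting it is necessary but says nothing about how mature overlay2 is on Fedora 24, and distribution kernels with backports may behave differently. The function names are made up for illustration.

```python
import re

# Assumption: minimum mainline kernel for overlay2 per Docker's documentation.
OVERLAY2_MIN_KERNEL = (4, 0)


def kernel_version(release: str) -> tuple:
    """Parse the leading 'major.minor' of a `uname -r` style release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_overlay2_minimum(release: str) -> bool:
    """Compare numerically, so that 4.11 correctly sorts after 4.7."""
    return kernel_version(release) >= OVERLAY2_MIN_KERNEL


for host, release in [("bigfoot80", "4.7.2-201.fc24.x86_64"),
                      ("kiosk023298", "4.11.12-100.fc24.x86_64")]:
    print(host, kernel_version(release), meets_overlay2_minimum(release))
```

Both machines above pass this check; comparing the version strings as text instead of as numbers would wrongly rank 4.11 below 4.7.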

porst17 commented 6 years ago

@malex984 Can you please document how you and @vga101 finally addressed this and close the issue then?

malex984 commented 6 years ago

The approach via download and persistent volume seems to work for @vga101!

porst17 commented 6 years ago

I asked for documentation. At least a code snippet from the config/compose/hilbert/whateverisneeded config and what to do in the container at runtime then (probably just wget or similar into the right location).

malex984 commented 6 years ago

AFAICS, 1113_illustrisexplorer/populate_volume.sh checks for the existence of a time-stamp file okfile in a predefined location (which will be mounted as a local volume). If the time-stamp file is missing, it tries to download a .tar.gz from a fixed location using wget and unpacks it (untar/gunzip) into the predefined (volume) location. If that succeeds, it creates a time-stamp file containing the current date and time.

If the update/check was successful, it executes the further commands given on the command line.
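The logic described above can be sketched as follows. This is not the actual populate_volume.sh, which is a shell script using wget; it is a Python rendering of the same steps, with urllib standing in for wget. The function names and arguments are made up for illustration, and the real download URL and volume path are not given here.

```python
import os
import tarfile
import tempfile
import urllib.request
from datetime import datetime
from pathlib import Path


def populate_volume(volume: Path, url: str, okfile: str = "okfile") -> bool:
    """Download and unpack `url` into `volume` unless the marker file exists.

    Returns True if data was downloaded, False if the volume was already populated.
    """
    marker = volume / okfile
    if marker.exists():
        return False

    volume.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "data.tar.gz"
        urllib.request.urlretrieve(url, archive)  # stands in for wget
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(volume)  # only use with a trusted archive

    # The marker is written last, so an interrupted run is retried next time.
    marker.write_text(datetime.now().isoformat() + "\n")
    return True


def populate_then_exec(volume: Path, url: str, command: list) -> None:
    """Populate the volume, then replace this process with `command`."""
    populate_volume(volume, url)
    if command:
        os.execvp(command[0], command)
```

Writing the time-stamp file only after a successful unpack means a failed download leaves no marker behind, so the next container start tries again.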

porst17 commented 6 years ago

How do you create and mount the local volume? Code snippet please.

malex984 commented 6 years ago

Local volumes are created and mounted via the corresponding docker-compose.yml, similarly to the following:

version: '2.1'
volumes:
  registry_local:
    driver: local
services:
  registry:
    volumes:
      - registry_local:/LOCAL_PATH
    ...

malex984 commented 6 years ago

Once it has been created, one can inspect it with docker volume inspect ...

porst17 commented 6 years ago

Ok, that's useful documentation. Closing.