goldmann / docker-squash

Docker image squashing tool
MIT License

Hard links gone after squashing #94

Closed twaugh closed 8 years ago

twaugh commented 8 years ago

This Dockerfile shows up a problem:

FROM fedora
RUN dnf -y install git
CMD git clone -v http://git.savannah.gnu.org/r/hello.git

$ rpm -q docker python3-docker-squash
docker-1.9.1-9.gitee06d03.fc23.x86_64
python3-docker-squash-1.0.0-0.7.rc5.fc23.noarch
$ docker run -it --rm hello
Cloning into 'hello'...
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
POST git-upload-pack (374 bytes)
remote: Counting objects: 4360, done.
remote: Compressing objects: 100% (1077/1077), done.
remote: Total 4360 (delta 3255), reused 4360 (delta 3255)
Receiving objects: 100% (4360/4360), 7.58 MiB | 46.00 KiB/s, done.
Resolving deltas: 100% (3255/3255), done.
Checking connectivity... done.
$ docker history hello
IMAGE               CREATED              CREATED BY                                      SIZE                COMMENT
05b761cf2d45        56 seconds ago       /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "git cl   0 B                 
cefa244a5501        About a minute ago   /bin/sh -c dnf -y install git                   233.7 MB            
9bdb5101e5fc        11 weeks ago         /bin/sh -c #(nop) ADD file:bcb5e5cddd4c4d1cac   204.7 MB            
6888fc827a3f        11 weeks ago         /bin/sh -c #(nop) MAINTAINER Patrick Uiterwij   0 B                 
$ sudo docker-squash -v hello -f 6888fc827a3f -t hello:squashed
2016-05-24 22:25:52,289 root         DEBUG    Running version 1.0.0rc5
2016-05-24 22:25:52,290 root         DEBUG    Preparing Docker client...
2016-05-24 22:25:52,290 docker.auth.auth DEBUG    Found 'auths' section
2016-05-24 22:25:52,290 docker.auth.auth DEBUG    Found entry (registry='registry.devops-osbs.openshift.com', username='twaugh@redhat.com')
2016-05-24 22:25:52,305 requests.packages.urllib3.connectionpool DEBUG    "GET /version HTTP/1.1" 200 211
2016-05-24 22:25:52,308 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/_ping HTTP/1.1" 200 2
2016-05-24 22:25:52,308 root         DEBUG    Docker client ready
2016-05-24 22:25:52,322 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/version HTTP/1.1" 200 211
2016-05-24 22:25:52,323 root         INFO     docker-squash version 1.0.0rc5, Docker ee06d03/1.9.1, API 1.21...
2016-05-24 22:25:52,324 root         INFO     Using v1 image format
2016-05-24 22:25:52,325 root         DEBUG    Using /tmp/docker-squash-p6rgoqb6 as the temporary directory
2016-05-24 22:25:52,327 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/images/hello/json HTTP/1.1" 200 1625
2016-05-24 22:25:52,329 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/images/05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c/history HTTP/1.1" 200 932
2016-05-24 22:25:52,330 root         INFO     Old image has 4 layers
2016-05-24 22:25:52,330 root         DEBUG    Old layers: ['6888fc827a3f086ff6cd336432071b4f5090c63e89f5fc7873e7b46476c8cbe2', '9bdb5101e5fce92817d6a10364081ef1ca8e42c1c6b4fc5b509d52475bd9e2dc', 'cefa244a550101bc94be4a429342375ee6a48656c5248aef8b05aece650b7ef0', '05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c']
2016-05-24 22:25:52,330 root         DEBUG    We detected layer as the argument to squash
2016-05-24 22:25:52,333 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/images/6888fc827a3f/json HTTP/1.1" 200 1252
2016-05-24 22:25:52,333 root         DEBUG    Layer ID to squash from: 6888fc827a3f086ff6cd336432071b4f5090c63e89f5fc7873e7b46476c8cbe2
2016-05-24 22:25:52,333 root         INFO     Checking if squashing is necessary...
2016-05-24 22:25:52,334 root         INFO     Attempting to squash last 3 layers...
2016-05-24 22:25:52,334 root         DEBUG    Layers to squash: ['9bdb5101e5fce92817d6a10364081ef1ca8e42c1c6b4fc5b509d52475bd9e2dc', 'cefa244a550101bc94be4a429342375ee6a48656c5248aef8b05aece650b7ef0', '05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c']
2016-05-24 22:25:52,334 root         DEBUG    Layers to move: ['6888fc827a3f086ff6cd336432071b4f5090c63e89f5fc7873e7b46476c8cbe2']
2016-05-24 22:25:52,334 root         INFO     Saving image 05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c to /tmp/docker-squash-p6rgoqb6/old/image.tar file...
2016-05-24 22:25:52,334 root         DEBUG    Try #1...
2016-05-24 22:25:56,933 requests.packages.urllib3.connectionpool DEBUG    "GET /v1.21/images/05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c/get HTTP/1.1" 200 None
2016-05-24 22:25:57,527 root         INFO     Image saved!
2016-05-24 22:25:57,527 root         INFO     Unpacking /tmp/docker-squash-p6rgoqb6/old/image.tar tar file to /tmp/docker-squash-p6rgoqb6/old directory
2016-05-24 22:25:57,786 root         INFO     Archive unpacked!
2016-05-24 22:25:57,787 root         DEBUG    Removing exported tar (/tmp/docker-squash-p6rgoqb6/old/image.tar)...
2016-05-24 22:25:57,826 root         INFO     Squashing image 'hello'...
2016-05-24 22:25:57,826 root         INFO     Starting squashing...
2016-05-24 22:25:57,826 root         INFO     Squashing file '/tmp/docker-squash-p6rgoqb6/old/05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c/layer.tar'...
2016-05-24 22:25:57,826 root         DEBUG    Searching for marker files in '/tmp/docker-squash-p6rgoqb6/old/05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c/layer.tar' archive...
2016-05-24 22:25:57,827 root         INFO     Squashing file '/tmp/docker-squash-p6rgoqb6/old/cefa244a550101bc94be4a429342375ee6a48656c5248aef8b05aece650b7ef0/layer.tar'...
2016-05-24 22:25:58,034 root         DEBUG    Searching for marker files in '/tmp/docker-squash-p6rgoqb6/old/cefa244a550101bc94be4a429342375ee6a48656c5248aef8b05aece650b7ef0/layer.tar' archive...
2016-05-24 22:25:58,034 root         DEBUG    Found 'var/lib/dnf/history/.wh.history-2016-03-04.sqlite-journal' marker file
2016-05-24 22:25:58,034 root         DEBUG    Marker file 'var/lib/dnf/history/.wh.history-2016-03-04.sqlite-journal' not found in the squashed files, we'll try at the end of squashing one more time
2016-05-24 22:25:58,793 root         DEBUG    Skipping 'var/lib/dnf/history/.wh.history-2016-03-04.sqlite-journal' file because it's on the list to skip files
2016-05-24 22:25:58,866 root         INFO     Squashing file '/tmp/docker-squash-p6rgoqb6/old/9bdb5101e5fce92817d6a10364081ef1ca8e42c1c6b4fc5b509d52475bd9e2dc/layer.tar'...
2016-05-24 22:25:59,382 root         DEBUG    Searching for marker files in '/tmp/docker-squash-p6rgoqb6/old/9bdb5101e5fce92817d6a10364081ef1ca8e42c1c6b4fc5b509d52475bd9e2dc/layer.tar' archive...
2016-05-24 22:25:59,384 root         DEBUG    Skipping 'etc' file because it's older than file already added to the archive
2016-05-24 22:25:59,397 root         DEBUG    Skipping 'etc/group' file because it's older than file already added to the archive
2016-05-24 22:25:59,397 root         DEBUG    Skipping 'etc/group-' file because it's older than file already added to the archive
2016-05-24 22:25:59,397 root         DEBUG    Skipping 'etc/gshadow' file because it's older than file already added to the archive
2016-05-24 22:25:59,397 root         DEBUG    Skipping 'etc/gshadow-' file because it's older than file already added to the archive
2016-05-24 22:25:59,401 root         DEBUG    Skipping 'etc/ld.so.cache' file because it's older than file already added to the archive
2016-05-24 22:25:59,411 root         DEBUG    Skipping 'etc/pki' file because it's older than file already added to the archive
2016-05-24 22:25:59,417 root         DEBUG    Skipping 'etc/pki/nssdb' file because it's older than file already added to the archive
2016-05-24 22:25:59,418 root         DEBUG    Skipping 'etc/pki/nssdb/cert9.db' file because it's older than file already added to the archive
2016-05-24 22:25:59,418 root         DEBUG    Skipping 'etc/pki/nssdb/key4.db' file because it's older than file already added to the archive
2016-05-24 22:25:59,454 root         DEBUG    Skipping 'etc/profile.d' file because it's older than file already added to the archive
2016-05-24 22:25:59,493 root         DEBUG    Skipping 'root' file because it's older than file already added to the archive
2016-05-24 22:25:59,494 root         DEBUG    Skipping 'run' file because it's older than file already added to the archive
2016-05-24 22:25:59,496 root         DEBUG    Skipping 'usr' file because it's older than file already added to the archive
2016-05-24 22:25:59,496 root         DEBUG    Skipping 'usr/bin' file because it's older than file already added to the archive
2016-05-24 22:25:59,615 root         DEBUG    Skipping 'usr/lib' file because it's older than file already added to the archive
2016-05-24 22:25:59,950 root         DEBUG    Skipping 'usr/lib/rpm' file because it's older than file already added to the archive
2016-05-24 22:25:59,951 root         DEBUG    Skipping 'usr/lib/rpm/macros.d' file because it's older than file already added to the archive
2016-05-24 22:26:00,152 root         DEBUG    Skipping 'usr/lib64' file because it's older than file already added to the archive
2016-05-24 22:26:00,451 root         DEBUG    Skipping 'usr/lib64/python3.4' file because it's older than file already added to the archive
2016-05-24 22:26:01,035 root         DEBUG    Skipping 'usr/lib64/python3.4/site-packages' file because it's older than file already added to the archive
2016-05-24 22:26:01,039 root         DEBUG    Skipping 'usr/lib64/python3.4/site-packages/hawkey' file because it's older than file already added to the archive
2016-05-24 22:26:01,226 root         DEBUG    Skipping 'usr/libexec' file because it's older than file already added to the archive
2016-05-24 22:26:01,312 root         DEBUG    Skipping 'usr/share' file because it's older than file already added to the archive
2016-05-24 22:26:01,448 root         DEBUG    Skipping 'usr/share/bash-completion' file because it's older than file already added to the archive
2016-05-24 22:26:01,448 root         DEBUG    Skipping 'usr/share/bash-completion/completions' file because it's older than file already added to the archive
2016-05-24 22:26:01,530 root         DEBUG    Skipping 'usr/share/doc' file because it's older than file already added to the archive
2016-05-24 22:26:01,531 root         DEBUG    Skipping 'usr/share/emacs' file because it's older than file already added to the archive
2016-05-24 22:26:01,531 root         DEBUG    Skipping 'usr/share/emacs/site-lisp' file because it's older than file already added to the archive
2016-05-24 22:26:01,533 root         DEBUG    Skipping 'usr/share/emacs/site-lisp/site-start.d' file because it's older than file already added to the archive
2016-05-24 22:26:01,704 root         DEBUG    Skipping 'usr/share/licenses' file because it's older than file already added to the archive
2016-05-24 22:26:01,831 root         DEBUG    Skipping 'usr/share/locale' file because it's older than file already added to the archive
2016-05-24 22:26:01,842 root         DEBUG    Skipping 'usr/share/locale/en_GB' file because it's older than file already added to the archive
2016-05-24 22:26:01,842 root         DEBUG    Skipping 'usr/share/locale/en_GB/LC_MESSAGES' file because it's older than file already added to the archive
2016-05-24 22:26:01,849 root         DEBUG    Skipping 'usr/share/man' file because it's older than file already added to the archive
2016-05-24 22:26:01,850 root         DEBUG    Skipping 'usr/share/man/man1' file because it's older than file already added to the archive
2016-05-24 22:26:01,852 root         DEBUG    Skipping 'usr/share/man/man3' file because it's older than file already added to the archive
2016-05-24 22:26:01,856 root         DEBUG    Skipping 'usr/share/man/man5' file because it's older than file already added to the archive
2016-05-24 22:26:01,857 root         DEBUG    Skipping 'usr/share/man/man7' file because it's older than file already added to the archive
2016-05-24 22:26:01,858 root         DEBUG    Skipping 'usr/share/man/man8' file because it's older than file already added to the archive
2016-05-24 22:26:03,006 root         DEBUG    Skipping 'var' file because it's older than file already added to the archive
2016-05-24 22:26:03,007 root         DEBUG    Skipping 'var/cache' file because it's older than file already added to the archive
2016-05-24 22:26:03,008 root         DEBUG    Skipping 'var/cache/dnf' file because it's older than file already added to the archive
2016-05-24 22:26:03,008 root         DEBUG    Skipping 'var/cache/ldconfig' file because it's older than file already added to the archive
2016-05-24 22:26:03,008 root         DEBUG    Skipping 'var/cache/ldconfig/aux-cache' file because it's older than file already added to the archive
2016-05-24 22:26:03,012 root         DEBUG    Skipping 'var/lib' file because it's older than file already added to the archive
2016-05-24 22:26:03,015 root         DEBUG    Skipping 'var/lib/dnf' file because it's older than file already added to the archive
2016-05-24 22:26:03,015 root         DEBUG    Skipping 'var/lib/dnf/history' file because it's older than file already added to the archive
2016-05-24 22:26:03,015 root         DEBUG    Skipping 'var/lib/dnf/history/2016-03-04' file because it's older than file already added to the archive
2016-05-24 22:26:03,017 root         DEBUG    Skipping 'var/lib/dnf/history/history-2016-03-04.sqlite' file because it's older than file already added to the archive
2016-05-24 22:26:03,017 root         DEBUG    Skipping 'var/lib/dnf/history/history-2016-03-04.sqlite-journal' file because it's on the list to skip files
2016-05-24 22:26:03,017 root         DEBUG    Skipping 'var/lib/dnf/yumdb' file because it's older than file already added to the archive
2016-05-24 22:26:03,138 root         DEBUG    Skipping 'var/lib/dnf/yumdb/f' file because it's older than file already added to the archive
2016-05-24 22:26:03,154 root         DEBUG    Skipping 'var/lib/dnf/yumdb/g' file because it's older than file already added to the archive
2016-05-24 22:26:03,223 root         DEBUG    Skipping 'var/lib/dnf/yumdb/l' file because it's older than file already added to the archive
2016-05-24 22:26:03,442 root         DEBUG    Skipping 'var/lib/dnf/yumdb/o' file because it's older than file already added to the archive
2016-05-24 22:26:03,448 root         DEBUG    Skipping 'var/lib/dnf/yumdb/p' file because it's older than file already added to the archive
2016-05-24 22:26:03,518 root         DEBUG    Skipping 'var/lib/dnf/yumdb/r' file because it's older than file already added to the archive
2016-05-24 22:26:03,623 root         DEBUG    Skipping 'var/lib/rpm' file because it's older than file already added to the archive
2016-05-24 22:26:03,625 root         DEBUG    Skipping 'var/lib/rpm/Basenames' file because it's older than file already added to the archive
2016-05-24 22:26:03,625 root         DEBUG    Skipping 'var/lib/rpm/Conflictname' file because it's older than file already added to the archive
2016-05-24 22:26:03,625 root         DEBUG    Skipping 'var/lib/rpm/Dirnames' file because it's older than file already added to the archive
2016-05-24 22:26:03,626 root         DEBUG    Skipping 'var/lib/rpm/Group' file because it's older than file already added to the archive
2016-05-24 22:26:03,627 root         DEBUG    Skipping 'var/lib/rpm/Installtid' file because it's older than file already added to the archive
2016-05-24 22:26:03,627 root         DEBUG    Skipping 'var/lib/rpm/Name' file because it's older than file already added to the archive
2016-05-24 22:26:03,627 root         DEBUG    Skipping 'var/lib/rpm/Obsoletename' file because it's older than file already added to the archive
2016-05-24 22:26:03,627 root         DEBUG    Skipping 'var/lib/rpm/Packages' file because it's older than file already added to the archive
2016-05-24 22:26:03,627 root         DEBUG    Skipping 'var/lib/rpm/Providename' file because it's older than file already added to the archive
2016-05-24 22:26:03,628 root         DEBUG    Skipping 'var/lib/rpm/Requirename' file because it's older than file already added to the archive
2016-05-24 22:26:03,629 root         DEBUG    Skipping 'var/lib/rpm/Sha1header' file because it's older than file already added to the archive
2016-05-24 22:26:03,629 root         DEBUG    Skipping 'var/lib/rpm/Sigmd5' file because it's older than file already added to the archive
2016-05-24 22:26:03,632 root         DEBUG    Skipping 'var/lib/rpm/__db.001' file because it's older than file already added to the archive
2016-05-24 22:26:03,633 root         DEBUG    Skipping 'var/lib/rpm/__db.002' file because it's older than file already added to the archive
2016-05-24 22:26:03,633 root         DEBUG    Skipping 'var/lib/rpm/__db.003' file because it's older than file already added to the archive
2016-05-24 22:26:03,641 root         DEBUG    Skipping 'var/log' file because it's older than file already added to the archive
2016-05-24 22:26:03,664 root         DEBUG    Generating list of files in layer '6888fc827a3f086ff6cd336432071b4f5090c63e89f5fc7873e7b46476c8cbe2'...
2016-05-24 22:26:03,664 root         DEBUG    Done, found 0 files
2016-05-24 22:26:03,664 root         DEBUG    Marker files to add: ['var/lib/dnf/history/.wh.history-2016-03-04.sqlite-journal']
2016-05-24 22:26:03,664 root         DEBUG    Adding 'var/lib/dnf/history/.wh.history-2016-03-04.sqlite-journal' marker file back...
2016-05-24 22:26:03,664 root         INFO     Squashing finished!
2016-05-24 22:26:03,676 root         DEBUG    Moving unmodified layer '6888fc827a3f086ff6cd336432071b4f5090c63e89f5fc7873e7b46476c8cbe2'...
2016-05-24 22:26:03,676 root         DEBUG    Reading JSON metadata file '/tmp/docker-squash-p6rgoqb6/old/05b761cf2d45efd05d5cf4a1089d82ffa297fdc43d43c4fa66bd0bb3ec604e8c/json'...
2016-05-24 22:26:03,677 root         INFO     New squashed image ID is 53a22593b60f33cffdcea3a8d597f45e9c6387f453f2f069c8a4826ebc5c6c7f
2016-05-24 22:26:03,677 root         DEBUG    Generating tar archive for the squashed image...
2016-05-24 22:26:03,964 root         DEBUG    Archive generated
2016-05-24 22:26:03,964 root         DEBUG    Loading squashed image...
2016-05-24 22:26:13,883 requests.packages.urllib3.connectionpool DEBUG    "POST /v1.21/images/load HTTP/1.1" 200 0
2016-05-24 22:26:13,884 root         DEBUG    Image loaded!
2016-05-24 22:26:13,913 root         INFO     Image registered in Docker daemon as hello:squashed
2016-05-24 22:26:13,913 root         DEBUG    Cleaning up /tmp/docker-squash-p6rgoqb6 temporary directory
2016-05-24 22:26:13,985 root         INFO     Done

Now let's try running the squashed version:

$ docker run --rm hello:squashed
Cloning into 'hello'...
fatal: Unable to find remote helper for 'http'
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.

CC @mmilata

twaugh commented 8 years ago

Comparing working to non-working containers:

Working (unsquashed):

$ docker export happy_swartz |tar tv | grep git-core/git-remote
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote link to usr/bin/git
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote-ext link to usr/bin/git
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote-fd link to usr/bin/git
-rwxr-xr-x 0/0         1022176 2016-03-18 11:49 usr/libexec/git-core/git-remote-ftp
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote-ftps link to usr/libexec/git-core/git-remote-ftp
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote-http link to usr/libexec/git-core/git-remote-ftp
hrwxr-xr-x 0/0               0 2016-03-18 11:49 usr/libexec/git-core/git-remote-https link to usr/libexec/git-core/git-remote-ftp

Non-working (squashed):

$ docker export clever_tesla |tar tv | grep git-core/git-remote
-rwxr-xr-x 0/0         1022176 2016-03-18 11:49 usr/libexec/git-core/git-remote-ftp
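
The `h` entries in the working listing are tar hard-link records. As a rough illustration (not docker-squash's own code), the same entries can be picked out with Python's standard `tarfile` module. The helper names `list_hard_links` and `build_sample_tar` are made up for this sketch, and the sample archive is built in memory so that no Docker daemon is needed.

```python
import io
import tarfile


def list_hard_links(fileobj):
    """Return (name, target) pairs for every hard-link entry in a tar stream."""
    links = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if member.islnk():  # LNKTYPE is a hard link, not a symlink
                links.append((member.name, member.linkname))
    return links


def build_sample_tar():
    """Build an in-memory tar holding one regular file and one hard link to it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"foo\n"
        target = tarfile.TarInfo("usr/libexec/git-core/git-remote-ftp")
        target.size = len(data)
        tar.addfile(target, io.BytesIO(data))

        # A hard link is a header-only entry that names its target.
        link = tarfile.TarInfo("usr/libexec/git-core/git-remote-http")
        link.type = tarfile.LNKTYPE
        link.linkname = "usr/libexec/git-core/git-remote-ftp"
        tar.addfile(link)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, target in list_hard_links(build_sample_tar()):
        print(f"{name} link to {target}")
```

The same function should work on a file saved from `docker export`, opened in binary mode, since it only relies on the tar entry type.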

twaugh commented 8 years ago

Minimal test case:

FROM busybox
RUN mkdir -p /usr/libexec/git-core && \
    echo foo > /usr/libexec/git-core/git-remote-ftp && \
    ln /usr/libexec/git-core/git-remote-ftp \
       /usr/libexec/git-core/git-remote-http
CMD /bin/sh

How to test:

  1. docker build --rm -t hardlink .
  2. docker-squash hardlink -f 3 -t hardlink:squashed
  3. docker run --name=before hardlink true
  4. docker run --name=after hardlink:squashed true
  5. diff -U0 <(docker export before | tar tv) <(docker export after | tar tv)

I see:

--- /dev/fd/63  2016-05-25 10:52:17.786930081 +0100
+++ /dev/fd/62  2016-05-25 10:52:17.786930081 +0100
@@ -394 +393,0 @@
-hrw-r--r-- 0/0               0 2016-05-25 10:51 usr/libexec/git-core/git-remote-http link to usr/libexec/git-core/git-remote-ftp
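
The diff can also be narrowed to hard links only. The sketch below is a hedged illustration using the standard `tarfile` module: it compares two tar streams and reports hard-link entries present in the first but absent from the second. `hard_link_set`, `missing_hard_links` and `make_tar` are hypothetical names, and the two sample archives only imitate the "before" and "after" exports.

```python
import io
import tarfile

FTP = "usr/libexec/git-core/git-remote-ftp"
HTTP = "usr/libexec/git-core/git-remote-http"


def hard_link_set(fileobj):
    """Collect (name, target) pairs for all hard-link entries in a tar stream."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        return {(m.name, m.linkname) for m in tar if m.islnk()}


def missing_hard_links(before, after):
    """Return hard links found in `before` but not in `after`, sorted by name."""
    return sorted(hard_link_set(before) - hard_link_set(after))


def make_tar(with_link):
    """Build an in-memory tar with the regular file and, optionally, its hard link."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"foo\n"
        info = tarfile.TarInfo(FTP)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
        if with_link:
            link = tarfile.TarInfo(HTTP)
            link.type = tarfile.LNKTYPE
            link.linkname = FTP
            tar.addfile(link)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Stand-ins for `docker export before` and `docker export after`.
    for name, target in missing_hard_links(make_tar(True), make_tar(False)):
        print(f"missing: {name} link to {target}")
```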

goldmann commented 8 years ago

Give me a few minutes to get back to the desk and I'll fix it ;)

Thanks for this report! We need more integration tests for hardlinks, tricky stuff.
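
One possible shape for such a test, as a sketch only: `merge_layers` below is a toy newest-first merge written for this example, not docker-squash's implementation, and `make_layer` is a hypothetical helper that builds layer archives in memory. The toy merge ignores whiteout markers and assumes a hard link's target lives in the same layer as the link. The check itself is simply that hard-link entries survive the merge.

```python
import io
import tarfile

FTP = "usr/libexec/git-core/git-remote-ftp"
HTTP = "usr/libexec/git-core/git-remote-http"


def make_layer(files, links=()):
    """Build an in-memory layer tar from (name, bytes) files and (name, target) links."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        for name, target in links:
            info = tarfile.TarInfo(name)
            info.type = tarfile.LNKTYPE
            info.linkname = target
            tar.addfile(info)
    buf.seek(0)
    return buf


def merge_layers(layers):
    """Toy merge: walk layers newest first and keep the first entry seen per path."""
    out = io.BytesIO()
    seen = set()
    with tarfile.open(fileobj=out, mode="w") as merged:
        for layer in layers:
            with tarfile.open(fileobj=layer, mode="r:*") as tar:
                for member in tar.getmembers():
                    if member.name in seen:
                        continue  # a newer layer already provided this path
                    seen.add(member.name)
                    if member.isreg():
                        merged.addfile(member, tar.extractfile(member))
                    else:
                        # Hard links, symlinks and directories are header-only.
                        merged.addfile(member)
    out.seek(0)
    return out


def test_hard_links_survive_merge():
    newer = make_layer([("etc/motd", b"new\n")])
    older = make_layer([("etc/motd", b"old\n"), (FTP, b"foo\n")], links=[(HTTP, FTP)])
    merged = merge_layers([newer, older])
    with tarfile.open(fileobj=merged, mode="r:*") as tar:
        links = [(m.name, m.linkname) for m in tar.getmembers() if m.islnk()]
        motd = tar.extractfile("etc/motd").read()
    assert links == [(HTTP, FTP)]  # the hard link must not be dropped
    assert motd == b"new\n"        # the newer layer wins for duplicate paths


if __name__ == "__main__":
    test_hard_links_survive_merge()
    print("ok")
```

A real integration test would presumably run the actual squash against a built image; the in-memory version only pins down the property that went wrong here.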