ghost opened this issue 5 years ago
Sounds like your Docker host machine ran out of space, if docker commit fails with a not-enough-space error.
Could you check that the machine where you are building that docker-in-docker image with prepulled images really has enough disk space? It will need a lot more than 1.5GB there.
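For a quick check, df shows free space on the host filesystem and docker system df shows what Docker itself is consuming (on Docker for Mac the relevant disk is inside the VM, so the docker-side numbers are the ones that matter); a minimal sketch:

```shell
# Free space on the host root filesystem, in megabytes (portable -P output):
df -Pm / | awk 'NR==2 {print "free MB:", $4}'
# Space used by images/containers/volumes as Docker accounts for it
# (the || true just ignores the error if the docker CLI is absent here):
docker system df || true
```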
@elhigu yes, I still have hundreds of GB of space. I tried on a colleague's machine (which has lots of capacity) and the same error occurs. I'm using Docker for Mac.
I tried to pull alpine:3.10.1 instead, just to try it out. It succeeds:
Status: Downloaded newer image for alpine:3.10.1
docker.io/library/alpine:3.10.1
++ docker exec temp df -m /var-lib-docker
++ grep /var-lib-docker
++ awk '{print $3}'
+ USED_MB=58
++ expr 58 + 2048
+ TRIM_TO_MB=2106
++ expr 2106 / 1024
+ TRIM_TO_GB=2
+ echo 'Resizing ext4 to 2GB'
Resizing ext4 to 2GB
+ docker exec temp sh -c 'echo 2 > /trim-ext4-on-next-start.txt'
+ docker stop temp
temp
+ docker start temp
temp
+ docker exec temp rm -fr /var-lib-docker/runtime
+ docker commit temp aldredb/fabric-dind:1.4.1
sha256:667c9c1060b07f061bcd982e19aa632ae180f96ffb7acaf1ea9fb4c56cbeeb65
+ docker stop temp
temp
+ docker rm temp
temp
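The trim target in the trace above is computed with expr, whose division is integer division and rounds down; a standalone rerun of the same arithmetic:

```shell
# Reproduce the trim-target arithmetic from the trace: 58 MB used,
# plus 2048 MB of headroom, divided down to whole gigabytes.
USED_MB=58
TRIM_TO_MB=$(expr $USED_MB + 2048)     # 2106
TRIM_TO_GB=$(expr $TRIM_TO_MB / 1024)  # integer division: 2106/1024 -> 2
echo "$TRIM_TO_MB $TRIM_TO_GB"         # prints: 2106 2
```

Because the division truncates, the retained headroom ends up a bit under the intended 2 GB.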
But it is a bit weird that the resulting image size is 53.9GB
➜ gitlab-custom-dind git:(master) ✗ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
aldredb/fabric-dind 1.4.1 2400bea9f8b9 8 minutes ago 53.9GB
custom-dind latest c8c6e315ce42 16 minutes ago 230MB
docker dind 6ce0d31cf4d6 3 hours ago 230MB
Looks like trimming down the disk image after startup didn't run correctly. I actually have more up-to-date trimming code with more debug info here somewhere. I'll check whether I have done some fixing there.
The system works by generating a 60GB sparse file in the docker container and installing everything there. Then it tries to trim it down to a smaller size if a magic file is found during container startup. Trimming down is necessary because docker's layer filesystem doesn't support gzipping sparse files, which causes the container size to explode when it is committed to a new image.
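The sparse-file behaviour can be seen without Docker; a minimal sketch (path hypothetical) showing that the apparent size, which is what the commit step has to read, is far larger than the blocks actually allocated on disk:

```shell
# Create a 1 GB sparse file: apparent size is 1 GB, but almost no
# disk blocks are actually allocated for it.
IMG=/tmp/sparse-demo.img
truncate -s 1G "$IMG"
ls -l "$IMG" | awk '{print "apparent bytes:", $5}'  # 1073741824
du -k "$IMG"  | awk '{print "allocated KB:", $1}'   # near zero
rm -f "$IMG"
```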
They actually added a feature (after my feature request) to busybox which now allows creating a small image file during startup, which could then be resized without a restart when the system notices it is about to fill up. I'll check if the latest docker:dind images are already using it, and if so I'll change that resizing method to grow the ext4 file instead of trimming it down.
You could also get more debug info about why trimming didn't work by checking the docker logs after restarting the prefilled container:
+ docker start temp
temp
// ---- check docker logs here!
+ docker exec temp rm -fr /var-lib-docker/runtime
These are the logs of temp
...
time="2019-07-24T02:56:38.815294200Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
time="2019-07-24T02:56:38.815818200Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0007596d0, TRANSIENT_FAILURE" module=grpc
time="2019-07-24T02:56:38.815861800Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0007596d0, CONNECTING" module=grpc
e2fsck 1.45.2 (27-May-2019)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/var-lib-docker.loopback.ext4: 561/3276800 files (0.2% non-contiguous), 286162/13107200 blocks
resize2fs 1.45.2 (27-May-2019)
Resizing the filesystem on /var-lib-docker.loopback.ext4 to 524288 (4k) blocks.
There should be more logs... after resizing the FS it should truncate the file and check that it is OK.
This is my most recent version of dockerd-entrypoint.sh:
#!/bin/sh
set -e
# this is pretty much instantaneous because while the container
# is not committed to an image, sparse files work just fine
if [ ! -e /var-lib-docker.loopback.ext4 ]; then
dd of=/var-lib-docker.loopback.ext4 bs=1 seek=50G count=0
/sbin/mkfs.ext4 -q /var-lib-docker.loopback.ext4
fi
# TODO: create scripts to autoresize partition when docker:dind
# is released, which has this bugfix included
# https://bugs.busybox.net/show_bug.cgi?id=11886
# trim ext4 image file to smaller length if special file is found
if [ -e /trim-ext4-on-next-start.txt ]; then
export TRIM_GIGABYTES=$(cat /trim-ext4-on-next-start.txt)
set -x
fsck.ext4 -y -f /var-lib-docker.loopback.ext4
resize2fs /var-lib-docker.loopback.ext4 ${TRIM_GIGABYTES}G
truncate -s ${TRIM_GIGABYTES}G /var-lib-docker.loopback.ext4
rm -f /trim-ext4-on-next-start.txt
fsck.ext4 -y /var-lib-docker.loopback.ext4
set +x
fi
# if host docker is not running on a btrfs file system, this will have to copy the whole
# readonly var-lib-docker.loopback.ext4 file to the running container... which will use lots
# of space and take minutes for, say, 10GB of data... so to make this fast use
# btrfs (you can even run it in a virtual machine and it will be fast...)
mount -t ext4 -o loop /var-lib-docker.loopback.ext4 /var-lib-docker
# no arguments passed
# or first arg is `-f` or `--some-option`
if [ "$#" -eq 0 ] || [ "${1#-}" != "$1" ]; then
# add our default arguments
set -- dockerd \
--data-root=/var-lib-docker \
--host=unix:///var/run/docker.sock \
--host=tcp://0.0.0.0:2375 \
"$@"
fi
if [ "$1" = 'dockerd' ]; then
if [ -x '/usr/local/bin/dind' ]; then
# if we have the (mostly defunct now) Docker-in-Docker wrapper script, use it
set -- '/usr/local/bin/dind' "$@"
fi
# explicitly remove Docker's default PID file to ensure that it can start properly if it was stopped uncleanly (and thus didn't clean up the PID file)
find /run /var/run -iname 'docker*.pid' -delete
fi
exec "$@"
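The "${1#-}" != "$1" test in the entrypoint relies on POSIX parameter expansion to detect a flag-style first argument; a standalone illustration (the function name is just for this sketch):

```shell
# ${1#-} strips a single leading '-' if present; when the result differs
# from the original argument, the argument started with a dash.
starts_with_dash() {
  [ "${1#-}" != "$1" ]
}
starts_with_dash --host=tcp://0.0.0.0:2375 && echo "flag: prepend dockerd"
starts_with_dash dockerd || echo "command: run as-is"
```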
@elhigu that's all the logs. Tried your latest dockerd-entrypoint.sh, same result.
Right... sounds like the container is stopped before resizing is complete... In my current project I have a slightly different script, since I actually load images from gzipped dumps instead of a registry, but you can see the waiting parts there.
I'll also update this repo when I get some time for it.
$ cat ../../../.gitlab-caching-dind/create-dind-with-images.sh
#!/bin/sh
set -x
FINAL_IMAGE_NAME=$1
IMAGE_LIST_FILE=$2
UNIQUE_POSTFIX=$3
TEMP_IMAGE_NAME=custom-dind-$UNIQUE_POSTFIX
TEMP_CONTAINER_NAME=fill-images-$UNIQUE_POSTFIX
# create container where to pull images
docker build -t $TEMP_IMAGE_NAME .
docker run --detach --privileged --name $TEMP_CONTAINER_NAME $TEMP_IMAGE_NAME
# wait for container to be fully started up
sleep 5
# load images
for image in $(cat $IMAGE_LIST_FILE); do
gunzip -c $image | docker exec -i $TEMP_CONTAINER_NAME docker load
echo "Done: $image"
done
# find out used disk size and add 2-3GB extra (resize may fail with just +1GB)
USED_MB=$(docker exec $TEMP_CONTAINER_NAME df -m /var-lib-docker | grep /var-lib-docker | awk '{print $3}')
TRIM_TO_MB=$(expr $USED_MB + 3072)
TRIM_TO_GB=$(expr $TRIM_TO_MB / 1024)
echo "Resizing ext4 to ${TRIM_TO_GB}GB"
docker exec $TEMP_CONTAINER_NAME sh -c "echo $TRIM_TO_GB > /trim-ext4-on-next-start.txt"
docker exec $TEMP_CONTAINER_NAME df -h
docker exec $TEMP_CONTAINER_NAME ls -la /
docker stop $TEMP_CONTAINER_NAME
docker start $TEMP_CONTAINER_NAME
# shouldn't be needed... but just in case
until(docker exec $TEMP_CONTAINER_NAME echo 'wait start'); do
sleep 3;
done;
# wait for resize to be ready
while(docker exec $TEMP_CONTAINER_NAME ls -la /trim-ext4-on-next-start.txt); do
sleep 3;
done;
docker logs $TEMP_CONTAINER_NAME
docker exec $TEMP_CONTAINER_NAME df -h
docker exec $TEMP_CONTAINER_NAME ls -la /
docker exec $TEMP_CONTAINER_NAME rm -fr /var-lib-docker/runtimes
docker exec $TEMP_CONTAINER_NAME sh -c 'rm -fr /run/*'
docker commit $TEMP_CONTAINER_NAME $FINAL_IMAGE_NAME
docker stop $TEMP_CONTAINER_NAME
docker rm --force $TEMP_CONTAINER_NAME
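The waiting loops above can be exercised without Docker; a minimal sketch of the same wait-until-the-marker-file-disappears pattern (file name and timing are hypothetical):

```shell
# Stand-in for the entrypoint consuming /trim-ext4-on-next-start.txt:
MARKER=/tmp/trim-marker-demo.txt
echo 2 > "$MARKER"
( sleep 1; rm -f "$MARKER" ) &   # background "entrypoint" removes the marker
# same shape as the script's loop: poll until the marker file is gone
while [ -e "$MARKER" ]; do
  sleep 1
done
wait
echo "marker consumed, safe to commit"
```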
I really need to get this repo onto GitLab and add a CI runner for it to make sure it keeps working... or maybe I can just emulate it here with Travis.
Hello, I tried your script to download the hyperledger/fabric-ccenv:1.4.1 image, which is around 1.5GB, but I encountered the error shown.