dragonflyoss / Dragonfly

This repository has been archived and moved to the new repository https://github.com/dragonflyoss/Dragonfly2.
https://d7y.io
Apache License 2.0

dfget always pulls images from the remote registry. #1336

Open sequix opened 4 years ago

sequix commented 4 years ago

Ⅰ. Issue Description

Deployed Dragonfly in a containerd cluster using KinD, and dfget always pulls images from the remote registry.

Ⅱ. Describe what happened

dfget always pulls images from the remote registry.

Ⅲ. Describe what you expected to happen

dfget should pull the image from peers if the image has already been pulled by another node and has not expired.

Ⅳ. How to reproduce it (as minimally and precisely as possible)

Prepare the environment:

docker build -t stargz-kind-node https://github.com/containerd/stargz-snapshotter.git
cat >multinode.yml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
kind create cluster --name stargz-demo --config multinode.yml --image stargz-kind-node --retain
kubectl --context  stargz-demo create ns dragonfly
kubectl --context stargz-demo -n dragonfly apply -f supernode.yml -f dfclient.yml

docker exec -it stargz-demo-worker bash
# within worker container
cat >>/etc/containerd/config.toml <<'EOF'
# See also: https://github.com/containerd/cri/blob/master/docs/registry.md
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["http://127.0.0.1:30081"]
EOF
cat >>/etc/containerd-stargz-grpc/config.toml <<'EOF'
[[resolver.host."docker.io".mirrors]]
  host = "http://127.0.0.1:30081"
EOF
systemctl restart stargz-snapshotter
systemctl restart containerd

docker exec -it stargz-demo-worker2 bash
# within worker2 container
cat >>/etc/containerd/config.toml <<'EOF'
# See also: https://github.com/containerd/cri/blob/master/docs/registry.md
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["http://127.0.0.1:30081"]
EOF
cat >>/etc/containerd-stargz-grpc/config.toml <<'EOF'
[[resolver.host."docker.io".mirrors]]
  host = "http://127.0.0.1:30081"
EOF
systemctl restart stargz-snapshotter
systemctl restart containerd

Now run crictl pull docker.io/library/alpine:3.9 on the different worker nodes one by one. dfdaemon's log will show that dfget downloads from Docker Hub on every crictl pull.
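The duplication can be confirmed mechanically by extracting the blob digests each worker started downloading and intersecting the two sets; a minimal sketch, using "start download" lines copied verbatim from the dfdaemon logs quoted further down in this thread (the comparison itself is plain text processing, not a Dragonfly tool):

```shell
# Sample "start download" lines, copied from the worker logs quoted in this issue.
cat > worker1.log <<'EOF'
2020-05-11 01:24:39.473 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 to abc51efa-a17f-4624-adf7-06805c228a7e in repo
EOF
cat > worker2.log <<'EOF'
2020-05-11 01:25:13.259 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 to 831d1209-eba8-4948-ba98-1c2e3f2bb8ad in repo
EOF
# Pull out the sha256 digests per worker and intersect the two sorted sets;
# any digest printed was requested on both nodes.
grep -o 'sha256:[0-9a-f]*' worker1.log | sort -u > digests1.txt
grep -o 'sha256:[0-9a-f]*' worker2.log | sort -u > digests2.txt
comm -12 digests1.txt digests2.txt
# -> sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112
```

On a live node you would point the two greps at each worker's real dfdaemon log instead of the sample files.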

Ⅴ. Anything else we need to know?

Ⅵ. Environment:

dragonfly version

supernode version 1.0.0 Git commit: ac262d5 Build date: 20191119-17:45:07 Go version: go1.12.10 OS/Arch: linux/amd64

dfdaemon version 1.0.0 Git commit: ac262d5 Build date: 20191119-17:44:51 Go version: go1.12.10 OS/Arch: linux/amd64

host: NAME="CentOS Linux" VERSION="7 (Core)" ID="centos" ID_LIKE="rhel fedora" VERSION_ID="7" PRETTY_NAME="CentOS Linux 7 (Core)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:centos:centos:7" HOME_URL="https://www.centos.org/" BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"

container: NAME="Alpine Linux" ID=alpine VERSION_ID=3.8.4 PRETTY_NAME="Alpine Linux v3.8" HOME_URL="http://alpinelinux.org" BUG_REPORT_URL="http://bugs.alpinelinux.org"

lowzj commented 4 years ago

dfdaemon triggers a dfget download task when it accepts a pull request. Dragonfly plans to let dfget download data from the source server directly, but for now it never downloads from Docker Hub directly; it downloads from the supernode or from other peers that have the data.

sequix commented 4 years ago

@lowzj Sure, that is the expected behavior, but after the first crictl pull, at least one dfget or the supernode has the image, so dfget should download from a peer or the supernode. It didn't. You can see this in dfdaemon's log: worker1 downloaded blob 4780acaf at 01:24:41.022, and within less than a minute worker2 downloaded the same blob from Docker Hub again.

worker1

2020-05-11 01:24:39.473 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 to abc51efa-a17f-4624-adf7-06805c228a7e in repo
2020-05-11 01:24:39.488 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:fbf24695ef9896635c70eaf51fb67c6cc91f306a893b30ab38a29d4dd131c4c9 to c229472b-7c42-4769-a08b-d29aceccda4f in repo
2020-05-11 01:24:41.022 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 [SUCCESS] cost:1.550s
2020-05-11 01:24:54.097 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:fbf24695ef9896635c70eaf51fb67c6cc91f306a893b30ab38a29d4dd131c4c9 [SUCCESS] cost:14.608s

worker2

2020-05-11 01:25:13.259 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 to 831d1209-eba8-4948-ba98-1c2e3f2bb8ad in repo
2020-05-11 01:25:13.266 INFO sign:1 : start download url:https://index.docker.io/v2/library/golang/blobs/sha256:38b1453721cb272cf531587daa8f79cc38641a9750dc269d7f2f08feb80f6602 to 2a982fab-ea6b-4425-958a-d72dd3185108 in repo
2020-05-11 01:25:13.359 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 [SUCCESS] cost:0.100s
2020-05-11 01:25:15.816 INFO sign:1 : scan repo and clean expired files
2020-05-11 01:25:19.168 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:38b1453721cb272cf531587daa8f79cc38641a9750dc269d7f2f08feb80f6602 [SUCCESS] cost:5.902s
lowzj commented 4 years ago

These logs in dfdaemon.log don't mean that dfget downloads blobs from the remote Docker Hub; they just mean that a download task has started. Only the supernode downloads data from Docker Hub; it caches the data and sends it to dfget when no other dfget has the same data.

You can see that worker2's cost time is much lower than worker1's, because the supernode had already downloaded the data after the first pull on worker1.
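That cost gap is visible directly in the quoted logs; a small awk sketch over the two [SUCCESS] lines for the shared blob (lines copied verbatim from the worker1 and worker2 logs above):

```shell
# The two [SUCCESS] lines for blob 4780acaf..., copied from the logs above.
cat > dfdaemon-success.log <<'EOF'
2020-05-11 01:24:41.022 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 [SUCCESS] cost:1.550s
2020-05-11 01:25:13.259 INFO sign:1 : dfget url:https://index.docker.io/v2/library/golang/blobs/sha256:4780acafd98c2f2b2139f70862bf40ecb6c3a84abd2afa853260460edb3fc112 [SUCCESS] cost:0.100s
EOF
# Print a shortened digest and the reported cost for each successful dfget.
# The second (worker2) download of the same blob is ~15x faster, consistent
# with the supernode serving a cached copy rather than Docker Hub.
awk 'match($0, /cost:[0-9.]+s/) {
  cost = substr($0, RSTART + 5, RLENGTH - 5)
  digest = $0
  sub(/.*sha256:/, "", digest)
  print substr(digest, 1, 12), cost
}' dfdaemon-success.log
# -> 4780acafd98c 1.550s
#    4780acafd98c 0.100s
```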

The detailed download task logs are in dfclient.log: https://github.com/dragonflyoss/Dragonfly/blob/master/FAQ.md#how-to-view-all-the-dfget-logs-of-a-task