uber / kraken

P2P Docker registry capable of distributing TBs of data in seconds
Apache License 2.0

ImagePullBackOff when deploying Kraken demo on k8s cluster on minikube #269

Open eatwithforks opened 4 years ago

eatwithforks commented 4 years ago

Describe the bug I followed the setup via Helm.

minikube version:

v1.12.1 

kubernetes version:

"v1.18.3"

Started cluster via

helm install --generate-name ./helm
NAME                                  READY   STATUS         RESTARTS   AGE
demo-6c5b558977-r7q25                 0/1     ErrImagePull   0          3m7s
kraken-agent-cms48                    1/1     Running        4          35m
kraken-build-index-7b89c9d4b4-6nvjd   1/1     Running        0          35m
kraken-build-index-7b89c9d4b4-khcfb   1/1     Running        0          35m
kraken-build-index-7b89c9d4b4-rjrgk   1/1     Running        0          35m
kraken-origin-545d6c7f4d-4dt4r        1/1     Running        0          35m
kraken-origin-545d6c7f4d-kzkd4        1/1     Running        0          35m
kraken-origin-545d6c7f4d-vg68b        1/1     Running        0          35m
kraken-proxy-66779469d9-8g2wz         1/1     Running        0          35m
kraken-testfs-8bc6864d9-v6hvz         1/1     Running        0          35m
kraken-tracker-674bb8f6b6-4zq9r       2/2     Running        0          35m
kraken-tracker-674bb8f6b6-b9pm7       2/2     Running        0          35m
kraken-tracker-674bb8f6b6-bwxz8       2/2     Running        0          35m

Then I ran kubectl apply -f demo.yml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      demo: 'true'
  template:
    metadata:
      labels:
        demo: 'true'
    spec:
      containers:
      - name: main
        image: 127.0.0.1:30081/library/busybox:latest
        command:
        - "/bin/sh"
        - "-c"
        - while true; do sleep 10; done

Error response:

Error response from daemon: manifest for localhost:30081/library/busybox:latest not found: manifest unknown: manifest unknown

To Reproduce Steps to reproduce the behavior:

  1. helm install --generate-name ./helm
  2. kubectl apply -f demo.yml
  3. kubectl logs

Expected behavior Should be able to pull from localhost:30081/library/busybox:latest

Environments Minikube on Mac

/cc @ThaoTrann ran into something similar in https://github.com/uber/kraken/issues/179 @yiranwang52

MA357151 commented 4 years ago

Hello, I am also facing the same issue. Were you able to resolve it?

Thanks! Mayank

Jasstkn commented 4 years ago

I was able to find a 404 error in the logs:

time="2020-11-17T19:36:00.640658128Z" level=error msg="response completed with error" err.code="manifest unknown" err.detail="unknown tag=latest" err.message="manifest unknown" go.version=go1.11.4 http.request.host=registry-backend http.request.id=b0d54606-1d43-40ca-94ce-0e01a5b3e7af http.request.method=GET http.request.remoteaddr=172.17.0.1 http.request.uri="/v2/library/busybox/manifests/latest" http.request.useragent="docker/19.03.8 go/go1.13.8 git-commit/afacb8b7f0 kernel/5.4.39-linuxkit os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)" http.response.contenttype="application/json; charset=utf-8" http.response.duration=14.319055ms http.response.status=0 http.response.written=0 instance.id=3cd2cd7c-4e2c-4431-93c9-deb0eee0b12c vars.name="library/busybox" vars.reference=latest
@ - - [17/Nov/2020:19:36:00 +0000] "GET /v2/library/busybox/manifests/latest HTTP/1.0" 404 96 "" "docker/19.03.8 go/go1.13.8 git-commit/afacb8b7f0 kernel/5.4.39-linuxkit os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"
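
A rough sketch for pulling the failing manifest requests out of an agent log, assuming the access-log lines look like the one above. `failed_manifests` is a made-up helper name, and the pod name in the usage line is a placeholder.

```shell
# Hypothetical helper: read agent log lines on stdin and print the
# request path of every manifest request that was answered with 404.
# Assumes access-log lines of the form:
#   @ - - [date] "METHOD /v2/<repo>/manifests/<tag> HTTP/1.0" 404 ...
failed_manifests() {
  awk -F'"' '
    # Field 2 is the quoted request line, field 3 starts with the status.
    $2 ~ /\/manifests\// && $3 ~ /^ 404 / {
      split($2, req, " ")   # req[1]=method, req[2]=path, req[3]=protocol
      print req[2]
    }
  ' | sort -u
}
```

Usage would be something like `kubectl logs <kraken-agent-pod> | failed_manifests`, which lists each image path the agent could not resolve once.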

Jasstkn commented 4 years ago

@codygibb Hello! Could you take a look at this issue when you have a chance? Maybe some requirements are missing from the documentation? Thanks in advance for your help!

theycallmeloki commented 4 years ago

Greetings! I'm running into the same issue: the demo pod is not able to find the registry at 30081. Could this be related to running the pods locally on individual nodes using a daemonset instead?

Would also be awesome if there was a Slack to ask questions

kong62 commented 3 years ago

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      demo: 'true'
  template:
    metadata:
      labels:
        demo: 'true'
    spec:
      containers:
      - name: main
        #image: '127.0.0.1:30081/grafana/grafana:7.3.4'
        image: '127.0.0.1:18081/grafana/grafana:7.3.4'
        command:
        - /bin/sh
        - '-c'
        - while true; do sleep 10; done

# kubectl logs kraken-agent-q4szx
@ - - [04/Dec/2020:06:04:04 +0000] "GET /v2/ HTTP/1.0" 200 2 "" "docker/19.03.12 go/go1.13.10 git-commit/48a66213fe kernel/4.18.0-193.6.3.el8_2.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"
time="2020-12-04T06:04:04.593230087Z" level=error msg="response completed with error" err.code="manifest unknown" err.detail="unknown tag=7.3.4" err.message="manifest unknown" go.version=go1.14.2 http.request.host=registry-backend http.request.id=20a5ab69-4b77-43a1-8e88-a1c0e950180e http.request.method=GET http.request.remoteaddr=127.0.0.1 http.request.uri=/v2/grafana/grafana/manifests/7.3.4 http.request.useragent="docker/19.03.12 go/go1.13.10 git-commit/48a66213fe kernel/4.18.0-193.6.3.el8_2.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)" http.response.contenttype=application/json http.response.duration=5.417584ms http.response.status=404 http.response.written=95 vars.name=grafana/grafana vars.reference=7.3.4
@ - - [04/Dec/2020:06:04:04 +0000] "GET /v2/grafana/grafana/manifests/7.3.4 HTTP/1.0" 404 95 "" "docker/19.03.12 go/go1.13.10 git-commit/48a66213fe kernel/4.18.0-193.6.3.el8_2.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"

sunqifs7 commented 3 years ago

Has anyone resolved this issue?

micya commented 3 years ago

I'm seeing this in k3d and in AKS. Logs from kraken-agent:


2021-04-08T02:33:16.518Z        WARN    networkevent/producer.go:58     Kafka network events disabled
2021-04-08T02:33:16.519Z        WARN    metrics/metrics.go:62   Skipping version emitting: no GIT_DESCRIBE env variable found
2021-04-08T02:33:16.520Z        INFO    httputil/tls.go:60      Client TLS is disabled
2021-04-08T02:33:16.521Z        INFO    bandwidth/limiter.go:88 Setting egress bandwidth to 1.56Gbit/sec
2021-04-08T02:33:16.521Z        INFO    bandwidth/limiter.go:89 Setting ingress bandwidth to 2.34Gbit/sec
2021-04-08T02:33:16.521Z        INFO    scheduler/scheduler.go:194      Scheduler starting as peer b57acdc09a3e142923e5e8bb84b7cd224438d389 on addr 10.244.0.10:8080
2021-04-08T02:33:16.522Z        INFO    scheduler/scheduler.go:319      Listening on [::]:8080
2021-04-08T02:33:16.534Z        INFO    cmd/cmd.go:168  Starting agent server on :80
2021-04-08T02:33:16.534Z        INFO    cmd/cmd.go:173  Starting registry...
2021-04-08T02:33:16.534Z        WARN    nginx/nginx.go:137      Server TLS is disabled
time="2021-04-08T02:34:16.454345093Z" level=error msg="response completed with error" err.code="manifest unknown" err.detail="unknown tag=latest" err.message="manifest unknown" go.version=go1.11.4 http.request.host=registry-backend http.request.id=5d962226-27e0-4851-8cfa-4bda7e2d7c72 http.request.method=HEAD http.request.remoteaddr=10.240.0.4 http.request.uri="/v2/library/busybox/manifests/latest" http.request.useragent="containerd/1.4.4+azure" http.response.contenttype="application/json; charset=utf-8" http.response.duration=6.38673ms http.response.status=0 http.response.written=0 instance.id=4ffb725a-f362-4a03-9da8-347894a6e8d8 vars.name="library/busybox" vars.reference=latest
@ - - [08/Apr/2021:02:34:16 +0000] "HEAD /v2/library/busybox/manifests/latest HTTP/1.0" 404 96 "" "containerd/1.4.4+azure"
time="2021-04-08T02:34:32.075204662Z" level=error msg="response completed with error" err.code="manifest unknown" err.detail="unknown tag=latest" err.message="manifest unknown" go.version=go1.11.4 http.request.host=registry-backend http.request.id=96c717b7-d55a-4205-99c2-bc247f983b65 http.request.method=HEAD http.request.remoteaddr=10.240.0.4 http.request.uri="/v2/library/busybox/manifests/latest" http.request.useragent="containerd/1.4.4+azure" http.response.contenttype="application/json; charset=utf-8" http.response.duration=3.652703ms http.response.status=0 http.response.written=0 instance.id=4ffb725a-f362-4a03-9da8-347894a6e8d8 vars.name="library/busybox" vars.reference=latest
@ - - [08/Apr/2021:02:34:32 +0000] "HEAD /v2/library/busybox/manifests/latest HTTP/1.0" 404 96 "" "containerd/1.4.4+azure"

yunkunrao commented 3 years ago

I have encountered the same issue. The local devcluster is OK, but when Kraken is deployed in a K8s cluster, pulling an image reproduces the above issue. Does anyone have a workaround, or can someone explain the root cause?

yunkunrao commented 3 years ago

In my env, the root cause is a backend misconfiguration. If Kraken is deployed in a k8s cluster, Docker Hub should be added to the backend section.

juliusl commented 2 years ago

EDIT: Appears this fix isn't working for people, leaving for posterity

Under ./helm/values.yaml, replace the settings for build_index and origin with this:

build_index:
  config: /etc/config/build-index.yaml
  replicas: 3
  annotations:
  extraVolumes:
  extraVolumeMounts:
  initContainers:
  extraBackends: |-
   - namespace: library/.*
     backend:
       registry_tag:
         address: index.docker.io
         security:
           basic:
            username: ""
            password: ""

origin:
  config: /etc/config/origin.yaml
  replicas: 3
  annotations:
  extraVolumes:
  extraVolumeMounts:
  initContainers:
  extraBackends: |-
   - namespace: library/.*
     backend:
       registry_blob:
         address: index.docker.io
         security:
           basic:
            username: ""
            password: ""

and then everything should work

nitinpatil1992 commented 2 years ago

@juliusl your suggestion doesn't help in my case. I tried updating the log level in the configmap, but that didn't really help. Do you have any additional references?

juliusl commented 2 years ago

@nitinpatil1992 To summarize, my fix is to include a public registry in the backend config. The existing demo backend config doesn't have any public registries, so image lookup will always fail. Could you elaborate on what your particular case is?
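
One way to check whether a given backend config resolves an image is to request the manifest straight from the agent's registry endpoint, outside of the pod pull path. A minimal sketch, assuming the agent registry is reachable on 127.0.0.1:30081 as in the demo (on minikube that likely means running it on the node, e.g. inside `minikube ssh`). `manifest_url` and `check_manifest` are illustrative helper names, not part of Kraken.

```shell
# Build the Docker Registry HTTP API v2 manifest URL for an image.
manifest_url() {
  # $1 = registry host:port, $2 = repository, $3 = tag
  printf 'http://%s/v2/%s/manifests/%s\n' "$1" "$2" "$3"
}

# Print only the HTTP status code returned for the manifest request.
# A 404 here corresponds to the "manifest unknown" errors in the logs above.
check_manifest() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(manifest_url "$1" "$2" "$3")"
}
```

For example, `check_manifest 127.0.0.1:30081 library/busybox latest` should stop printing 404 once the backend config can actually find the tag.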

akakream commented 1 year ago

The suggestion from @juliusl does not work for me either. Has anybody found a solution to this issue?