cloudfoundry / cf-for-k8s

The open source deployment manifest for Cloud Foundry on Kubernetes
Apache License 2.0

Demo app can't run because of "OCI runtime create failed" #666

Closed. gongzhao2 closed this issue 3 years ago

gongzhao2 commented 3 years ago

Describe the bug

Hi team, I followed these instructions to deploy cf on a Huawei Cloud Kubernetes cluster in China. Because my cluster can't access gcr.io, I downloaded these images manually and uploaded them to my Docker Hub account, then changed the image addresses in "cf-for-k8s-rendered.yml":

gcr.io/paketo-buildpacks/ruby@sha256:6f765726dd17b0ba39d77d95e972c0e3893811e4450277414bd7ef58156de385
gcr.io/paketo-community/python@sha256:db91e5ad8137a3bc5dadbd5c7dc0c2e0674bf2cfe4030e46913ce4fbb87d59cf
gcr.io/paketo-buildpacks/java@sha256:4d25003f3f6b181506754e78dadbdadbf19cb238190937f636df34ebaaa2c003
gcr.io/paketo-buildpacks/nodejs@sha256:e9c557eb4a9a8a8369aa4cbd025ec7fd80629b8b2305221795ac104890355ebc
gcr.io/paketo-buildpacks/go@sha256:9fe1ffc19db1d614dba2b7da7f9a072f2ae142cee0e3b79d72b24711b9c761a7
gcr.io/paketo-buildpacks/dotnet-core@sha256:db4757192910ba2bc104441b7a0dc455d0b5454ca7fa8f065ecbd0ef65b63565
gcr.io/paketo-buildpacks/php@sha256:70e756b3865670ad251057945989478365f50f7968009c8a248fa590de17b2a5
gcr.io/paketo-buildpacks/procfile@sha256:61b05ad4392efb66a61098e816099a76b95e037605ceb92bb509e137987fef1d
gcr.io/cf-build-service-public/kpack/build-init@sha256:94cdd9223310c2bbc6b9f10d17f754337d782f32ac1cd7de58d3e78746d5ab7c
gcr.io/cf-build-service-public/kpack/rebase@sha256:97f9f9c1f25b720401f18ad21a7f92ddb4ff80882333702586ac8f7d619c3e02
gcr.io/cf-build-service-public/kpack/lifecycle@sha256:fb7e0916ea429697630743b34e858c3555ddfbb5940683754dfccd3bfa446e0a
gcr.io/cf-build-service-public/kpack/completion@sha256:7b8b829ee21f6009ea9b580cad86fb2f74f28d2aa34676d4a130fb62b9fc9893
gcr.io/cf-build-service-public/kpack/controller@sha256:ec256da7e29eeecdd0821f499e754080672db8f0bc521b2fa1f13f6a75a04835
gcr.io/cf-build-service-public/kpack/webhook@sha256:ab9708d6be3f348fb89dda510525ee15f87716424a8c933f8620f074fd49b73a
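
The relocation of these images can be scripted. The following is a minimal sketch and not part of cf-for-k8s: it assumes the target naming scheme `<user>/<org>_<name>` that appears in the relocated ClusterStore later in this thread, and the helper names `relocated_name` and `relocation_commands` are hypothetical. How the kpack images were named is not shown in the thread, so applying the same scheme to them is a guess. The script only prints `docker pull`, `docker tag`, and `docker push` commands; it does not run them.

```python
from typing import List


def relocated_name(source_ref: str, hub_user: str) -> str:
    """Map a gcr.io image reference to a Docker Hub repository name.

    Assumed scheme (inferred from this thread): drop the registry host and
    the digest, then join the remaining path segments with underscores.
    """
    path = source_ref.split("@", 1)[0]   # strip the @sha256:... digest
    segments = path.split("/")[1:]       # drop the registry host (gcr.io)
    return f"{hub_user}/{'_'.join(segments)}"


def relocation_commands(source_ref: str, hub_user: str) -> List[str]:
    """Return the docker commands that copy one image to Docker Hub."""
    target = relocated_name(source_ref, hub_user)
    return [
        f"docker pull {source_ref}",
        f"docker tag {source_ref} {target}",
        f"docker push {target}",
    ]


if __name__ == "__main__":
    ref = (
        "gcr.io/paketo-buildpacks/ruby@sha256:"
        "6f765726dd17b0ba39d77d95e972c0e3893811e4450277414bd7ef58156de385"
    )
    for line in relocation_commands(ref, "gongzhao2"):
        print(line)
```

The commands have to be run from a machine that can reach both gcr.io and Docker Hub.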

Then I used kapp to deploy cf successfully.

But when I tried to push the demo application, the app's pod could not be initialized.

The app reports the following error:

OCI runtime create failed: container_linux.go:318: starting container process caused "exec: \"/cnb/lifecycle/detector\": stat /cnb/lifecycle/detector: permission denied": unknown

Expected behavior

The demo application should start successfully.

Additional context

Deploy instructions

kapp deploy -a cf -f ${TMP_DIR}/cf-for-k8s-rendered.yml -y

Cluster information

Huawei Cloud Kubernetes cluster (v1.19.8), 5 nodes with 8 vCPUs and 16 GB of memory each

CLI versions

  1. ytt --version: 0.33.0
  2. kapp --version: 0.36.0
  3. kubectl version: v1.19.8
  4. cf version: 7.2.0+be4a5ce2b.2020-12-10

cf-gitbot commented 3 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/178263358

The labels on this github issue will be updated when the story is started.

heycait commented 3 years ago

Hi @gongzhao2 can you confirm that the ClusterStore section of your rendered file is updated to point to your registry?

This part of the rendered.yml

apiVersion: kpack.io/v1alpha1
kind: ClusterStore
metadata:
  name: cf-buildpack-store
  annotations:
    kapp.k14s.io/change-group.kpack-resources: cf-for-k8s.cloudfoundry.org/kpack-resources
spec:
  sources:
  - image: gcr.io/paketo-buildpacks/ruby@sha256:6f765726dd17b0ba39d77d95e972c0e3893811e4450277414bd7ef58156de385
  - image: gcr.io/paketo-community/python@sha256:494b194a0a6d44d8e2116298038e5bb2f0a0892f4c80a372de97d211a0706dc4
  - image: gcr.io/paketo-buildpacks/java@sha256:54dba27a1fbb80c5f01238a286b7eeb2b3e06e400c14f4314350e3bdeefc267e
  - image: gcr.io/paketo-buildpacks/nodejs@sha256:59c498f9d0187311f299dcb5666db5c6c209105170bc81b4e3ef32de009d4539
  - image: gcr.io/paketo-buildpacks/go@sha256:29f3b58497cac958c18ec0646264d532de7c5b5505081b7e5f41366d009f34e3
  - image: gcr.io/paketo-buildpacks/dotnet-core@sha256:7633a3a242297c1120202d863027222150ae63db4af0a113cbf35e7a9be91bd2
  - image: gcr.io/paketo-buildpacks/php@sha256:70e756b3865670ad251057945989478365f50f7968009c8a248fa590de17b2a5
  - image: gcr.io/paketo-buildpacks/procfile@sha256:61b05ad4392efb66a61098e816099a76b95e037605ceb92bb509e137987fef1d
---
apiVersion: kpack.io/v1alpha1
kind: ClusterStack
metadata:
  name: bionic-stack
  annotations:
    kapp.k14s.io/change-group.kpack-resources: cf-for-k8s.cloudfoundry.org/kpack-resources
spec:
  id: io.buildpacks.stacks.bionic
  buildImage:
    image: index.docker.io/paketobuildpacks/build@sha256:f89d964e263e1ec513420f7071691dc9ce59e659b8e214da78eca3fdd3b68566
  runImage:
    image: index.docker.io/paketobuildpacks/run@sha256:faf38e1f928b4bc33ce3923ffd14bf812c528919acfefb94030a7a43563d8057

gongzhao2 commented 3 years ago

Yes, I can confirm that I replaced every "gcr.io" image with one from my Docker Hub registry.

apiVersion: kpack.io/v1alpha1
kind: ClusterStore
metadata:
  name: cf-buildpack-store
  annotations:
    kapp.k14s.io/change-group.kpack-resources: cf-for-k8s.cloudfoundry.org/kpack-resources
spec:
  sources:
  - image: gongzhao2/paketo-buildpacks_ruby
  - image: gongzhao2/paketo-community_python
  - image: gongzhao2/paketo-buildpacks_java
  - image: gongzhao2/paketo-buildpacks_nodejs
  - image: gongzhao2/paketo-buildpacks_go
  - image: gongzhao2/paketo-buildpacks_dotnet-core
  - image: gongzhao2/paketo-buildpacks_php
  - image: gongzhao2/paketo-buildpacks_procfile
---
apiVersion: kpack.io/v1alpha1
kind: ClusterStack
metadata:
  name: bionic-stack
  annotations:
    kapp.k14s.io/change-group.kpack-resources: cf-for-k8s.cloudfoundry.org/kpack-resources
spec:
  id: io.buildpacks.stacks.bionic
  buildImage:
    image: index.docker.io/paketobuildpacks/build@sha256:42db6f300377ccb187bea21b7c333026ad529f0385f4a8a4d7223480f2672777
  runImage:
    image: index.docker.io/paketobuildpacks/run@sha256:ae23d1e8ab33e93807ef42663ebefd431f4cabd3b270317a73564d496546dd10

I think this might be caused by a difference between Huawei Cloud's Kubernetes cluster and a native Kubernetes cluster, since I used the same rendered template to deploy cf-for-k8s in minikube successfully. I will continue investigating this issue and post an update.

heycait commented 3 years ago

Are you able to access the docker registry from the cluster?

The command that is failing, /cnb/lifecycle/detector, is part of kpack itself, so it seems like something in the kpack configuration is not quite right.

Birdrock commented 3 years ago

What revision of cf-for-k8s are you using?

I was able to use your relocated buildpack images in a GKE cluster to deploy an application and it worked fine.

I'm not sure what interplay there is between your cloud provider, the Cloud Native Buildpacks lifecycle, and the container runtime. Your error suggests that the runtime is unable to start the container. Do you know what runtime the Huawei cluster is using?

gongzhao2 commented 3 years ago

My cluster can access the Docker Hub registry. I just cloned the cf-for-k8s repository from the main branch; I think it's version v4.2.0. Regarding the runtime, Huawei uses Docker as the container runtime.

gongzhao2 commented 3 years ago

Hi there, I found that the root cause is a Huawei Cloud security enhancement related to the Kubernetes "securityContext". I checked the pod in the cf-workloads-staging namespace and found that it has the following securityContext configuration:

$ kubectl get pods -n cf-workloads-staging
NAME                                                           READY   STATUS                    RESTARTS   AGE
41bc205a-4bf2-44a9-82fc-9412812b7fd2-build-1-lqklw-build-pod   0/1     Init:ContainerCannotRun   0          118s
...
  securityContext:
    fsGroup: 1000
    runAsGroup: 1000
    runAsUser: 1000
...
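
One way to compare what the pod requests on different clusters is to read the pod's JSON (for example from `kubectl get pod <pod-name> -n cf-workloads-staging -o json`) and list the pod-level and per-container `securityContext` values. This is a minimal sketch: the function name `security_contexts`, the container names, and the sample data are hypothetical, and only the standard Pod fields `spec.securityContext`, `spec.initContainers`, and `spec.containers` are assumed.

```python
import json


def security_contexts(pod: dict) -> dict:
    """Collect pod-level and per-container securityContext settings."""
    spec = pod.get("spec", {})
    result = {"pod": spec.get("securityContext", {})}
    # The failing pod is stuck in an Init state, so init containers are
    # listed alongside the regular containers.
    for kind in ("initContainers", "containers"):
        for container in spec.get(kind, []):
            key = f"{kind}/{container['name']}"
            result[key] = container.get("securityContext", {})
    return result


if __name__ == "__main__":
    # Hypothetical, trimmed-down pod; the values mirror the output above.
    sample = json.loads("""
    {
      "spec": {
        "securityContext": {"fsGroup": 1000, "runAsGroup": 1000, "runAsUser": 1000},
        "initContainers": [{"name": "detect"}],
        "containers": [{"name": "completion"}]
      }
    }
    """)
    for scope, ctx in security_contexts(sample).items():
        print(scope, ctx)
```

An empty result for a container means it inherits the pod-level settings.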

How can I change the runAsUser or runAsGroup configuration? I can't find any related configuration in cf-for-k8s-rendered.yml. I think this pod is created by a CRD resource; how can I change the default behavior?

Birdrock commented 3 years ago

cf-for-k8s doesn't have any opinions about runAsUser or runAsGroup; the only requirement is that images run as non-root. For images built from source code, kpack and CNBs provide a numeric user ID. For Docker/OCI images, the user is responsible for ensuring that the USER directive is set to a numeric user ID. (The runtime cannot determine whether a non-numeric user maps to root, so it requires a numeric ID.)
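
The numeric-user rule can be illustrated with a small check. This is a sketch of the idea only, not the actual code of Kubernetes or any container runtime: the function name `is_numeric_non_root` is hypothetical, and it assumes the `USER` value has the form `user` or `user:group`.

```python
def is_numeric_non_root(user_directive: str) -> bool:
    """Return True if an image's USER value is a numeric, non-zero UID.

    A name such as "app" is rejected because nothing outside the image can
    prove that it does not map to UID 0.
    """
    user = user_directive.split(":", 1)[0].strip()  # ignore an optional group
    return user.isascii() and user.isdigit() and int(user) != 0


if __name__ == "__main__":
    for value in ("1000", "1000:1000", "0", "root", "app", ""):
        print(repr(value), is_numeric_non_root(value))
```

Only the first two values pass: "0" is root, "root" and "app" are not numeric, and an empty USER gives nothing to verify.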

Workload configuration is set by Eirini: Workloads & Jobs.

What does the describe status for the pod say? kubectl describe pod -n cf-workloads <pod-name>

gongzhao2 commented 3 years ago

Hi, I'm closing this issue since I successfully deployed cf-for-k8s on a native Kubernetes cluster. The root cause is that Huawei Cloud has a security enhancement that requires me to run Docker containers as root. Thank you, everyone!