shipwright-io / build

Shipwright - a framework for building container images on Kubernetes
https://shipwright.io
Apache License 2.0

Troubleshooting: running baseline Buildpacks-v3 build on openshift #350

Closed jaideepr97 closed 4 years ago

jaideepr97 commented 4 years ago

Hey folks! I'm super new to the build-api operator, and I was poking around with it to understand it better. Specifically, I'm trying to run a simple buildpacks-v3 build on OpenShift by following the steps outlined here.

It creates a builder pod with a bunch of init and normal containers, but what I'm seeing is that they all terminate very quickly (most with exit code 0). The pod moves to the Completed state, but each container individually shows as failed when clicked on, and no image is pushed to the internal registry. I also don't see any logs or errors during runtime, so I'm not sure how to go about debugging the issue. It's likely that I'm just missing something, but I would appreciate any input on this! Please let me know if you need any other information from me.

Steps I followed:

build.yaml:

apiVersion: build.dev/v1alpha1
kind: Build
metadata:
  name: buildpack-nodejs-build
spec:
  source:
    url: https://github.com/sclorg/nodejs-ex
  strategy:
    name: buildpacks-v3
    kind: ClusterBuildStrategy
  output:
    image: image-registry.openshift-image-registry.svc:5000/shipwright-test/buildpacks

buildRun.yaml:

apiVersion: build.dev/v1alpha1
kind: BuildRun
metadata:
  name: buildpack-nodejs-buildrun
spec:
  buildRef:
    name: buildpack-nodejs-build
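
For reference, this is how the two manifests get applied (a minimal sketch, assuming they are saved as build.yaml and buildRun.yaml and applied to the current namespace):

kubectl apply -f build.yaml
kubectl apply -f buildRun.yaml

# then watch the resulting resources
kubectl get build buildpack-nodejs-build
kubectl get buildrun buildpack-nodejs-buildrun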
jaideepr97 commented 4 years ago

Output of oc describe pod here, if it helps.

cc @sbose78

sbose78 commented 4 years ago

Does the buildrun object have a status with anything descriptive?

jaideepr97 commented 4 years ago

Does the buildrun object have a status with anything descriptive?

Status just shows completed. I've attached the oc describe output as well, but I don't think I saw any errors in there either.
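
For reference, the full status and conditions can be inspected with something like this (a sketch, using the BuildRun name from above):

kubectl get buildrun buildpack-nodejs-buildrun -o yaml
kubectl describe buildrun buildpack-nodejs-buildrun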

sbose78 commented 4 years ago

Great, so the build was successful and was pushed to the internal registry. Were you able to use this?

jaideepr97 commented 4 years ago

Great, so the build was successful and was pushed to the internal registry. Were you able to use this?

I don't think it was actually successful. There's no image stream getting created at the end of pod execution.

sbose78 commented 4 years ago

Are you running this on OpenShift?

Note, this is a Kubernetes-native project, so an imagestream wouldn't get created. Rather, an image would get pushed to the internal registry without necessarily having a representation as an imagestream.

Can you try deploying the image using the image registry URL?
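
For instance, something like this sketch (the deployment name is hypothetical; the image URL is the one from the Build above):

kubectl create deployment buildpacks-test --image=image-registry.openshift-image-registry.svc:5000/shipwright-test/buildpacks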

sbose78 commented 4 years ago

Or, if you already had an imagestream and used the equivalent fully qualified image URL as its name, your imagestream would be usable after the build.
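
A minimal sketch of that, using the namespace and image name from the Build above; a push to the matching internal-registry repository should then show up as a tag on this imagestream:

oc -n shipwright-test create imagestream buildpacks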

jaideepr97 commented 4 years ago

I tried creating an image stream beforehand, but there were no images pushed to it that I can see. Trying to use image-registry.openshift-image-registry.svc:5000/shipwright-test/buildpacks directly from the console doesn't work either, because the console expects an imagestream tag if the image is internal, and doesn't recognize this URL if you put it under external.

qu1queee commented 4 years ago

@jaideepr97 can you provide the output of the following commands:

kubectl -n test-build get br
kubectl -n test-build get tr

where br is the short name for "buildrun" and tr stands for "taskrun". This should surface a more descriptive error. I followed your steps, and what I found was the following:

$ k -n test-build get br
NAME                        SUCCEEDED   REASON   STARTTIME   COMPLETIONTIME
buildpack-nodejs-buildrun   False       "step-step-export" exited with code 246 (image: "gcr.io/paketo-buildpacks/builder@sha256:d2aa288ecf6f2a5e1b76f4e2b5aa660d9a53fae45687f1a66492f63bdbee3b1a"); for logs run: kubectl -n test-build logs buildpack-nodejs-buildrun-2kwq7-pod-5sz78 -c step-step-export
                            3m9s        2m18s
$ k -n test-build get tr
NAME                              SUCCEEDED   REASON   STARTTIME   COMPLETIONTIME
buildpack-nodejs-buildrun-2kwq7   False       Failed   3m11s       2m20s
$ kubectl -n test-build logs buildpack-nodejs-buildrun-2kwq7-pod-5sz78 -c step-step-export
Warning: Warning: analyzed TOML file not found at 'analyzed.toml'
Adding layer 'paketo-buildpacks/node-engine:node'
Adding layer 'paketo-buildpacks/npm:modules'
Adding 1/1 app layer(s)
Adding layer 'launcher'
Adding layer 'config'
Adding label 'io.buildpacks.lifecycle.metadata'
Adding label 'io.buildpacks.build.metadata'
Adding label 'io.buildpacks.project.metadata'
*** Images (sha256:293ed26ee408ed94b69c87eb149b75945d94a006e69331e352c8650750c5f486):
      image-registry.openshift-image-registry.svc:5000/build-examples/taxi-app - Get "https://image-registry.openshift-image-registry.svc:5000/v2/": dial tcp: lookup image-registry.openshift-image-registry.svc on 172.21.0.10:53: no such host
ERROR: failed to export: failed to write image to the following tags: [image-registry.openshift-image-registry.svc:5000/build-examples/taxi-app: Get "https://image-registry.openshift-image-registry.svc:5000/v2/": dial tcp: lookup image-registry.openshift-image-registry.svc on 172.21.0.10:53: no such host]

So this is a problem related to the registry. You might want to use another registry endpoint.

sbose78 commented 4 years ago

You might want to use another registry endpoint.

Could you link to some docs on how to do that, please?

qu1queee commented 4 years ago

@sbose78 let me check; if there aren't any, I might write them.

qu1queee commented 4 years ago

@jaideepr97 if you want to use a registry like Docker Hub, you will need to generate a secret. To do this, please follow these steps:

  1. Create the secret in the namespace where you apply Builds/BuildRuns:
kubectl create secret docker-registry <SECRET_NAME> --docker-server=https://index.docker.io/v1/ --docker-username=<YOUR_DOCKER_HUB_USER> --docker-password=<YOUR_DOCKER_HUB_PASSWORD> --docker-email=me@here.com
  2. Reference the secret and the Docker Hub repository, as follows:
---
apiVersion: build.dev/v1alpha1
kind: Build
metadata:
  name: buildpack-nodejs-build
  annotations:
    build.build.dev/build-run-deletion: "false"
spec:
  source:
    url: https://github.com/sclorg/nodejs-ex
  strategy:
    name: buildpacks-v3
    kind: ClusterBuildStrategy
  output:
    image: docker.io/<DOCKERHUB_REPOSITORY>/mx:latest
    credentials:
      name: <SECRET_NAME>

I tried this now; you can see my new image was built and pushed to https://hub.docker.com/repository/docker/eeeoo/mx
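
To verify the push from outside the cluster, something like this works (using the same placeholder repository name as above):

docker pull docker.io/<DOCKERHUB_REPOSITORY>/mx:latest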

jaideepr97 commented 4 years ago

@qu1queee Thanks for the feedback

➜  buildpacks-test kubectl -n buildpacks-test get br  
NAME                        SUCCEEDED   REASON                                                                                                                                                                                                                                                                     STARTTIME   COMPLETIONTIME
buildpack-nodejs-buildrun   False       "step-step-export" exited with code 246 (image: "gcr.io/paketo-buildpacks/builder@sha256:e9f52ef41a60c6668bcfdecb78e6c0fa2f10edd8ad2ca617b64c317cfe10860c"); for logs run: kubectl -n buildpacks-test logs buildpack-nodejs-buildrun-kzj65-pod-rk7qd -c step-step-export   11m         9m21s
➜  buildpacks-test kubectl -n buildpacks-test get tr
NAME                              SUCCEEDED   REASON   STARTTIME   COMPLETIONTIME
buildpack-nodejs-buildrun-kzj65   False       Failed   11m         9m27s
➜  buildpacks-test kubectl -n buildpacks-test logs buildpack-nodejs-buildrun-kzj65-pod-rk7qd -c step-step-export      
Warning: Warning: analyzed TOML file not found at 'analyzed.toml'
Adding layer 'paketo-buildpacks/node-engine:node'
Adding layer 'paketo-buildpacks/npm:modules'
Adding 1/1 app layer(s)
Adding layer 'launcher'
Adding layer 'config'
Adding label 'io.buildpacks.lifecycle.metadata'
Adding label 'io.buildpacks.build.metadata'
Adding label 'io.buildpacks.project.metadata'
*** Images (sha256:c982252df98c9a5dd5110fac7a525eb9313995e5a4c53d9d5fa84f8b27d675e1):
      image-registry.openshift-image-registry.svc:5000/buildpacks-test/buildpacks - Get "https://image-registry.openshift-image-registry.svc:5000/v2/": x509: certificate signed by unknown authority
ERROR: failed to export: failed to write image to the following tags: [image-registry.openshift-image-registry.svc:5000/buildpacks-test/buildpacks: Get "https://image-registry.openshift-image-registry.svc:5000/v2/": x509: certificate signed by unknown authority]

It does seem to be failing at the export stage for me as well, but it looks like a certificate trust issue. If I understand it correctly, the build-api operator internally uses a pipeline token (I assume this is from a Tekton object) to try to push to the OpenShift internal registry; could this be where the issue is? @sbose78

jaideepr97 commented 4 years ago

@jaideepr97 if you want to use a registry like Docker Hub, you will need to generate a secret. [...]

Yess, I'd tried this earlier, but I was getting some kind of string error when I tried to reference my created secret. However, I just tried it again and it worked perfectly. Thanks a lot @qu1queee!

qu1queee commented 4 years ago

Regarding

If I understand it correctly, the build-api operator internally uses a pipeline token (I assume this is from a Tekton object) to try to push to the OpenShift internal registry; could this be where the issue is?

I don't have an OpenShift cluster running to test the above. As long as you don't use https://github.com/shipwright-io/build/blob/master/samples/buildrun/buildrun_buildpacks-v3_cr.yaml#L11 to autogenerate the SA, the controller will look for the pipeline SA. I understand the pipeline SA should have everything it needs to talk to the OpenShift registry.
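
For reference, the linked sample amounts to a BuildRun like this sketch (per the v1alpha1 API used above; generate: true asks the controller to create a dedicated service account instead of falling back to the pipeline SA):

apiVersion: build.dev/v1alpha1
kind: BuildRun
metadata:
  name: buildpack-nodejs-buildrun
spec:
  buildRef:
    name: buildpack-nodejs-build
  serviceAccount:
    generate: true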

cmoulliard commented 1 year ago

Even if we create the docker-registry secret to access the local Docker registry and pass it to the SA used by the BuildRun, the following step will fail, as the CA certificate is not mounted into the pod:

step-build-and-push
...
4 of 11 buildpacks participating
paketo-buildpacks/ca-certificates 3.6.1
paketo-buildpacks/node-engine     1.5.0
paketo-buildpacks/npm-install     1.1.0
paketo-buildpacks/node-start      1.0.7
===> ANALYZING
Warning: Platform requested deprecated API '0.4'
ERROR: failed to initialize analyzer: getting previous image: connect to repo store "kind-registry:5000/snowdrop/sample-nodejs:latest": Get "https://kind-registry:5000/v2/": x509: certificate signed by unknown authority

Any idea how to provide such a CA certificate so that the host kind-registry is trusted? @otaviof

otaviof commented 1 year ago

Any idea how to provide such a CA certificate so that the host kind-registry is trusted? @otaviof

@cmoulliard 👋, I think we need to use Tekton's SSL_CERT_DIR environment variable to set the location where the additional CA certificates are mounted.

OpenShift ImageStreams rely on setting the label config.openshift.io/inject-trusted-cabundle="true" to mount the additional certificates, which are later picked up by the ca-certificates Buildpack via the SSL_CERT_DIR environment variable.
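
A sketch of that mechanism (the ConfigMap name is hypothetical): OpenShift injects the cluster's trusted CA bundle into a ConfigMap carrying this label, under the key ca-bundle.crt, which can then be mounted into the build pod:

apiVersion: v1
kind: ConfigMap
metadata:
  name: trusted-ca
  labels:
    config.openshift.io/inject-trusted-cabundle: "true"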

However, for a KinD internal Container Registry I'd recommend using plain HTTP protocol, which would avoid having extra settings for local development.
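
For the KinD case, the standard local-registry setup runs the registry without TLS; a sketch of the cluster config, following the pattern from the kind docs, where the containerd patch lets the nodes reach kind-registry:5000 over plain HTTP:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."kind-registry:5000"]
    endpoint = ["http://kind-registry:5000"]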

cmoulliard commented 1 year ago

I think we need to use Tekton's SSL_CERT_DIR environment variable to set the location where the additional CA certificates are mounted.

Is this supported by Shipwright, or will it require a change to set some env vars within the BuildRun CR?

cmoulliard commented 1 year ago

However, for a KinD internal Container Registry I'd recommend using plain HTTP protocol, which would avoid having extra settings for local development.

This problem is not specific to the KinD cluster; it occurs whenever we use a private/secured container registry (a Docker registry, a Docker registry running as a pod, Harbor, JFrog Container Registry, etc.) where HTTPS with a self-signed certificate is used.

cmoulliard commented 1 year ago

@cmoulliard 👋, I think we need to use Tekton's SSL_CERT_DIR environment variable to set the location where the additional CA certificates are mounted.

I patched the ConfigMap config-registry-cert in the tekton-pipelines namespace with the self-signed certificate:

data:
  cert: |
    -----BEGIN CERTIFICATE-----
    MIIGnjCCBIagAwIBAgIJAKH/GcLRUMghMA0GCSqGSIb3DQEBCwUAMGoxCzAJBgNV
    ...
    ovmhMFiet2HN06cck3K3h9Xc
    -----END CERTIFICATE-----
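
A sketch of one way to apply that patch, assuming the self-signed CA is in a local file named ca.crt:

kubectl -n tekton-pipelines create configmap config-registry-cert \
  --from-file=cert=ca.crt --dry-run=client -o yaml | kubectl apply -f -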

but the buildpack step still generates:

===> ANALYZING
Warning: Platform requested deprecated API '0.4'
ERROR: failed to initialize analyzer: getting previous image: connect to repo store "kind-registry:5000/snowdrop/sample-nodejs:latest": Get "https://kind-registry:5000/v2/": x509: certificate signed by unknown authority

This is certainly due to the fact that the certificate is not mounted as a volume in the TaskRun:

  - args:
    - -wait_file
    - /tekton/run/1/out
    - -post_file
    - /tekton/run/2/out
    - -termination_path
    - /tekton/termination
    - -step_metadata_dir
    - /tekton/run/2/status
    - -docker-config=registry-creds
    - -results
    - shp-image-digest,shp-image-size,shp-error-message,shp-error-reason,shp-source-default-commit-sha,shp-source-default-commit-author,shp-source-default-branch-name
    - -entrypoint
    - /bin/bash
    - --
    - -c
    - "set -euo pipefail\n\necho \"> Processing environment variables...\"\nENV_DIR=\"/platform/env\"\n\nenvs=($(env))\n\n#
      Denying the creation of non required files from system environments.\n# The
      creation of a file named PATH (corresponding to PATH system environment)\n#
      caused failure for python source during pip install (https://github.com/Azure-Samples/python-docs-hello-world)\nblock_list=(\"PATH\"
      \"HOSTNAME\" \"PWD\" \"_\" \"SHLVL\" \"HOME\" \"\")\n\nfor env in \"${envs[@]}\";
      do\n  blocked=false\n\n  IFS='=' read -r key value string <<< \"$env\"\n\n  for
      str in \"${block_list[@]}\"; do\n    if [[ \"$key\" == \"$str\" ]]; then\n      blocked=true\n
      \     break\n    fi\n  done\n\n  if [ \"$blocked\" == \"false\" ]; then\n    path=\"${ENV_DIR}/${key}\"\n
      \   echo -n \"$value\" > \"$path\"\n  fi\ndone\n\nLAYERS_DIR=/tmp/layers\nCACHE_DIR=/tmp/cache\n\nmkdir
      \"$CACHE_DIR\" \"$LAYERS_DIR\"\n\nfunction anounce_phase {\n  printf \"===>
      %s\\n\" \"$1\" \n}\n\nanounce_phase \"DETECTING\"\n/cnb/lifecycle/detector -app=\"${PARAM_SOURCE_CONTEXT}\"
      -layers=\"$LAYERS_DIR\"\n\nanounce_phase \"ANALYZING\"\n/cnb/lifecycle/analyzer
      -layers=\"$LAYERS_DIR\" -cache-dir=\"$CACHE_DIR\" \"${PARAM_OUTPUT_IMAGE}\"\n\nanounce_phase
      \"RESTORING\"\n/cnb/lifecycle/restorer -cache-dir=\"$CACHE_DIR\"\n\nanounce_phase
      \"BUILDING\"\n/cnb/lifecycle/builder -app=\"${PARAM_SOURCE_CONTEXT}\" -layers=\"$LAYERS_DIR\"\n\nexporter_args=(
      -layers=\"$LAYERS_DIR\" -report=/tmp/report.toml -cache-dir=\"$CACHE_DIR\" -app=\"${PARAM_SOURCE_CONTEXT}\")\ngrep
      -q \"buildpack-default-process-type\" \"$LAYERS_DIR/config/metadata.toml\" ||
      exporter_args+=( -process-type web ) \n\nanounce_phase \"EXPORTING\"\n/cnb/lifecycle/exporter
      \"${exporter_args[@]}\" \"${PARAM_OUTPUT_IMAGE}\"\n\n# Store the image digest\ngrep
      digest /tmp/report.toml | tr -d ' \\\"\\n' | sed s/digest=// > \"/tekton/results/shp-image-digest\"\n"
    command:
    - /tekton/bin/entrypoint
    env:
    - name: CNB_PLATFORM_API
      value: "0.4"
    - name: PARAM_SOURCE_CONTEXT
      value: /workspace/source/source-build
    - name: PARAM_OUTPUT_IMAGE
      value: kind-registry:5000/snowdrop/sample-nodejs:latest
    image: kind-registry:5000/paketobuildpacks/builder:base
    imagePullPolicy: Always
    name: step-build-and-push
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 250m
        memory: 65Mi
    securityContext:
      runAsGroup: 1000
      runAsUser: 1000
    terminationMessagePath: /tekton/termination
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /platform/env
      name: platform-env
    - mountPath: /workspace/source
      name: ws-t6zhm
    - mountPath: /tekton/creds
      name: tekton-creds-init-home-2
    - mountPath: /tekton/run/0
      name: tekton-internal-run-0
      readOnly: true
    - mountPath: /tekton/run/1
      name: tekton-internal-run-1
      readOnly: true
    - mountPath: /tekton/run/2
      name: tekton-internal-run-2
    - mountPath: /tekton/bin
      name: tekton-internal-bin
      readOnly: true
    - mountPath: /workspace
      name: tekton-internal-workspace
    - mountPath: /tekton/home
      name: tekton-internal-home
    - mountPath: /tekton/results
      name: tekton-internal-results
    - mountPath: /tekton/steps
      name: tekton-internal-steps
      readOnly: true
    - mountPath: /tekton/creds-secrets/registry-creds
      name: tekton-internal-secret-volume-registry-creds-65ljg
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-vhm2j
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: registry-creds

cmoulliard commented 1 year ago

Here is the part of the ClusterBuildStrategy that I changed:

https://github.com/redhat-buildpacks/testing/blob/main/k8s/shipwright/clusterbuildstrategy.yml#L447-L519

There is nevertheless still an issue, as I continue to get the x509 error :-(

The pack build command works very well:

pack build kind-registry:5000/demo:0.0.1-SNAPSHOT --env SERVICE_BINDING_ROOT=/platform/bindings --volume $PWD/k8s/binding/ca-certificates/:/platform/bindings/my-certificates  --builder=docker.io/paketobuildpacks/builder:base --path quarkus-petclinic
...
[exporter] Saving kind-registry:5000/demo:0.0.1-SNAPSHOT...
[exporter] *** Images (b714643ebe54):
[exporter]       kind-registry:5000/demo:0.0.1-SNAPSHOT
[exporter] Adding cache layer 'paketo-buildpacks/bellsoft-liberica:jdk'
[exporter] Adding cache layer 'paketo-buildpacks/syft:syft'
[exporter] Adding cache layer 'paketo-buildpacks/maven:application'
[exporter] Adding cache layer 'paketo-buildpacks/maven:cache'
[exporter] Adding cache layer 'buildpacksio/lifecycle:cache.sbom'
Successfully built image kind-registry:5000/demo:0.0.1-SNAPSHOT
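
For context, the --volume flag above mounts a service binding directory that the Paketo ca-certificates buildpack consumes. Its expected layout is roughly the following (the certificate file name is illustrative; only the content of the type file is fixed by the binding convention):

k8s/binding/ca-certificates/
├── type      # contains the string "ca-certificates"
└── ca.crt    # the self-signed CA to trust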