openshift / jenkins


default agents using registry.redhat.io #823

Closed Gl4di4torRr closed 5 years ago

Gl4di4torRr commented 5 years ago

I just deployed the latest v3.11 version of Jenkins on OpenShift and ran a quick pipeline for testing. Looks like the default agents are now using registry.redhat.io. Now my agents are failing to spin up because this requires a login to the container catalog. Looking at the docs, it seems like they should be pointing to access.redhat.com. Am I missing something or could this be a gap in the docs vs agents? Thanks!

Gl4di4torRr commented 5 years ago

I found https://access.redhat.com/RegistryAuthentication but doesn't this break the out-of-the-box functionality of jenkins? Maybe we should update the docs to use registry.redhat.io?

gabemontero commented 5 years ago

Yeah, we recently updated the readme for this (see https://github.com/openshift/cluster-samples-operator/blob/master/pkg/apis/samples/v1/types.go#L403-L405), but only in the 4.0 context there.

But the quick FYI is registry.access.redhat.com is going away in the pretty near future.

I honestly had totally forgotten about https://github.com/openshift/jenkins/pull/665 from last summer

@bparees - I forget now, were there concerns about folks needing creds when we executed https://github.com/openshift/jenkins/pull/665? ... or did we just miss updating the docs to explain that?

Also, switching that ref to quay.io was not considered, since it wasn't available yet and neither was the jenkins based UBI image (though that stuff of course is not available for 3.x, just 4.x) ... running the UBI with subscriptions/entitlements may provide some wiggle room (though I am not optimistic).

Minimally @Gl4di4torRr, yes, some more updates to the doc seem warranted, unless @bparees and I sort out some kind of accommodation for 3.x re: the UBI and which image we reference in the config for the sample agent defs.

@openshift/sig-developer-experience @waveywaves fyi

bparees commented 5 years ago

But the quick FYI is registry.access.redhat.com is going away in the pretty near future

registry.access.redhat.com will probably be around for a long time, but it will not get rhel8 based images.

@bparees - I forget now, were there concerns about folks needing creds when we executed #665? ... or did we just miss updating the docs to explain that?

the node is supposed to have creds, and k8s will use those creds when spinning up pods. The 3.11 install/upgrade is supposed to enforce that you provide those creds so they can be put on your nodes.

Looking at the docs, it seems like they should be pointing to access.redhat.com. Am I missing something or could this be a gap in the docs vs agents? Thanks!

they are correct to point to registry.redhat.io (the docs are wrong/outdated); the issue here is that your nodes (the container runtime on your nodes) seem to be missing the creds for registry.redhat.io.

gabemontero commented 5 years ago

@Gl4di4torRr - just to confirm, are you running the docker.io based jenkins-2 image on a centos based cluster without subscriptions? Or is this a rhel based cluster that is missing creds, as @bparees noted (perhaps pointing to an install/upgrade problem)?

And agreed @bparees, as I noted earlier, the README needs more fixes wrt this subject. Aside from a general explanation, the doc around the NODEJS_SLAVE_IMAGE is wrong, as there is no choice in the code between the docker.io image and the redhat registry one.

And thanks for the clarification of registry.access.redhat.com's life expectancy and the rhel8 nuance to this.

btw @Gl4di4torRr you could use that env var to switch the image to whatever you wanted.
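
As a rough illustration of the override mentioned above, the sketch below builds an `oc set env` command that sets NODEJS_SLAVE_IMAGE on the Jenkins deployment config. The deployment config name `jenkins` and the image reference are placeholders, not values taken from this thread, and the script only prints the command so it can be reviewed before anything is run against a cluster.

```shell
#!/bin/sh
# Hypothetical values: adjust the deployment config name and image for your cluster.
DC_NAME="jenkins"
AGENT_IMAGE="registry.example.com/my-org/nodejs-agent:latest"

# Build the command as a string and print it instead of executing it,
# so nothing touches the cluster until it has been checked.
override_cmd="oc set env dc/${DC_NAME} NODEJS_SLAVE_IMAGE=${AGENT_IMAGE}"
echo "${override_cmd}"
```

Since the sample pod templates are only generated on first start against a given volume, an override like this presumably needs a fresh JENKINS_HOME to take effect.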

Gl4di4torRr commented 5 years ago

@gabemontero @bparees I had a conversation with @etsauer and one of my issues is that I am running the 3.11 Jenkins on a 3.10 cluster. Yes, we have a RHEL subscription. This has come about because I am forced internally to remediate Jenkins plugin vulnerabilities which has caused me to upgrade Jenkins from 3.10 to 3.11. Running a quick test pipeline, I immediately hit this issue. I think the docs can be updated here, but just for your awareness, most consumers are probably not on 3.11 and if they rely on Red Hat for container health, they may hit this as well. This probably doesn't really concern you. Just letting you know.

gabemontero commented 5 years ago

ok thanks for the clarification @Gl4di4torRr

the 3.11 image on earlier clusters will become more of the norm given our stance on only addressing jenkins security advisories in the 3.11 stream

I'm going to reopen this to drive the doc update

gabemontero commented 5 years ago

Actually @Gl4di4torRr @bparees I might have a solution for the 3.11 image where, if running in an openshift pod, it can determine pretty easily that it is running on a pre 3.11 cluster, and switch from registry.redhat.io to registry.access.redhat.com accordingly, to support our "running the 3.11 image on older clusters to address jenkins security advisories" scenario.

Once I wrap up some more trials and unit test a little I'll have a PR up.

Some readme updates, with this nuance, are still appropriate.

Also, my claim in https://github.com/openshift/jenkins/issues/823#issuecomment-475707887 about the NODEJS_SLAVE_IMAGE doc being wrong is not entirely correct. There is logic in 3.11 (though not 4.0) for picking between docker.io and registry.* based on the centos vs. rhel distinction.

Gl4di4torRr commented 5 years ago

@gabemontero thanks for the info. I'll be looking out for the PR to see the changes. Feel free to hit me up if you want some extra testing on my end.

gabemontero commented 5 years ago

sounds good @Gl4di4torRr .... wrt testing, for the rhel based images, with the various evolutions internally within openshift, I won't be able to get you a rhel based image until after this PR merges. But I should be able to get you a pre-release version of a 3.11.x openshift/jenkins rhel image for you to try if you like.

In my internal testing, I'm building the centos image but am forcing it down the new path as if it was a rhel image.

gabemontero commented 5 years ago

also note, while the change in the startup scripts will only go into 3.11, I plan on making README updates to both openshift-3.11 and master branches.

bparees commented 5 years ago

it can determine pretty easily that it is running on a pre 3.11 cluster, and switch from registry.redhat.io to registry.access.redhat.com accordingly, to support our "running the 3.11 image on older clusters to address jenkins security advisories" scenario.

where are you going to intercept and make this change? Just in the default configuration setup?

My concern is what if someone on a pre 3.11 cluster explicitly wants to use registry.redhat.io? (especially in the future when some rhel8 image is only available there)? As long as you're only changing the initial config defaulting that's ok.

gabemontero commented 5 years ago

yep only changing the initial config defaulting @bparees

gabemontero commented 5 years ago

it is a few levels deep, but ultimately the call to generate the k8s plugin config is gated by if [ ! -e ${JENKINS_HOME}/configured ]; up top in the s2i run script
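
A minimal sketch of that kind of first-run guard, assuming the marker is a plain file named `configured` under JENKINS_HOME as in the test quoted above. `generate_default_config` is a hypothetical stand-in for whatever the real run script calls, and a temporary directory plays the role of JENKINS_HOME so the demo is safe to run anywhere.

```shell
#!/bin/sh
# Hypothetical stand-in for the step that writes the initial k8s plugin config.
generate_default_config() {
  echo "generating default config"
}

# Run the defaulting step only if the marker file is absent, then create it.
first_run_setup() {
  home_dir="$1"
  if [ ! -e "${home_dir}/configured" ]; then
    generate_default_config
    # Leave the marker so later starts skip the defaulting step.
    touch "${home_dir}/configured"
  else
    echo "already configured, skipping"
  fi
}

# Demo: a temporary directory stands in for JENKINS_HOME.
demo_home="$(mktemp -d)"
first_run_setup "${demo_home}"   # first start: generates the config
first_run_setup "${demo_home}"   # second start: skips
```

With a guard of this shape, an explicit registry choice made after the first start is left alone, which is the behavior asked about above.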

Gl4di4torRr commented 5 years ago

@gabemontero testing out your changes. I am used to having a build config and a deployment config in whatever the repo is to test out the changes. What are your thoughts on changing the openshift directory to openshift/build and openshift/deploy so it's easy to build the image on a cluster and deploy that image? Or is that getting out of scope for this project? Thanks!

gabemontero commented 5 years ago

post your proposed build config yaml @Gl4di4torRr and I'll comment on that.

fyi ... checked this AM and there is not a 3.11 RHEL image based on these changes yet in our internal system.

Gl4di4torRr commented 5 years ago

@gabemontero here is my build config. I created a namespace called chrisbolton and built against the openshift-3.11 branch and had to update the jenkins-ephemeral.json to point to my namespace and tag latest instead of 2.

{
  "apiVersion": "v1",
  "kind": "Template",
  "metadata": {
    "annotations": {
      "openshift.io/display-name": "Jenkins (Ephemeral)",
      "description": "Jenkins service, without persistent storage.\n\nWARNING: Any data stored will be lost upon pod destruction. Only use this template for testing.",
      "iconClass": "icon-jenkins",
      "tags": "instant-app,jenkins",
      "openshift.io/long-description": "This template deploys a Jenkins server capable of managing OpenShift Pipeline builds and supporting OpenShift-based oauth login.  The Jenkins configuration is stored in non-persistent storage, so this configuration should be used for experimental purposes only.",
      "openshift.io/provider-display-name": "Red Hat, Inc.",
      "openshift.io/documentation-url": "https://docs.openshift.org/latest/using_images/other_images/jenkins.html",
      "openshift.io/support-url": "https://access.redhat.com"
    },
    "name": "jenkins"
  },
  "labels": {
    "app": "jenkins",
    "template": "jenkins-build-template"
  },
  "objects": [
    {
      "apiVersion": "v1",
      "kind": "ImageStream",
      "metadata": {
        "labels": {
          "app": "jenkins"
        },
        "name": "jenkins",
        "namespace": "${NAMESPACE}"
      }
    },
    {
      "kind": "RoleBinding",
      "apiVersion": "v1",
      "metadata": {
          "name": "${JENKINS_SERVICE_NAME}_edit"
      },
      "groupNames": null,
      "subjects": [
          {
              "kind": "ServiceAccount",
              "name": "${JENKINS_SERVICE_NAME}"
          }
      ],
      "roleRef": {
          "name": "edit"
      }
    },
    {
      "apiVersion": "v1",
      "kind": "BuildConfig",
      "metadata": {
        "labels": {
          "app": "jenkins"
        },
        "name": "jenkins",
        "namespace": "${NAMESPACE}"
      },
      "spec": {
        "output": {
          "to": {
            "kind": "ImageStreamTag",
            "name": "${JENKINS_IMAGE_STREAM_TAG}",
            "namespace": "${NAMESPACE}"
          }
        },
        "resources": {
          "limits": {
            "memory": "${MEMORY_LIMIT}"
          }
        },
        "source": {
          "contextDir": "${CONTEXT_DIR}",
          "git": {
            "uri": "${SOURCE_REPOSITORY_URL}",
            "ref": "${SOURCE_REPOSITORY_REF}"
          },
          "secrets": [

          ],
          "type": "Git"
        },
        "strategy": {
          "dockerStrategy": {
            "dockerfilePath": "Dockerfile"
          },
          "type": "Source"
        },
        "triggers": [
          {
            "type": "ConfigChange"
          },
          {
            "imageChange": {
            },
            "type": "ImageChange"
          }
        ]
      },
      "status": {
        "lastVersion": 0
      }
    }
  ],
  "parameters": [
    {
      "name": "JENKINS_SERVICE_NAME",
      "displayName": "Jenkins Service Name",
      "description": "The name of the OpenShift Service exposed for the Jenkins container.",
      "value": "jenkins"
    },
    {
      "description": "Git source URI for Jenkins",
      "name": "SOURCE_REPOSITORY_URL",
      "required": true,
      "value": "https://github.com/openshift/jenkins.git"
    },
    {
      "description": "Git branch/tag reference",
      "name": "SOURCE_REPOSITORY_REF",
      "value": "openshift-3.11"
    },
    {
      "description": "Git branch/tag reference",
      "name": "CONTEXT_DIR",
      "value": "2"
    },
    {
      "description": "Maximum amount of memory the container can use.",
      "displayName": "Memory Limit",
      "name": "MEMORY_LIMIT",
      "value": "2Gi"
    },
    {
      "description": "Name of the ImageStreamTag to be used for the Jenkins image.",
      "displayName": "Jenkins ImageStreamTag",
      "name": "JENKINS_IMAGE_STREAM_TAG",
      "value": "jenkins:latest"
    },
    {
      "description": null,
      "displayName": "namespace",
      "name": "NAMESPACE",
      "required": true,
      "value": "chrisbolton"
    }
  ]
}

Gl4di4torRr commented 5 years ago

@gabemontero when building the Dockerfile.rhel7, I am getting

Error: Package: 1:java-1.8.0-openjdk-headless-1.8.0.201.b09-0.el7_6.x86_64 (rhel-7-server-rpms)
           Requires: pcsc-lite-devel(x86-64)

Maybe this relates to this commit being back ported? https://github.com/openshift/jenkins/commit/11f1fd5c489f4e93d1fc0b3764c09b5759402e73

I can also just wait for y'all to build a rhel image so i'm not annoying :)

gabemontero commented 5 years ago

Hey @Gl4di4torRr ... first, that commit you noted is 4.0 only and is unrelated

Next, the way your bc is constructed, it actually is going to access the Dockerfile and not the Dockerfile.rhel7.

Change "dockerfilePath": "Dockerfile" to "dockerfilePath": "Dockerfile.rhel7"

Then, assuming your builds are running on an entitled node, you should be able to build a rhel7 based image I would think.

If that doesn't work, then yeah wait until I can give you access to our internally built image.

Gl4di4torRr commented 5 years ago

@gabemontero do you know where this channel comes from: yum-config-manager --enable rhel-7-server-ose-onlineint-rpms? My guess is that I don't have access to all the packages needed to build on a rhel7 vm.

gabemontero commented 5 years ago

ah yeah that is an internal one I think ... don't think I can point you to that

yeah, either fork our repo, modify the 3.11 Dockerfile.rhel7 to disable those repos like the Dockerfile.localdev does in our master branch, and try again with your BC pointing to your forked repo,

or just wait until I can provide an internally built image :-)

Gl4di4torRr commented 5 years ago

@gabemontero gotcha! So I think my comment above about having a build config for this repo won't work :( I'll just wait for the rhel image and test then :) Thanks!

gabemontero commented 5 years ago

quick update: still don't have a new 3.11 build since Mar 25 (the change merged Mar 27)

older releases have been higher in the queue the last few days

Gl4di4torRr commented 5 years ago

@gabemontero j/w do you have an external image registry where I can see the latest development rhel images?

gabemontero commented 5 years ago

Sorry @Gl4di4torRr it is internal only ... I'm going to have to pull it, then push it to my docker.io account in order for you to get at it

and it is looking more and more like we won't get a new image until the current 3.11 errata content gets out (which it has not yet)

Gl4di4torRr commented 5 years ago

@gabemontero do you have an image now?

gabemontero commented 5 years ago

good timing @Gl4di4torRr ... just looked and we got one last night

I've pushed it to docker.io/gmontero/jenkins-311-rhel-for-chris-bolton-to-try:latest

let me know the results you have

Gl4di4torRr commented 5 years ago

@gabemontero okay, I ran your image and went through your PR. Here is my output from inside the container.

sh-4.2$ oc version
oc v3.11.106
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://private-ip:443
openshift v3.10.127
kubernetes v1.10.0+b81c8f8
sh-4.2$ export version=$(oc version)
sh-4.2$ export countForInPodCheck=`echo $versions | awk -F"v3" '{print NF-1}'`
sh-4.2$ export countForV311Check=`echo $versions | awk -F"v3.11" '{print NF-1}'`
sh-4.2$ echo $version
oc v3.11.106 kubernetes v1.11.0+d4cacc0 features: Basic-Auth GSSAPI Kerberos SPNEGO Server https://private-ip:443 openshift v3.10.127 kubernetes v1.10.0+b81c8f8
sh-4.2$ echo $countForInPodCheck
-1
sh-4.2$ echo $countForV311Check
-1

Looking at the config.xml:

sh-4.2$ cat config.xml | grep maven
          <name>maven</name>
          <label>maven</label>
              <image>registry.redhat.io/openshift3/jenkins-agent-maven-35-rhel7:v3.11</image>

Personally, I was expecting this to be the registry.access.redhat.com registry. Are you expecting me to change the oc client version?

gabemontero commented 5 years ago

@Gl4di4torRr hmmmmm ... the values of countForInPodCheck and countForV311Check are not what I would have expected .... -1 vs. 2 ... and not in line with my testing (albeit it was with either centos or UBI without subscriptions)

Did you run the image on a node with subscriptions? ... perhaps that explains the difference in behavior ?

I'll have to start experimenting.

To your question, in general, we recommend keeping the oc version and cluster version in sync, but I was trying to allow the 3.11/3.10 sort of combo.
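
One detail in the transcript above may be worth ruling out: the output was exported as `version`, while the two awk lines read `$versions`. If `versions` was unset in that shell, awk would have received an empty line, and `NF-1` on an empty record is -1. The sketch below reproduces both cases with a trimmed copy of the version string from the transcript; `count_with_awk` is a name made up for this example. It is offered as an observation, not a confirmed diagnosis, and it does not rule out a platform difference in awk.

```shell
#!/bin/sh
# Count occurrences of "v3" by splitting on it: number of fields minus one.
count_with_awk() {
  echo "$1" | awk -F"v3" '{print NF-1}'
}

# Trimmed copy of the version output shown in the transcript.
sample="oc v3.11.106 kubernetes v1.11.0+d4cacc0 Server https://private-ip:443 openshift v3.10.127 kubernetes v1.10.0+b81c8f8"

count_with_awk "$sample"   # prints 2
count_with_awk ""          # prints -1, because an empty record has NF=0
```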

gabemontero commented 5 years ago

yeah I brought the image up and I've confirmed that echo $versions | awk -F"v3.11" '{print NF-1}' works differently on fedora/centos than it does on rhel

swell

iterating on new bash now

gabemontero commented 5 years ago

Looks like these forms work: echo $versions | grep -o "v3" | wc -l and echo $versions | grep -o "v3.11" | wc -l

@Gl4di4torRr - could you redo your experiment and use those forms instead of the awk form, and let me know what you get ?

I'm trying them on my local OKD clusters now

gabemontero commented 5 years ago

Yep they seemed to work for me on both my 3.10/3.11 clusters.

I've updated my image for you @Gl4di4torRr with the change to echo $versions | grep -o "v3" | wc -l etc.

Let me know if you can give it a go ... thanks
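
For reference, the grep-based counting can be sketched as a small function, together with one way the two counts might drive the registry default. The decision rule here is an assumption made for illustration and is not taken from the PR: two "v3" hits are read as "client and server both reported", and fewer than two "v3.11" hits as "the server is older than 3.11". `count_matches` and `choose_registry` are hypothetical names, and the second sample string is made up.

```shell
#!/bin/sh
# Count non-overlapping matches of a fixed string in the given text.
# -F treats the pattern literally, so the '.' in "v3.11" is not a wildcard.
count_matches() {
  printf '%s\n' "$1" | grep -o -F -- "$2" | wc -l | tr -d ' '
}

# Assumed rule (not from the PR): fall back to the older registry only when
# both versions were reported and the server is not a 3.11 cluster.
choose_registry() {
  versions="$1"
  in_pod=$(count_matches "$versions" "v3")
  v311=$(count_matches "$versions" "v3.11")
  if [ "$in_pod" -eq 2 ] && [ "$v311" -lt 2 ]; then
    echo "registry.access.redhat.com"
  else
    echo "registry.redhat.io"
  fi
}

# 3.11 client against a 3.10 server (versions as in the thread).
choose_registry "oc v3.11.106 Server https://10.26.0.1:443 openshift v3.10.127"
# 3.11 client against a 3.11 server (made-up server version).
choose_registry "oc v3.11.106 Server https://10.26.0.1:443 openshift v3.11.0"
```

Inside a pod, the input would be the output of `oc version`, passed in quotes so that whitespace handling does not depend on the shell.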

Gl4di4torRr commented 5 years ago

@gabemontero well your logic is right but it still set the registry.redhat.io image. Let me do a little more digging and get back to you.

sh-4.2$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
sh-4.2$ versions=`oc version`
sh-4.2$ echo $versions
oc v3.11.106 kubernetes v1.11.0+d4cacc0 features: Basic-Auth GSSAPI Kerberos SPNEGO Server https://10.26.0.1:443 openshift v3.10.127 kubernetes v1.10.0+b81c8f8
sh-4.2$ countForInPodCheck=`echo $versions | grep -o "v3" | wc -l`
sh-4.2$ echo $countForInPodCheck
2
sh-4.2$ countForV311Check=`echo $versions | grep -o "v3.11" | wc -l`
sh-4.2$ echo $countForV311Check
1
sh-4.2$ cat /var/lib/jenkins/config.xml | grep maven
          <name>maven</name>
          <label>maven</label>
              <image>registry.redhat.io/openshift3/jenkins-agent-maven-35-rhel7:v3.11</image>

Gl4di4torRr commented 5 years ago

@gabemontero okay, i just tested this on our second lab cluster.

I still did not get the expected results:

sh-4.2$ cat /var/lib/jenkins/config.xml | grep maven
          <name>maven</name>
          <label>maven</label>
              <image>registry.redhat.io/openshift3/jenkins-agent-maven-35-rhel7:v3.11</image>

I ensured your latest commit made it into the container I am running:

sh-4.2$ cat kube-slave-common.sh | grep "grep -o"
  countForInPodCheck=`echo $versions | grep -o "v3" | wc -l`
  countForV311Check=`echo $versions | grep -o "v3.11" | wc -l`

However, I ran the same commands as you inside the terminal, and got the expected results.

sh-4.2$ versions=`oc version`
sh-4.2$ echo $versions
oc v3.11.106 kubernetes v1.11.0+d4cacc0 features: Basic-Auth GSSAPI Kerberos SPNEGO Server https://10.26.0.1:443 openshift v3.10.119 kubernetes v1.10.0+b81c8f8
sh-4.2$ echo "OpenShift client and server versions are ${versions}"
OpenShift client and server versions are oc v3.11.106
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://10.26.0.1:443
openshift v3.10.119
kubernetes v1.10.0+b81c8f8
sh-4.2$ countForInPodCheck=`echo $versions | grep -o "v3" | wc -l`
sh-4.2$ echo ${countForInPodCheck}
2

I tried looking through your shell scripts for a few minutes. Could one of your other default configs be overriding this change?

gabemontero commented 5 years ago

It is working correctly for me @Gl4di4torRr on both my 3.10 and 3.11 clusters.

Is it possible that you are testing on a persistent volume?

The way the image works, it only attempts to configure those sample k8s plugin pod templates the first time it comes up on a given volume. It then sets a file as a marker to check on subsequent starts.

So if it was set up with the problematic use of awk, it won't retry with the new logic.

If you are using PVs, see if you can recycle them. Or maybe try short term using the template that leverages ephemeral storage.

gabemontero commented 5 years ago

sorry revert my last comment ... my 3.10 test IS still showing registry.redhat.io

I'm investigating

gabemontero commented 5 years ago

had a bad env variable check @Gl4di4torRr

try docker.io/gmontero/jenkins-311-test:latest (I was having trouble pulling that other image ... plus we had a 3.11 rebuild here recently)

will be updating https://github.com/openshift/jenkins/pull/839 shortly

gabemontero commented 5 years ago

still consider the PV point I made earlier if those are in play for you @Gl4di4torRr

Gl4di4torRr commented 5 years ago

@gabemontero yeah, when testing I basically make sure I do an oc delete all --all and oc delete pvc jenkins if I am using a persistent jenkins. I'll give your new image a go now.

Gl4di4torRr commented 5 years ago

@gabemontero

Success!!!

I am pretty sure this is an open source success story :) even though I am not allowed to represent my company in the open source world lol.

$ cat /var/lib/jenkins/config.xml | grep maven
          <name>maven</name>
          <label>maven</label>
              <image>registry.access.redhat.com/openshift3/jenkins-agent-maven-35-rhel7:v3.11</image>

gabemontero commented 5 years ago

Great news @Gl4di4torRr

And a covert "absolutely" on your characterization :-)
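
For anyone repeating the check done throughout this thread, a small helper along these lines could list which registries the generated pod templates point at, rather than reading grep output by eye. `list_image_registries` is a hypothetical name, and the demo runs against a temporary file holding the lines shown above instead of a live config.xml.

```shell
#!/bin/sh
# Hypothetical helper: print the registry host of every <image> element
# found in the given Jenkins config.xml, one host per line, de-duplicated.
list_image_registries() {
  sed -n 's|.*<image>\([^/<]*\)/.*</image>.*|\1|p' "$1" | sort -u
}

# Demo against a temporary file holding the lines shown in the thread.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
          <name>maven</name>
          <label>maven</label>
              <image>registry.access.redhat.com/openshift3/jenkins-agent-maven-35-rhel7:v3.11</image>
EOF

list_image_registries "$demo_config"
```

Inside the Jenkins pod the same function would be pointed at /var/lib/jenkins/config.xml.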