openshift / origin

Conformance test suite for OpenShift
http://www.openshift.org
Apache License 2.0
8.49k stars 4.7k forks source link

Failed to resolve ImageStreamTag in deployment #6934

Closed mfojtik closed 8 years ago

mfojtik commented 8 years ago

Reproduce on latest master:

$ oc new-project stage
$ oc create -f https://raw.githubusercontent.com/arilivigni/openshift-ci-pipeline/master/sample-app/sample-app-template.json
$ oc start-build sample-app --wait --follow
$ oc deploy sample-app --latest

What I see, is this error in the Pod event log:

Failed to pull image "sample-app:latest": Error: image library/sample-app:latest not found

Additionally:

$ oc describe istag/sample-app:latest
Name:       sha256:79bd15f334544d39d726c1b5e178085bb48425a9a7d4dfbf2dc4d376cb192e24
Created:    12 minutes ago
Docker Image:   172.30.167.193:5000/stage/sample-app@sha256:79bd15f334544d39d726c1b5e178085bb48425a9a7d4dfbf2dc4d376cb192e24
...

$ oc describe dc/sample-app
Name:       sample-app
Created:    15 minutes ago
Labels:     app=sample-app
Annotations:    openshift.io/generated-by=OpenShiftNewApp
Latest Version: 2
Triggers:   Config, Image(sample-app@latest, auto=false)
Strategy:   Rolling
Template:
  Selector: app=sample-app,deploymentconfig=sample-app
  Replicas: 1
  Containers:
  NAME      IMAGE           ENV
  sample-app    sample-app:latest
mfojtik commented 8 years ago

OK, this is strange but when I deleted that project and created it from scratch, it actually deployed automatically after build even that auto=false...

EDIT: After I removed the project again and re-created it from the original steps, I get the IST not resolved again...

mfojtik commented 8 years ago

So when I set automatic:false it will not resolve the imagestreamtag... That seems to be the problem here.

mfojtik commented 8 years ago

So further investigation reveals that we never update the DeploymentConfig when the "automatic:false", which means the Image in ContainerSpec is never resolved into DockerPullSpec based on the ImageStreamTag.

The code is here: https://github.com/openshift/origin/blob/master/pkg/deploy/controller/imagechange/controller.go#L44

mfojtik commented 8 years ago

@ironcladlou FYI

0xmichalis commented 8 years ago

First bug in here is in oc status. Before I manually trigger the deployment:

$ oc status
In project stage on server https://localhost:8443

svc/sample-app - 172.30.154.201:8080
  dc/sample-app deploys istag/sample-app:latest <-
    bc/sample-app builds https://github.com/mfojtik/sample-app.git with openshift/ruby:latest 
      #1 build succeeded 45 seconds ago - 85d983b: Cleanup (Michal Fojtik <mfojtik@redhat.com>)
    #1 deployment waiting on image or update

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.

It should warn me that the image is available and I just need to manually trigger since dc.spec.triggers[1].automatic == false.

0xmichalis commented 8 years ago

So when I set automatic:false it will not resolve the imagestreamtag... That seems to be the problem here.

I see the istag resolved from sample-app:latest to the full spec in the registry

$ oc describe istag sample-app:latest
Name:       sha256:21cce35a1e1974b0d3fe78d347b1b8e0f6e295f62b8fc9a28f047df31c21d123
Created:    29 minutes ago
Labels:     <none>
Annotations:    openshift.io/image.managed=true
Docker Image:   172.30.59.130:5000/stage/sample-app@sha256:21cce35a1e1974b0d3fe78d347b1b8e0f6e295f62b8fc9a28f047df31c21d123
$ oc describe dc/sample-app
Name:       sample-app
Created:    30 minutes ago
Labels:     app=sample-app
Annotations:    openshift.io/generated-by=OpenShiftNewApp
Latest Version: 2
Triggers:   Config, Image(sample-app@latest, auto=false)
Strategy:   Rolling
Template:
  Selector: app=sample-app,deploymentconfig=sample-app
  Replicas: 1
  Containers:
  NAME      IMAGE                                                       ENV
  sample-app    172.30.59.130:5000/stage/sample-app@sha256:21cce35a1e1974b0d3fe78d347b1b8e0f6e295f62b8fc9a28f047df31c21d123 
0xmichalis commented 8 years ago

@mfojtik you also have the following streams, right?

$ oc create -f https://raw.githubusercontent.com/openshift/origin/master/examples/image-streams/image-streams-centos7.json -n openshift
mfojtik commented 8 years ago

yes i do On Feb 1, 2016 4:09 PM, "Michail Kargakis" notifications@github.com wrote:

@mfojtik https://github.com/mfojtik you also have the following stream, right?

$ oc create -f https://raw.githubusercontent.com/openshift/origin/master/examples/image-streams/image-streams-centos7.json -n openshift

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-178008345.

soltysh commented 8 years ago

So when I set automatic:false it will not resolve the imagestreamtag... That seems to be the problem here.

That's exactly the root cause of your problem. Only ImageChange trigger is capable of resolving IST correctly, since in your example it's turned off the Deployment will be created from the exact PodTemplate you specified, in your case: image: sample-app:latest which won't work, obviously. This is not a bug.

mfojtik commented 8 years ago

@soltysh it depends on how you define "bug" here. If automatic=false disable IST resolving then we should at least tweak its API description:

Automatic means that the detection of a new tag value should result in a new deployment.

What I expected here is that the IST gets resolved, because I have ICT defined and I want it to become the Container->Image. What I don't want is the new tag result into a new deployment, which is what that description is saying. So I can do manual deploy when I decide to do it.

cc @bparees @smarterclayton

0xmichalis commented 8 years ago

Reproduced. I agree with Michal, I would expect the tag to be resolved (I have explicitly set an ICT) so that I can manually kick a deployment. Since the behavior is already expected I wouldn't call this a bug and would drop priority (if we agree to fix it). @ironcladlou @smarterclayton thoughts?

mfojtik commented 8 years ago

@kargakis updated.

smarterclayton commented 8 years ago

Yeah, I think this is intentional and we should clarify the language. A user can always craft a deployment config that references a real image for the first deployment (with automatic: false) if they so desire.

mfojtik commented 8 years ago

@smarterclayton @ironcladlou more thoughts about this... The problem here is that we can't resolve the IST in PodTemplate, because we don't control it. Problem is, that if you have DC and you don't want to automatically create new deployment every time the IST is updated (you want to drive this manually), you have no way to specify the IS inside PodTemplate.

So my question is (after some discussion with @soltysh) can't we resolve the IST upon deployment? We can have a "mapping" field on DC which will then be passed to deployer. Once we start deploy, deployer will resolve the IST based on the mapping and use it for the Container image. If you don't have mapping, we use the original image you specified.

This will allow us to resolve IST every time we do deploy, which allows you to change the registry host and just redeploy. It also allows to have DC without ICT and still be able to benefit from ImageStream.

soltysh commented 8 years ago

ImageStream resolving should be in sync with ImagePullPolicy for the user to have similar experience.

smarterclayton commented 8 years ago

I would expect the deployer to resolve the IST onto the deployment config prior to creating the RC. I'm ok with updating the DC with that value every time - it makes export correct, which I like. It means that triggers are just mutating the DC, which I like. The only gap today is removing the generators, which we wanted to do anyway.

On Tue, Feb 2, 2016 at 12:29 PM, Maciej Szulik notifications@github.com wrote:

ImageStream resolving should be in sync with ImagePullPolicy for the user to have similar experience.

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-178701795.

mfojtik commented 8 years ago

+1 @smarterclayton

0xmichalis commented 8 years ago

@smarterclayton fyi, upstream both generators (rollback and config) are part of the controller.

0xmichalis commented 8 years ago

More people are hitting this

smarterclayton commented 8 years ago

Once we start deploy, deployer will resolve the IST based on the mapping and use it for the Container image. If you don't have mapping, we use the original image you specified.

By deploy, we should really mean "at the time the trigger fires it should rewrite the container fields". Why aren't we doing that today?

ironcladlou commented 8 years ago

By deploy, we should really mean "at the time the trigger fires it should rewrite the container fields". Why aren't we doing that today?

That is what we do today. The image change controller resolves ISTs, rewrites the config spec, and bumps the version (if there was some change).

smarterclayton commented 8 years ago

When you have automatic=false, and someone wants to set an explicit image to deploy next, we can't rewrite the value. If a user takes control with automatic=false, we probably want to help them get "the latest images" via some sort of endpoint, but they have to assemble a full DC and send it to us, we can't do it automatically in the controller.

On Wed, Feb 17, 2016 at 1:37 PM, Dan Mace notifications@github.com wrote:

By deploy, we should really mean "at the time the trigger fires it should rewrite the container fields". Why aren't we doing that today?

That is what we do today. The image change controller resolves ISTs, rewrites the config spec, and bumps the version (if there was some change).

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-185341144.

ironcladlou commented 8 years ago

To synthesize all I've seen here so far, what about something like:

type DeploymentConfigSpec struct {
  // <other fields omitted>
  ContainerImageMappings []ContainerImageMapping
}

type ContainerImageMapping struct {
  Automatic bool
  From ObjectReference // an ImageStream
  ContainerNames []string
}

// DeploymentTriggerImageChangeParams is no longer needed

This decouples rewriting from triggers, allowing ImageStream mappings to be applied on every deployment (or not, depending on the mapping's Automatic flag). The ImageChange trigger would no longer have any configuration: when an ImageStream is updated, if a DeploymentConfig has an ImageChange trigger, re-apply any matching mappings (regardless of the mapping's Automatic setting). If the newly processed spec has changed, deploy. If you don't want triggering, delete the trigger (just like ConfigChange). This is no longer burdensome since the trigger has no configuration.

Thoughts?

bparees commented 8 years ago

What does ContainerImageMapping.Automatic signify in this case?

ironcladlou commented 8 years ago

What does ContainerImageMapping.Automatic signify in this case?

Whether the mappings will be resolved and applied when a new deployment is being created from the deploymentConfig (which could be in response to a manual version bump, or a spec change, etc).

ironcladlou commented 8 years ago

The image trigger controller would resolve mappings and apply them to the spec automatically, and then they (edit: might) be re-resolved and applied again when the deploymentConfig is processed for creating the deployment... if the ISTs changed again in between, they might get re-resolved and applied a second time (correctly, and again based on the container mapping automatic flag).

bparees commented 8 years ago

so in my head i'm thinking a few flows:

a) ICT fires, matching mapping gets updated/resolved. If DC is automatic, deployment happens. b) Deployment is manually requested. Mappings marked "automatic" get updated. deployment happens. c) Deployment is automatically caused (by what?). Mappings marked "automatic" get resolved/udpated. deployment happens.

In all cases the actual deployment uses the value defined in the mapping at that point.

I guess my only concern is with (a). In the case of Builds we ultimately separated the ICT from the Build. That is, any imagechange could trigger a build, it's just an event. ("image foo changed. build bar!"). I think maybe DC should be similar. the ICT trigger starts the deployment. The content of that deployment would then depend on the Mapping and whether the mapping was set to Automatic(re-resolve) or Manual(do not re-resolve).

The other question i'd have is what you do with an ImageStreamTag reference in a Mapping when Automatic is false? You have to resolve it at least once to get a pull spec....

smarterclayton commented 8 years ago

I don't see what the mappings get us over having a flag on the trigger that says whether clients should refresh it when triggering a manual deployment. Manual deployments are manual.

On Wed, Feb 17, 2016 at 3:19 PM, Ben Parees notifications@github.com wrote:

so in my head i'm thinking a few flows:

a) ICT fires, matching mapping gets updated/resolved. If DC is automatic, deployment happens. b) Deployment is manually requested. Mappings marked "automatic" get updated. deployment happens. c) Deployment is automatically caused (by what?). Mappings marked "automatic" get resolved/udpated. deployment happens.

In all cases the actual deployment uses the value defined in the mapping at that point.

I guess my only concern is with (a). In the case of Builds we ultimately separated the ICT from the Build. That is, any imagechange could trigger a build, it's just an event. ("image foo changed. build bar!"). I think maybe DC should be similar. the ICT trigger starts the deployment. The content of that deployment would then depend on the Mapping and whether the mapping was set to Automatic(re-resolve) or Manual(do not re-resolve).

The other question i'd have is what you do with an ImageStreamTag reference in a Mapping when Automatic is false? You have to resolve it at least once to get a pull spec....

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-185385913.

bparees commented 8 years ago

I could live with either, but abusing imagechangetriggers to say "this triggers deploys...oh and also we'll use the information in here to resolve which image to use when you do a manual deployment, if you set this other flag" feels like a hack.

imagechangetriggers should trigger things, overloading them to configure other behavior seems wrong.

bparees commented 8 years ago

and what if i always want to resolve an ImageStreamTag but i don't want to have an ICT at all? I define an ICT, then set deployment to manual, then set this other new flag to say "ResolveViaICT on deploy"?

do i still get configchangetriggers if i set deployment mode to manual?

smarterclayton commented 8 years ago

Triggers are just probably poorly named. Triggers are "Dependencies" that are leveraged by the controller to assist users in making decisions. There is "automatic" (always trigger), "refresh" (if something else changes and doesn't specify, refresh this value via a client), and "manual" (i don't need your help).

On Wed, Feb 17, 2016 at 3:38 PM, Ben Parees notifications@github.com wrote:

and what if i always want to resolve an ImageStreamTag but i don't want to have an ICT at all? I define an ICT, then set deployment to manual, then set this other new flag to say "ResolveViaICT on deploy"?

do i still get configchangetriggers if i set deployment mode to manual?

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-185393366.

bparees commented 8 years ago

@smarterclayton ok, i'll buy that.

0xmichalis commented 8 years ago

@ironcladlou do we really need new API for this? How about having the imagechange controller update the reference but not bump the version for triggering the new deployment? False for automatic is a variation of Paused. If a deployment is paused it won't trigger by updating its container.

ironcladlou commented 8 years ago

@ironcladlou do we really need new API for this? How about having the imagechange controller update the reference but not bump the version for triggering the new deployment? False for automatic is a variation of Paused. If a deployment is paused it won't trigger by updating its container.

How are references in the deploymentConfig pod spec updated during a manual deployment?

smarterclayton commented 8 years ago

So these three cases need to work:

  1. A user can tag an image stream tag and the deployment is updated with the latest values of any automatic tags (handled by deployment controller)
  2. A user on the command line can manually update to a particular latest manual image and leave the others alone (handled by client code today, but this is arguably an /instantiate endpoint)
  3. A user on the command line can easily update to the latest images for each container and then post that to the deployment config (no one does this, but this is arguably an /instantiate endpoint)

I would expect oc rollout --image to handle both 2 and 3 cleanly.

On Thu, Feb 18, 2016 at 1:08 PM, Dan Mace notifications@github.com wrote:

@ironcladlou https://github.com/ironcladlou do we really need new API for this? How about having the imagechange controller update the reference but not bump the version for triggering the new deployment? False for automatic is a variation of Paused. If a deployment is paused it won't trigger by updating its container.

How are references in the deploymentConfig pod spec updated during a manual deployment?

— Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/6934#issuecomment-185839557.

0xmichalis commented 8 years ago

How are references in the deploymentConfig pod spec updated during a manual deployment?

Same as automatic DCs are updated now, the only difference will be that lateVersion stays the same (so no new deployment runs).

0xmichalis commented 8 years ago

@ironcladlou got this working locally, will submit a poc PR shortly.