deis / controller

Deis Workflow Controller (API)
https://deis.com
MIT License
41 stars 53 forks source link

Deploying from outside the app's repo with deis pull -a xxx destroys custom process types #1062

Closed felipejfc closed 7 years ago

felipejfc commented 7 years ago

I think that this is caused by deis not detecting the app's Procfile, right? But is deleting the custom process types really the desired behavior here? Shouldn't it be better to read the Procfile from inside the image being deployed for example? Or just leaving the processes there, not deleting them if Procfile is not found but continue adding them whenever there is a Procfile within the path...

helgi commented 7 years ago

Was the application a git push app originally (with a Procfile) and then you are doing a deis pull on top of that? Or was it always a deis pull app?

felipejfc commented 7 years ago

It has mixed deploy types, the first one was a git push, later, our Jenkins made a Deis Pull

kmala commented 7 years ago

@felipejfc you can specify a procfile with custom proctypes during a deis pull either through argument and keeping it in the directory of the pull.

deis pull --help
Creates a new build of an application. Imports an <image> and deploys it to Deis
as a new release. If a Procfile is present in the current directory, it will be used
as the default process types for this application.

Usage: deis builds:create <image> [options]

Arguments:
  <image>
    A fully-qualified docker image, either from Docker Hub (e.g. deis/example-go:latest)
    or from an in-house registry (e.g. myregistry.example.com:5000/example-go:latest).
    This image must include the tag.

Options:
  -a --app=<app>
    The uniquely identifiable name for the application.
  -p --procfile=<procfile>
    A YAML string used to supply a Procfile to the application.

Reading from inside the image is not going to be accepted as that would playing with the image which not many/any users want. If you want to always add or just leave the proctypes as it is, how can one delete existing proctypes.

felipejfc commented 7 years ago

@kmala, maybe with deis ps:delete proctype... I agree that getting from inside the Docker image is not a good option but deleting proctypes stealthily this way can cause confusion... It happened with a dev on my team this week.

helgi commented 7 years ago

@felipejfc Think the big issue here is mixing deployment types - Not something we'd recommend if you have multiple process types.

What happened in your scenario is how Deis has always worked. For deis pull you have to provide the yaml string of the Procfile to keep the behaviour you are mentioning above.

felipejfc commented 7 years ago

Maybe just a warning that deploying will delete proctypes would be sufficient...

bacongobbler commented 7 years ago

I touched on the intended workflow of deis pull in https://github.com/deis/controller/issues/1094#issuecomment-250603486, so this is working as intended. However some documentation around this wouldn't hurt.

felipejfc commented 7 years ago

Today my team faced the same issue of one people deploying with deis pull outside of the app's folder and because of that we were running our app without one of its components for a week... I do think this is a serious bug and we should at least warn the user that process types will be deleted when one pushes without being in Procfile's folder or specifying one.

helgi commented 7 years ago

We may want to consider preventing a deis pull or git push when the app has been deployed one way or the other first. It is rather artificial but could do the trick. We'd have to keep track of how the build was started

In the past we could have looked at the Push model but that got removed... hmm - @bacongobbler do you think that'd work out in addition to docs so @felipejfc and others don't have the accidental scenario happening still

kmala commented 7 years ago

@helgi i don't think it help in this case as i assume from @felipejfc comments that they are doing a deis pull all the times but sometimes within the app folder which has procfile and sometimes from another folder which doesn't have procfile

helgi commented 7 years ago

Yeah, however in @felipejfc earlier comments he said the (original) app was started with git push and then Jenkins did pull. In that scenario we'd restrict the app to git push only - It's not a clean solution tho :(

felipejfc commented 7 years ago

I also don't like this solution (restrict the app to git push only)... I don't think the problem here is with mixed deployments (git push or deis pull), what happened like 3 times already was different people deploying without being in app's folder and using deis pull... then some of the process types are deleted by deis. I think the problem here is that one assume the deployment succeded because of deis telling so but some process types was deleted in background without even warning the user... I don't think docs would solve this either :/

helgi commented 7 years ago

@felipejfc I understand why you are frustrated by this but that's how Deis has always worked, and I believe Heroku. With documentation.

What we are coming up against here is the notion that the Procfile can add/update/remove process oppose to only do an add/update operation.

To prevent your scenario then we'd have to change the whole system so that when a user wants to remove a process type they have to do a deis scale --type foo 0 before/after their deploy, regardless of them describing the intent by removing the foo process from the Procfile. @felipejfc I would be curious how you would want that scenario to work yet still support people accidentally leaving out the -p on deis pull or not being in the right directory when doing the deploy?

That's a rather large behavioural change in Deis - That's why I am trying to propose alternatives.

felipejfc commented 7 years ago

@helgi thanks for the attention :) Can't we just warn the user? like... Hey, you are using deis pull without specifying a Procfile and Deis could not detect any in the current path, are you sure you want to proceed? This may delete some of your custom process types... [y/n]

This message may only be placed when deis detects that the user has more process types than just cmd for example...

helgi commented 7 years ago

As far as the CLI goes, I personally never use a Procfile with Deis - That question would drive me nuts :) Could potentially look and see if we have extra process types, as you said, beside cmd or web and go from there - Means the CLI has to do a double query.

The communication between the API and the CLI is not that fluid during an operation such as Deploy. The API will return a 200 and the CLI will go "okay, done!" - So far we have not had a scenario of doing a 200 with a custom message. We could reject the deploy on the basis of removal but that's again changing behaviour.

I only mention the above as an illustration that if the CLI is doing such double checking then that's a CLI feature - Not something we have available as a "oh no!" kind of warning coming from the API directly. Other SDKs would run up against this. And that might be just fine?

felipejfc commented 7 years ago

I understand and I do think that making it a CLI feature is not the ideal solution here, but I do think it is better than nothing, right? I really don't see any other options besides making a double query in the CLI or rejecting the deploy in API level. But one thing I do know, only making some docs on this will not prevent 100% of the people from making this same mistake, me and 3 others of my team already did it because some of our deployments are automatic (made by Jenkins) and some others are manual... the manual ones are error prone. From our experience, you only learn about this issue when it harms you.

heynemann commented 7 years ago

Hi @helgi,

First of all, I'd like to thank the whole team behind Deis. We are enjoying it tremendously and it's helping make us more agile.

That said, I'm very concerned about the current behavior in Deis. I wholeheartedly disagree with the statement that removing the line from the Procfile and not having a Procfile at all are the same scenario.

I think the problem boils down to three different scenarios:

The point we have to make here is that all the above scenarios perform destructive operations (unless the procfile only did have one command). In my experience, the old saying pays every time: it's better to be safe than sorry.

In order for Deis to be more reliable, I believe it just needs to be more straightforward with the users when it does not know what to do. If a deploy was made with a Procfile and the next one is not, then just let the user know: No Procfile was found and the API reports that the previous deployment was made with a Procfile. Please use --no-procfile if you removed it.

This way, if it the user really intended to remove the Procfile, they will just retry with --no-procfile and from that point forward no message will be given (after all, there will not be a Procfile in the previous deployment, right?).

Don't get me wrong, I'm all for docs explaining what the system does. I try to get the most complete docs I can in every open source project I maintain. The point here is that every system should be secure by default and be made less secure intentionally by users (AWS is a good example). Right now, the behavior Deis is exhibiting is not the most secure possible.

If you do need help with this issue, let us know what we can do.

bacongobbler commented 7 years ago

Right now, the behavior Deis is exhibiting is not the most secure possible.

I'm not sure how this has to do with security, but I see your point.

So to summarize your concerns, if one deploys an app via git push with a web and a worker process type, then another user deploys to the same application another image using deis pull without any process types, the client should pull down the current app structure, verify the current structure against the proposed structure, display a warning that the process types differ and ask for confirmation. The validation (read: the second API request) and prompt can be overridden with a --yes or -y flag. Does that sound fair to you?

bacongobbler commented 7 years ago

In order for Deis to be more reliable, I believe it just needs to be more straightforward with the users when it does not know what to do. If a deploy was made with a Procfile and the next one is not, then just let the user know: No Procfile was found and the API reports that the previous deployment was made with a Procfile. Please use --no-procfile if you removed it.

This sounds like a good idea at first, but we cannot do this in a backwards-compatible fashion. We cannot refuse the incoming request with No Procfile was found and the API reports ... because we allow users to deploy images via git push with differing process types between releases and this would break that precedent. However, making a client-side validation for users using both git push and deis pull sounds like a reasonable compromise.

@helgi has also mentioned his reservations against API-level changes (and I agree), but a client-side validation that does a double query for a deis pull sounds more reasonable in the meantime. That way non-client workflows still work the same and still require a single request to the controller.

If you're willing to take a crack at a PR, feel free to write one up!

heynemann commented 7 years ago

It does sound fair enough as it alleviates our scenario.

When I say it's not the most secure possible I do not mean as in someone can take control of my app or something like that. I mean secure as in safe. It means that I will not take down a portion of my app by misunderstanding of Deis internals.

I can definitely see the flow of git pushing a Procfile with additional targets. What I meant is that a confirmation should be required whenever a destructive action is detected (removing a target). The issue arises whenever the user is pushing a Procfile without a specific target. I do not have a good answer here, I think. I still feel the safer side is best, since a destructive operation might be very bad in specific scenarios.

What do you say of a specific configuration in Deis that disallows destructive actions without confirmation? Then the admin of the Deis farm can set that flag to true if they so wish. This way, Deis would be backwards compatible (flag defaults to false) and still enable us to "harden" it a little bit.

I'm proposing this, because I feel if this is not at the API level, it might be abused anyway.

If that is something acceptable, I can definitely take a crack at it.

helgi commented 7 years ago

One solution would be to add a --keep-proc-types (name TBD) and it would allow your use case as long as you guys remembered to add that flag... That's the best compromise I can come up with that involves the API in the solution.

That way you'd simply have to use the flag or use a deis pull --keep-proc-types alias for everyone. I haven't investigated yet about being able to use the deis CLI config file to set defaults like that... We could potentially take it as far as people setting the behaviour in .deis/config or similar but not promises on that one until I've looked into it further.

I do realise that by default this still keeps the less forgiving and more destructive behaviour.

Thoughts @felipejfc @heynemann ? This is a slightly different thing than what @heynemann just proposed (I still think that'd be a worthwhile second proposal overall - Worth opening up a proposal ticket?)

heynemann commented 7 years ago

Yep, what I meant is that if it's not something that we can turn on (or off depending on your perspective) from the admin side of things, we'd still have issues with it.

Right now we are in the process of rolling out Deis for something like 50 devs. It's going to make it harder and more error prone if we have to remind everyone to always use a flag when doing deis pull.

Don't get me wrong, it is doable and we'll do it. I just feel that we have an opportunity here to improve the platform. If the team maintaining Deis disagrees, I'll respect your opinion and we can close this ticket.

If you feel I should open a new issue with the proposal for the API changes, just let me know and I'll try to come up with something that works for everyone while still being backwards-compatible.

Thanks for your time in replying here.

felipejfc commented 7 years ago

I do realise that by default this still keeps the less forgiving and more destructive behaviour.

Hi @helgi , I agree with you and I also agree with what @heynemann said before:

The point here is that every system should be secure by default and be made less secure intentionally by users (AWS is a good example)

And I think that just adding a non-default flag will not make it "secure" by default...

So to summarize your concerns, if one deploys an app via git push with a web and a worker process type, then another user deploys to the same application another image using deis pull without any process types, the client should pull down the current app structure, verify the current structure against the proposed structure, display a warning that the process types differ and ask for confirmation. The validation (read: the second API request) and prompt can be overridden with a --yes or -y flag. Does that sound fair to you?

Yes, this sounds very fair!

helgi commented 7 years ago

@heynemann A proposal around the destructive things would be fine - We do ask people (on the CLI) when destroying resources (app, certs, domains, etc) for confirmation already but that's not something enforced in the API as needing a confirmation.

If you want to try to find a generally backwards compatible way to achieve non-destructive process type actions without --keep-proc-typesflag or a configuration for the CLI (as in neither is enforced in the API)

Just so we are on the same page, we are being cautious since the behaviour that is biting you is a behaviour other users may depend on, so we are trying to keep things backwards compatible during the 2.x cycle - mixing git push and deis pull is rather unique in this all :)

felipejfc commented 7 years ago

@helgi you do not need to mix git push with deis pull to hit the issue, I think that if my first deploy is from inside the project folder with a deis pull (deis-cli detects the procfile) and a second deploy is made also with deis pull but from outside the project folder I would still hit this scenario (deleted process types), right?

bacongobbler commented 7 years ago

mixing git push and deis pull is rather unique in this all :)

To add onto this, I think the question on our minds is how come you are using two separate workflows for the same application. That was never the original intention of providing multiple ways to deploy an app. Either

  1. you deploy using a buildpack
  2. you deploy using a dockerfile
  3. you deploy a docker image

Intermingling the different workflows was never really intended as part of Workflow. Originally, we only provided git push as the only workflow for deploying apps, then added Dockerfile functionality to the builder. deis pull was intended to be used for the CI/CD workflow. Each deployment type has their own unique workflow, either by using a Procfile in the app or by providing a list of process types in --procfile.

I think that if my first deploy is from inside the project folder with a deis pull (deis-cli detects the procfile) and a second deploy is made also with deis pull but from outside the project folder I would still hit this scenario (deleted process types), right?

If reading from a local Procfile on local disk via deis pull is the issue you're facing, then perhaps we should deprecate that functionality altogether and require users to use deis pull --procfile $(cat Procfile) deis/example-go instead.

felipejfc commented 7 years ago

If reading from a local Procfile on local disk via deis pull is the issue you're facing, then perhaps we should deprecate that functionality altogether and require users to use deis pull --procfile $(cat Procfile) deis/example-go instead.

The issue we are facing is that when the cli does not detect the procfile it defaults to deleting custom process types... This may help to solve the problem but also makes harder to deploy with deis pull in the cases that I do not need a Procfile at all (just letting cmd run the default image cmd is enough)... What do you think about that @bacongobbler?

heynemann commented 7 years ago

@bacongobbler I'm all for requiring a --procfile if the previous deployment was made with a Procfile. That would prevent our issue altogether.

Just to clarify what exactly is the scenario that is biting us.

1) Developer A wants to deploy his app to Deis. As we use containers for all our apps, he deploys with deis pull. Dev A deploys in the folder that contains the Procfile (or uses --procfile) and we get both worker and web targets in Deis. 2) After some time, Developer B want to deploy a new version of the app. She makes her changes, waits for the CI to finish building the docker image in our private repository. Once it is done and deployed to staging, she gets the new version and decides to do a deis pull to production with that version. As it is already working in staging, she just fires a new terminal window and does a deis pull. 3) Deis silently removes both targets web and worker and replaces them with cmd that defaults to whatever entrypoint the docker image has. After the deployment is done, it tells her that everything is fine.

Just by reviewing this workflow in my head I can see that we could easily improve a lot (even if we do not fix the current behavior) just by returning the new targets with their allocated units:

Creating build... ...o  (cropped output for clarity)
done

Targets before this version:
* web (2 units)
* worker (2 units)

Targets after this version:
* cmd (1 unit)

What do you guys think about this? We could even give the same output in the CLI and confirm:

Targets before this version:
* web (2 units)
* worker (2 units)

Targets after this version:
* cmd (1 unit)

The targets differ between version 5 and version 4. Are you sure you want to continue? y/n
bacongobbler commented 7 years ago

The problem with that approach returns us to the double query problem. That is, we need to first fetch the app's procfile structure, compare it to what's proposed then push the new build after confirmation. For properly formed deis pull commands (i.e. one that supply --procfile, have a Procfile in the local directory or don't supply a Procfile at all) that's one request too many compared to the current workflow. Supplying a --yes could circumvent this problem but it's still an optimization issue.

Just by reviewing this workflow in my head I can see that we could easily improve a lot (even if we do not fix the current behavior) just by returning the new targets with their allocated units:

Sure, we could display the application's process types after a push. Would that work for you?

bacongobbler commented 7 years ago

The reason we are against most of these changes are because the main reason this came up is due to user error. While it is true that using deis pull outside of the intended directory is a destructive action (and I understand the frustration), all of the solutions provided so far requires changes or sub-optimal solutions to users that do not have this problem because either

a) they explicitly use --procfile every deployment, which then they don't rely on being in the right subdirectory b) CI deploys the image for them in the same workspace every time, eliminating the user error problem c) they use a different workflow that embeds the Procfile as part of their workflow

In other words, any solution we write is going to be a win for you guys (no more destructive actions), but it's a loss for everyone else (extra API requests, breaking changes requiring confirmation before deploying, etc). That's why we're more prone to lean on documentation or suggest a different workflow to your org for this issue.

heynemann commented 7 years ago

I agree 100% @bacongobbler and that's why I want to provide the strict mode as a configuration that the Deis farm admin can enable. We'd be using it to ensure that people can't perform destructive actions without confirmation. Deis default could be the way it works right now.

If that's not something you see going forward, I would be very happy if the API returned the targets before and after the deployment so that the CLI could tell that to the user. Highlighting that the targets changed should be enough to tell the user that something's amiss (unless they expect the targets to change).

What would be a workflow that would work with containers? What this means is that we want the exact same container image that's running on staging to be deployed to production. If you guys could shed some light on that we'd appreciate it immensely.

What do you think?

bacongobbler commented 7 years ago

What would be a workflow that would work with containers? What this means is that we want the exact same container image that's running on staging to be deployed to production.

Assuming all your images are on a private registry after your changes are merged into HEAD and you have CI infrastructure set up, you can maintain two separate apps called myapp and myapp-staging. myapp-staging is then a direct mirror of whatever's in HEAD via CI running deis pull at the end of the job that's run when merged into master. Once you're happy with what's in HEAD, you can run a job that pulls that image into production.

CI isn't necessary here, but it helps with the automation from merging the commit(s) into HEAD to your private registry, and finally your staging environment. For an example, you may want to take a look at our own Jenkins job flow. We're not deploying to a Workflow cluster (though we have our own clusters for in-house apps like Slack/Github bots), but we are pushing our staging and release bits to quay.io.

I agree 100% @bacongobbler and that's why I want to provide the strict mode as a configuration that the Deis farm admin can enable. We'd be using it to ensure that people can't perform destructive actions without confirmation. Deis default could be the way it works right now.

As long as we get it documented on how it works, where it's useful and writing the necessary tests I'm all for a PR for an opt-in server-side feature flag.

If that's not something you see going forward, I would be very happy if the API returned the targets before and after the deployment so that the CLI could tell that to the user. Highlighting that the targets changed should be enough to tell the user that something's amiss (unless they expect the targets to change).

This would also be an acceptable PR.

helgi commented 7 years ago

Please see https://github.com/deis/controller/pull/1099 - It is only a proposed solution or at least a partial solution. It only tackles the notion of giving Deis operators (admins) and the developers the ability to turn off the removal of processes when Procfile is missing - It does not try to cover other (potentially) destructive behaviours and so on.

@heynemann This could potentially serve as the basis of future work you may want to do?

helgi commented 7 years ago

I'm going to call this closed as #1099 got merged - Anyone involved here, please reopen this ticket if we didn't cover concerns