
Sidecar Containers #753

Open Joseph-Irving opened 5 years ago

Joseph-Irving commented 5 years ago

Enhancement Description

/sig node

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

mrbobbytables commented 4 years ago

Great! Thanks @Joseph-Irving I'll add the info to the tracking sheet 👍

VineethReddy02 commented 4 years ago

@Joseph-Irving

I'm one of the v1.17 docs shadows. Does this enhancement (or the work planned for v1.17) require any new docs (or modifications to existing docs)? If not, can you please update the 1.17 Enhancement Tracker Sheet (or let me know and I'll do so)?

If so, just a friendly reminder: we're looking for a PR against k/website (branch dev-1.17) due by Friday, Nov 8th; it can just be a placeholder PR at this time. Let me know if you have any questions!

Thanks!

Joseph-Irving commented 4 years ago

Hi @VineethReddy02, yeah, this will require documentation; the placeholder PR is here: https://github.com/kubernetes/website/pull/17190

Joseph-Irving commented 4 years ago

I've raised a PR to update the KEP: https://github.com/kubernetes/enhancements/pull/1344. It's based on some discussion we were having in the implementation PR, https://github.com/kubernetes/kubernetes/pull/80744.

I'd appreciate any comments

annajung commented 4 years ago

Hey @Joseph-Irving 1.17 Enhancement Shadow here! 👋 I am reaching out to check in with you to see how this enhancement is going.

The Enhancement team is currently tracking kubernetes/kubernetes#79649 and kubernetes/kubernetes#80744 in the tracking sheet. Are there any other k/k PRs that need to be tracked as well?

Also, another friendly reminder that we're quickly approaching code freeze (Nov. 14th).

Joseph-Irving commented 4 years ago

Hi @annajung, yep those are the only PRs that need tracking.

annajung commented 4 years ago

Hi @Joseph-Irving, tomorrow is code freeze for the 1.17 release cycle. It looks like the k/k PRs have not been merged, so we're flagging this enhancement as At Risk in the 1.17 tracking sheet.

Do you think all necessary PRs will be merged by EoD on the 14th (Thursday)? After that, only release-blocking issues and PRs will be allowed in the milestone, and only with an exception.

Joseph-Irving commented 4 years ago

Hi @annajung, unfortunately it's looking very unlikely that they will be merged before code freeze. We've made a lot of progress this release cycle, so hopefully we can merge them early in 1.18.

annajung commented 4 years ago

Hey @Joseph-Irving Thank you for the update. I'll update the milestone accordingly and mark this enhancement as deferred to 1.18. Thank you!

/milestone v1.18

jeremyrickard commented 4 years ago

Hey @Joseph-Irving. 1.18 Enhancements Lead here 👋 .

The 1.18 release started yesterday, so I'm reaching out to see whether you plan on landing this in the 1.18 release. I think this missed 1.17 because of code freeze. How are things looking for 1.18? I see the PRs are still open.

Thanks!

Joseph-Irving commented 4 years ago

Hi @jeremyrickard,

Yeah the plan is to get this in the 1.18 release.

The API PR (https://github.com/kubernetes/kubernetes/pull/79649) got a review from thockin the other day. He had a few points, but once those are addressed we will close that PR and incorporate its commits into https://github.com/kubernetes/kubernetes/pull/80744 so we can merge the API and implementation together.

As for the Kubelet PR (https://github.com/kubernetes/kubernetes/pull/80744), it just needs reviewing; I'm hoping we can get some sig-node bandwidth to review it this cycle.

jeremyrickard commented 4 years ago

Thanks for the update @Joseph-Irving

Added it to the tracking sheet!

rubenhak commented 4 years ago

Sorry for being late to the party. This is a significant improvement for common cases, but it does not seem to cover a more advanced case.

Consider the case of a log-exporting sidecar that also depends on the Istio sidecar. If the Istio sidecar shuts down first, some sensitive logs might not get exported.

A more generic approach would be to define explicit dependencies across containers, regardless of whether they are sidecars or not.

What do you think of such an API definition instead:

kind: Pod
spec:
  containers:
  - name: myapp
    dependsOn: ["exporter", "istio"]
  - name: exporter
    dependsOn: ["istio"]
  - name: istio

kfox1111 commented 4 years ago

@rubenhak that gets messy really quickly. What needs to be satisfied for the dependency to be considered clear to proceed? There is often a gap between started and ready that I think dependsOn would really care about, and that this API does not address.

rubenhak commented 4 years ago

@kfox1111 How does the proposed design determine that the sidecar is started and ready in order to launch the main container? The only difference I propose is that, instead of marking containers as "sidecar", we use a more generic way of defining the dependency.

I don't think dependsOn should specify the criteria; that could be specified on the container being depended on. Wouldn't readinessProbe and/or livenessProbe be sufficient? If not, there could be a startupProbe, the success of which indicates that dependent containers can be started.
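
For illustration, a minimal Go sketch of that gate, assuming a hypothetical dependsOn map (not a real Pod field); the started callback stands in for checking ContainerStatus.Started, which reflects startup-probe success:

    package sketch

    import "time"

    // startedFn reports whether a named container has passed its
    // startupProbe; in real Kubernetes this is surfaced through
    // ContainerStatus.Started.
    type startedFn func(container string) bool

    // waitForDeps blocks until every container that "name" dependsOn
    // has passed its startup probe. The dependsOn map is hypothetical.
    func waitForDeps(dependsOn map[string][]string, name string, started startedFn) {
        for _, dep := range dependsOn[name] {
            for !started(dep) {
                time.Sleep(100 * time.Millisecond) // poll until the dependency is up
            }
        }
    }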

irvifa commented 4 years ago

Hello @Joseph-Irving, I'm one of the v1.18 docs shadows. Does this enhancement (or the work planned for v1.18) require any new docs (or modifications to existing docs)? If not, can you please update the 1.18 Enhancement Tracker Sheet (or let me know and I'll do so)?

If so, just a friendly reminder: we're looking for a PR against k/website (branch dev-1.18) due by Friday, Feb 28th; it can just be a placeholder PR at this time. Let me know if you have any questions!

Joseph-Irving commented 4 years ago

Hi @irvifa I've raised a placeholder PR here https://github.com/kubernetes/website/pull/19046

irvifa commented 4 years ago

Hi @Joseph-Irving Thank you for your swift response!

rgulewich commented 4 years ago

@rubenhak - I agree with @kfox1111 that a full graph of dependencies can get pretty messy pretty quickly. What about starting the containers in the order given in the pod spec, and then tearing them down in the reverse order (like a stack)? This would be a lot simpler to implement and covers most of the common ordering use cases I can think of.
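
A rough Go sketch of that stack behaviour, with hypothetical start/stop callbacks standing in for the kubelet's real container lifecycle calls:

    package sketch

    // startAll launches containers in pod-spec order, stopping at the
    // first failure.
    func startAll(containers []string, start func(string) error) error {
        for _, c := range containers {
            if err := start(c); err != nil {
                return err
            }
        }
        return nil
    }

    // stopAll tears containers down in the reverse of start order,
    // like popping a stack.
    func stopAll(containers []string, stop func(string)) {
        for i := len(containers) - 1; i >= 0; i-- {
            stop(containers[i])
        }
    }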

rubenhak commented 4 years ago

@rgulewich, could you elaborate a bit more on what exactly can get messy? Deriving an order from a graph is a trivial task, especially considering that no sane operator would run more than 15 sidecars (already a stretch).

The idea of ordering is OK, but since most sidecars are injected by admission controllers, it would be really hard to guarantee the order is correct. There is a need for indirection.

bjhaid commented 4 years ago

@rubenhak there could be a cycle in the dependency order of the containers; how would k8s/the kubelet break the cycle and decide what order to start/stop the containers in? Thinking out loud, maybe this could be an API-side validation.

jeremyrickard commented 4 years ago

Hey @Joseph-Irving,

Just a friendly reminder that code freeze for 1.18 is March 05, 2020.

As we track toward code freeze, please list out/link to any PRs you are working on toward graduating this enhancement!

Joseph-Irving commented 4 years ago

Hey @jeremyrickard,

The PR to track is https://github.com/kubernetes/kubernetes/pull/80744. The API changes are in https://github.com/kubernetes/kubernetes/pull/79649, but its commits will be merged into the PR above once review has finished.

rubenhak commented 4 years ago

> @rubenhak there could be a cycle in the dependency order of the containers; how would k8s/the kubelet break the cycle and decide what order to start/stop the containers in? Thinking out loud, maybe this could be an API-side validation.

@bjhaid, the API side can do the validation. Cycle detection is a trivial algorithm with linear time complexity (it's essentially a DFS traversal).

Validation might also need to be rerun after sidecar injection.
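
For reference, that validation could be a three-colour DFS; a sketch, again assuming the hypothetical dependsOn field presented as a map from container name to its dependencies:

    package sketch

    // hasCycle reports whether the dependsOn graph contains a cycle,
    // in O(containers + edges) time.
    func hasCycle(deps map[string][]string) bool {
        const (white, grey, black = 0, 1, 2)
        state := map[string]int{} // unseen nodes default to white
        var visit func(string) bool
        visit = func(n string) bool {
            state[n] = grey // on the current DFS path
            for _, d := range deps[n] {
                if state[d] == grey {
                    return true // back edge: cycle found
                }
                if state[d] == white && visit(d) {
                    return true
                }
            }
            state[n] = black // fully explored
            return false
        }
        for n := range deps {
            if state[n] == white && visit(n) {
                return true
            }
        }
        return false
    }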

kfox1111 commented 4 years ago

I've been thinking about this for a while... Most of the issues with dependencies, I think, really boil down to service meshes. (Maybe someone can think of another example, though.)

The service mesh proxy is a sidecar that needs to start and become ready before everything else, and needs to exit after everything else. It is long-running, so it is more of a sidecar than an init container.

But initContainers should ideally all be able to use the service mesh too.

But initContainers may need to init other sidecar containers.

While we could design some kind of intricate dependency system involving init containers, sidecar containers, and regular containers, maybe we should just have two classes of sidecars: regular sidecars and network sidecars?

1. Network sidecars must all become ready at the very beginning; service mesh proxies go here.
2. Init containers run next, in order.
3. Sidecars all start and become ready; this can include stuff like auth proxies, loggers, etc.
4. Regular containers start and become ready.

Tear down is in reverse.

Would this eliminate the dependency issue while still solving all the issues service meshes seem to have with container ordering? I'm thinking so?
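
A sketch of the fixed ordering this would imply (class names are made up for illustration):

    package sketch

    // startupPhases is the fixed two-sidecar-class ordering described
    // above; teardown would walk the same list in reverse.
    var startupPhases = []string{
        "network-sidecars", // service mesh proxies; must all be ready first
        "init-containers",  // run next, in order
        "sidecars",         // auth proxies, loggers, etc.
        "containers",       // regular app containers last
    }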

rubenhak commented 4 years ago

@kfox1111, Vault now does secret injection using sidecars. Which class should that fit into? Also, depending on the case, Vault could depend on the service mesh, or the other way around.

All I'm saying is that such a design would eventually explode into 10 sidecar classes. Such an approach implies an even stronger opinion on how things should run. People would start hacking with classes just to achieve the order required to launch the application.

If the only purpose of those classes is to define the order, why not do that explicitly?

Answering your question: while such a design would cover some use cases, it doesn't help with other cases like Vault sidecars, logging sidecars, etc. This is already a proposal to redesign the original feature. Since this is a second attempt, it's worth making it right this time.

I don't see how dependencies are intricate; could you elaborate more on this? Dependencies make YAML definitions more obvious; there is no hidden logic. An approach with hardcoded classes would require hidden logic and a lot more documentation explaining why networking sidecars should run after other sidecars, etc.

tedyu commented 4 years ago

What if we introduce a field into Container?

    // +optional
    Priority int

This field is effective among containers of the same type (sidecar, normal). For sidecar containers, the sidecar container with higher priority would be instantiated first / torn down last.
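
A sketch of how sidecars might be ordered under this scheme; the types here are hypothetical, and only the Priority field comes from the suggestion above:

    package sketch

    import "sort"

    type container struct {
        Name     string
        Priority int // the proposed optional field; higher starts first
    }

    // startOrder returns sidecars sorted so that higher-Priority
    // containers start first; teardown would walk the result in reverse.
    func startOrder(sidecars []container) []container {
        out := append([]container(nil), sidecars...)
        sort.SliceStable(out, func(i, j int) bool {
            return out[i].Priority > out[j].Priority
        })
        return out
    }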

rubenhak commented 4 years ago

@tedyu, a dependency carries much more metadata and value than a "priority". It takes 30 lines of C++ code to produce a priority order from given dependencies (https://www.geeksforgeeks.org/find-the-ordering-of-tasks-from-given-dependencies/); the other way around is not possible.

Another benefit is that, given a dependency graph, certain containers can be started at the same time. In the following example, "A -> B, B -> C, D -> C", containers B and D can be started at the same time once C is initialized. I'm not saying the implementation has to support that, but it would be very valuable if the API at least allows such a definition.
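
A Go sketch of that idea, assuming the hypothetical dependsOn field: a Kahn-style pass that groups containers into start "waves", where everything in a wave can launch concurrently. For the example above it yields [C], then [B, D], then [A]:

    package sketch

    // startWaves turns a dependsOn graph (container -> dependencies)
    // into ordered waves: every container in a wave can start
    // concurrently once the previous wave is up. Containers never
    // emitted are part of a cycle.
    func startWaves(deps map[string][]string) [][]string {
        pending := map[string]int{}         // unmet dependency count per container
        dependents := map[string][]string{} // dependency -> containers waiting on it
        for c, ds := range deps {
            if _, ok := pending[c]; !ok {
                pending[c] = 0
            }
            for _, d := range ds {
                if _, ok := pending[d]; !ok {
                    pending[d] = 0
                }
                pending[c]++
                dependents[d] = append(dependents[d], c)
            }
        }
        var waves [][]string
        var ready []string
        for c, n := range pending {
            if n == 0 {
                ready = append(ready, c)
            }
        }
        for len(ready) > 0 {
            waves = append(waves, ready)
            var next []string
            for _, c := range ready {
                for _, w := range dependents[c] {
                    pending[w]--
                    if pending[w] == 0 {
                        next = append(next, w)
                    }
                }
            }
            ready = next
        }
        return waves
    }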

AndiDog commented 4 years ago

Integer priority won't work nicely: people will use all kinds of different, non-standardized numbers, as they do with the CSS z-index property (like -9999).

Joseph-Irving commented 4 years ago

@rubenhak What you're suggesting at this point is basically an entirely different feature from the one described in this KEP; it's not a small tweak, it's a total rewrite. It would require getting everyone who had previously agreed to this feature (it took a year to get it approved by all parties) to re-evaluate it. If you are passionate about such a feature, I would suggest that you create your own KEP and take it to the various SIGs to get their feedback.

The idea of a dependency graph was discussed at length when this proposal started back in 2018. The conclusion was unanimous: although it would enable some more use cases, those use cases were not strong enough, and the added complexity was not worth it.

I think you are somewhat underestimating the degree of change required to implement what you're describing. But if it is as simple as you seem to think, you could create your own proof of concept of this working in Kubernetes and present it to the SIGs to help strengthen your case.

This KEP is not yet a GA feature; if your KEP gets approved and implemented, then we can remove this one. They are not mutually exclusive yet.

howardjohn commented 4 years ago

This change may not be the perfect solution for every single use case, but it dramatically improves the experience for most, and I think we would be much better off getting this merged than debating the implementation for another year.

krancour commented 4 years ago

> if your KEP gets approved and implemented, then we can remove this one. They are not mutually exclusive yet.

Would they ever be mutually exclusive?

I'm asking myself if this feature has value even if more explicit container startup/shutdown ordering (which I think would be great) is enabled through another enhancement in the future... and I am thinking yes.

Setting aside any startup/shutdown order implied by the classification of containers as init, sidecar, or "regular", these classifications also express other useful, and arguably unrelated, aspects of a container's desired behavior, do they not?

For example, in a pod with restartPolicy != Always (a pod that implements a job, perhaps), it's useful that containers designated as sidecars have no bearing on the pod entering a completed state.
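
A sketch of that semantic, with a hypothetical boolean marker standing in for however the KEP ultimately designates sidecars:

    package sketch

    type container struct {
        Name    string
        Sidecar bool // hypothetical marker for the sidecar class
    }

    // podComplete mirrors the job semantics described above: with
    // restartPolicy != Always, the pod is complete once every
    // non-sidecar container has terminated, regardless of whether
    // any sidecars are still running.
    func podComplete(containers []container, terminated func(string) bool) bool {
        for _, c := range containers {
            if c.Sidecar {
                continue // sidecars never block completion
            }
            if !terminated(c.Name) {
                return false
            }
        }
        return true
    }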

kfox1111 commented 4 years ago

> @kfox1111, Vault now does secret injection using sidecars. Which class should that fit into? Also, depending on the case, Vault could depend on the service mesh, or the other way around.

We worked through CSI ephemeral drivers so that things like Vault would not need sidecars/init containers. I believe there is a Vault driver in the works.

Though a regular sidecar with an emptyDir seems like it would fit for a sidecar that needs to use the network sidecars?

rubenhak commented 4 years ago

@Joseph-Irving, I was by no means trying to block this KEP from going in. I realize that you started it almost two years back and there are quite a lot of folks waiting for this to be released.

Do you have a link to the prior discussion related to the dependency graph?

jeremyrickard commented 4 years ago

Hey @Joseph-Irving,

Friendly reminder that we are closing in pretty quickly on code freeze on 05 March 2020. It does not look like your PRs have merged yet; are you still feeling like you're on track for code freeze with this enhancement?

Joseph-Irving commented 4 years ago

Hey @jeremyrickard, the API review (https://github.com/kubernetes/kubernetes/pull/79649) is basically done. We will be closing that PR and moving entirely to the implementation PR, so that everything (API and implementation) can be merged in one PR.

The implementation PR (https://github.com/kubernetes/kubernetes/pull/80744) has been thoroughly reviewed, so I'm trying to get a sig-node approver to take a look for final approval.

Whether this happens in time for code freeze is somewhat hard for me to say; it's very dependent on whether I manage to get the attention of some approvers in time.

duderino commented 4 years ago

Would love to see this get in by code freeze. It would make Istio's solution to https://github.com/istio/istio/issues/7136 both simpler and better.

nfickas commented 4 years ago

Any movement on getting this into 1.18? It seems necessary for Istio sidecars to work as expected with fast-running jobs.

Joseph-Irving commented 4 years ago

I've tried reaching out to the sig-node approvers in a variety of ways but I've not had any responses. So I'm not very optimistic that this will make it into 1.18.

rjbez17 commented 4 years ago

@Joseph-Irving the #pr-reviews Slack channel was created for these cases; have you tried it? It's a way to escalate PR reviews. (I didn't see this one in there.)

jeremyrickard commented 4 years ago

Hey @Joseph-Irving ,

We're only a few days out from code freeze now. Do you want to defer this to 1.19 based on the reviewer bandwidth, or try and make a push?

Joseph-Irving commented 4 years ago

Hey @jeremyrickard ,

No one has responded to me regarding getting these PRs merged in 1.18 so I highly doubt it will happen.

We can defer to 1.19, but I'm starting to wonder if there's any point in doing so. This KEP has been in flight for almost two years (the original alpha target was 1.15), the PRs in question have been open for almost a year, and there's never any "reviewer bandwidth" for them. I have politely emailed, slacked, gone to sig meetings, and even found people in person to get reviews, yet we have made very little progress. Any reviews I have managed to get have only ever suggested minor changes; it's not like large rewrites have been requested, and the PRs are all basically the same as they were a year ago, just a bit more polished. I don't know if I'm meant to be aggressively pinging people every day until they respond to me, but that's really not something I'm comfortable doing.

I think the problem is more that nobody actually cares about this feature. I am the only one driving it forward, and nobody in the SIGs seems interested in seeing it through. If it takes two years to get to alpha, how long will it take to get to beta/GA (as in, when most people can actually use this)?

Frustratingly, there does seem to be interest from the wider community and end users in getting this feature in. The reason I did this in the first place is that I saw it was an issue, asked the SIGs if they were gonna fix it, and they said "we don't have the capacity but we'd be happy for you to do it". So I did everything they asked: I wrote the KEP, got it approved by all parties, wrote all the code and all the tests, and constantly kept it up to date as each release passed. And yet here we are.

Every time we delay this, I feel like I'm letting a load of people down. Is this just all my fault? Is the code so terrible nobody will even comment? Am I just not aggressive enough in trying to get attention? I just feel that I can't get this done on my own, and I'm getting a bit tired of beating this dead horse.

I'm sorry for the long rant (not directed at you, Jeremy, or anyone else personally for that matter), but this has been slowly eating away at my soul for a long time now.

itskingori commented 4 years ago

> Frustratingly, there does seem to be interest from the wider community and end users in getting this feature in. The reason I did this in the first place is that I saw it was an issue, asked the SIGs if they were gonna fix it, and they said "we don't have the capacity but we'd be happy for you to do it".

@Joseph-Irving On this: as an active user, I'm watching this thread because I'm interested (and so are two of my colleagues). Activity on the issue, the pull requests, Slack channels, or SIGs might not be the best indicator of interest in this feature.

@dims Maybe you can shed some light?

pre commented 4 years ago

@thockin I listened to you being interviewed on the Kubernetes Podcast about a year ago, and you talked about contributing to Kubernetes. Maybe it was you, or somebody else in another podcast episode, who felt really bad that this sidecar KEP didn't make it into 1.16. Well, here we are again.

This issue seems to be a prime example of how difficult it can be to contribute if you're not an employee of, e.g., Google, Red Hat, or another big player.

jaygorrell commented 4 years ago

I asked in Slack as well for help getting it reviewed, but was just told there's an explicit hold by @thockin, so I'm not sure of the path forward either.

kfox1111 commented 4 years ago

I spent a lot of time on the ephemeral CSI driver feature. Getting it through was similarly frustrating, and there were times when I wasn't sure it was going to make it after so many delays and redesigns. So I feel your pain. It would be great if we could find a way to make it less painful. That being said, we did eventually get it in after missing a few major releases. So please don't give up or lose hope! The ship can be hard to turn, but it does turn eventually.

nicktrav commented 4 years ago

Anyone running any kind of topology that depends on a network sidecar is most likely hitting the container startup/shutdown ordering issues that this KEP would potentially solve. Ctrl-F for "Istio" on this ticket and you'll see a bunch of annoyances related to container ordering.

Are there any Istio maintainers on here? A lot of them are Googlers and might have some more sway with the K8s folks internally.

As an Istio / K8s shop, we're absolutely rooting for you getting this landed, @Joseph-Irving! ✊❤️

zhan849 commented 4 years ago

Kudos to @Joseph-Irving for getting sidecars this far.

For sidecar lifecycle management alone, any batch job would require this feature, or Kubernetes just does not work for it. That's why we also spent a lot of time helping out with code reviews and providing feedback!

We've been forking k8s for a while because of this, and we're really looking forward to seeing such an important feature supported officially.

pluttrell commented 4 years ago

As an Istio + Kubernetes shop, we have also been waiting anxiously for this feature, and we're growing increasingly frustrated that it slips from release to release. We're not pleased to have to resort to hacks to kill the sidecars in job workloads. For our needs, this has been the single most important feature in Kubernetes for well over a year.

@thockin It's been reported above that you've put an explicit hold on this. Can you please explain why?

wmorgan commented 4 years ago

There are a lot of Linkerd users who are eagerly waiting for this as well. Hang in there @Joseph-Irving, we're rooting for ya.