kubernetes / community

Kubernetes community content
Apache License 2.0

Kubernetes LTS Brainstorming #2720

Closed justaugustus closed 5 years ago

justaugustus commented 6 years ago

Placeholder to tie together some of the threads on the potential of having Kubernetes LTS (Long-term support) releases:

This should coalesce into a KEP and we should find SIG / subproject ownership for it. Arch? Release? PM?

/kind feature
/sig architecture release pm
/lifecycle frozen

cc: @timothysc @dims @jimangel @tpepper @BenTheElder @detiber @neolit123

tpepper commented 6 years ago

/cc @imkin Dhawal Bhanushali is a VMware engineer interested in LTS

spiffxp commented 6 years ago

Kubecon 2017 talk on kernel vs. distro and the need for different release cadences (EDIT: sched is annoying when it comes to copy-pasting URIs):

Kubernetes: Kernels and Distros

timothysc commented 6 years ago

@thockin and I have been in violent agreement on the idea, although we differ on how to get there.

I'm not a huge fan of LTS > 1 year for cluster managers, and this has been set forth by precedent in multiple projects.

thockin commented 6 years ago

While I think we need some story around LTS, I also do not think it is something that an open community of volunteers naturally gravitates towards.

If done poorly (actually, anything less than nearly-perfect) LTS may become an actively harmful thing. It is the thing users ask for, but I still question whether it is what they need.


dims commented 6 years ago

Based on Tim's prompt, "I still question whether it is what they need", here's what I would ask the folks who say they want LTS.

I'm sure we can formulate more questions like the ones above. We can take a poll and figure out what exactly folks want, instead of the full kitchen sink that comes with an LTS label.

We have 2 upcoming kubecons to get this feedback right?

Thanks, Dims


justaugustus commented 6 years ago

@dims -- Yep. This sounds like it could benefit from a survey. Let's see what we can do to mock something up and then gather community feedback.

thockin commented 6 years ago

Another important question, IMO:

In an ideal world, what does compatibility really mean? How much does the Kubernetes version matter vs the individual API versions?


tpepper commented 6 years ago

I totally agree this is not something a community of volunteers gravitates towards. I see a more likely outcome being a vendor coalition. If there are people the vendors are paying today, or will pay in the future, to do long-term support anyway, what portion of their work is redundant and could better be done in a shared fashion?

One recent concrete example: check out Greg KH's talks on the Linux kernel, LTS, and the Spectre/Meltdown patches (e.g. https://www.youtube.com/watch?v=lQZzm9z8g_U). He argues the distros on their own don't get optimal support outcomes, and where they're not working in common to solve thorny problems once, the backports are slow, fraught, and tremendously expensive to accomplish. Yet they attempt it because there is user demand for it. Can we channel that to a common cause?

If one accepts there is some valid user demand for longer-term support than 3-6 months of stable production (what I'd argue you get today on a 9-month support cycle, given the time to fully get onto a new release and later to get off of it before it is EOL), then a well-run single source of long-term support for the core is the least expensive approach for the whole ecosystem. It can encourage conformance, can diminish fragmentation, and is the most likely path to pragmatically achieve high quality in the effort, compared to each vendor doing their own thing.

A couple of us are hashing out a proposal for the next SIG Release meeting: spin off a WG which pulls in broad stakeholder representation from our own k8s devs, the user/operator ranks, and vendors. If there's something there for requirements, and the compromises can be sufficiently balanced, then turn that into a KEP for SIG Release to implement.

jagosan commented 6 years ago

I'm concerned that formal support for LTS versions of Kubernetes is doing a disservice to customers. I see customers often migrating from some legacy private cloud environment, lured by the promises of Cloud Native technologies. They spend a year containerizing everything, get it all just right, and then get burned when patch releases contain more than security fixes, or upgrades across minor releases are not backwards compatible. But stagnation / avoiding upgrades is also risky. The real benefit to Cloud Native technologies is in the dynamic stability, which requires embracing some degree of evolution. Rather than picking LTS versions of Kubernetes, I'd like to suggest a different way of looking at the concern, and a different approach to solving the issues.

I would argue the concerns are:

  1. upgrades break things
  2. compliance

A proposal to address these without an LTS strategy might look something like the following:

  1. Really, truly, don't put anything in a patch release that is not a critical security fix or a critical bug patch, and raise the bar for testing.
  2. Test the upgrade/downgrade path across patch releases.
  3. Invest in testing upgrade/downgrade from the last stable patch version of one minor release to the next minor release.
  4. Bring critical APIs to v1, increasing confidence in the backward compatibility guarantees.

Were upgrades more reliable, and alpha/beta APIs not required for reasonable use of Kubernetes, the demand for LTS versions would be lower.

To address the second point (compliance) I'd like to propose that Kubernetes 1.x is the LTS version. Investing effort into safe upgrades and support for downgrade / rollback seems like a better strategy for pulling together the entire community than LTS of a single minor release.

analytik commented 6 years ago

For us (running on bare metal CoreOS), the desire for LTS boils down to not having to spend a lot of effort and risk breaking changes when upgrading.

For now we're still stuck on 1.5, as the effort to upgrade beyond it snowballed (etcd3, Docker, CoreOS, ingress, TLS, RBAC, kubeadm, etc.). Once upgrading becomes just replacing binaries/images and reading a few release notes, LTS loses its appeal.

idvoretskyi commented 6 years ago

/cc

fedebongio commented 6 years ago

Please count me in - I've been on the other side (a customer of K8s), and I've also had this versioning problem with my last company's core product (Mule ESB at @mulesoft). We had countless discussions through the years, and broke many customers upgrading to "theoretically" backward-compatible patch versions.

I think the problem also extends to what was mentioned on the side: a bunch of beta APIs are becoming part of the default set of APIs, and we need to invest in making them GA (sometimes rather than investing in new features).

bgrant0607 commented 5 years ago

I agree with Jago.

Ref https://github.com/kubernetes/community/issues/567

tpepper commented 5 years ago

FTR, I too personally bias towards rolling upgrades and the promise of the cloud native model. A key part of these discussions (regardless of whether they culminate in "LTS" as typically known, or in other TBD changes that make k8s more deployable/consumable/manageable) must be defining some "customer" personas and seeking that information from folks who are running today, or aspire soon to run, production clusters. What are their requirements? Are there addressable needs unmet? Do they just need to upgrade more often, feed their breakage observations back to us, and trust we'll do better?

Some questions @jagosan :

To address the second point (compliance) I'd like to propose that Kubernetes 1.x is the LTS version. Investing effort into safe upgrades and support for downgrade / rollback seems like a better strategy for pulling together the entire community than LTS of a single minor release.

  1. Are you saying the "1" is the MAJOR? I.e., we aspire to actually doing semver? That is to say, 1 is what it is, and stable and compatible, until we release 2.0.0? In this case, to which 1.x's would critical patches be backported? All, or only some? For how long would backports be done? What's the upgrade path? 1.x to 1.(x+1) only, and only the newest 1.x gets an upgrade to only 2.0? Or do some 1.x's get to upgrade to some other 2.x's? Semver basically implies the latter. But if we're to invest in testing and ensuring it works ahead of users attempting it... that could be a big matrix.
  2. Or are you saying we have specific 1.x's, for some values of 'x', and that those are supported for some definition of "Long"? More concretely, are you saying that we already are doing LTS if we define that to mean there are three concurrently supported 1.x LTS releases active today, and "Long" is defined as 9 months? I do believe this is how we operate today...we're not semver at all but rather 1.MAJOR.minor.

In the latter case, and if, say, as @bgrant0607 argues, we move towards more rapid releases (say 2 weeks, for the sake of concrete argument, without going into the specifics of how to operationalize that, because it's totally possible and a demonstrated pattern in the art)... how many of these do you think would be good to support? E.g., continue like today and support 3-ish prior releases (let's call this "N"), giving 6 weeks of support? Or continue like today and provide support for any release first shipping in the prior 9 months (let's call this "M"), giving 18 support streams to which to backport? Is there a particular value of N or M which is "better" than others, by some definable criteria?
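The backport-stream arithmetic in the question above can be sketched in a few lines (a rough model: the 12-week cadence and 36-week window are stand-ins for "roughly quarterly releases, 9 months of support"; real calendars don't divide this evenly):

```python
def concurrent_streams(cadence_weeks: int, support_weeks: int) -> int:
    """Number of release streams needing concurrent backports, given a
    release cadence and a support window (both in weeks)."""
    return support_weeks // cadence_weeks

# Roughly today's model: ~quarterly releases, ~9-month support window.
print(concurrent_streams(12, 36))  # -> 3 streams
# The hypothetical above: 2-week releases, same ~9-month window.
print(concurrent_streams(2, 36))   # -> 18 streams
```

The point of the sketch: shortening the cadence without shortening the window multiplies the number of streams maintainers must patch concurrently.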

bgrant0607 commented 5 years ago

Most of the work required to achieve releases of higher quality and stability is independent of release frequency, such as:

Also, I am skeptical that it will be feasible for control-plane upgrades to skip minor releases in the foreseeable future.

quinton-hoole commented 5 years ago

It sounds like a working group that explores this space comprehensively and properly documents, in a white paper:

  1. a set of clear problem statements
  2. a set of possible solutions to the stated problems
  3. known pros and cons of the possible solutions

... would be useful.

The outcome will probably not be "we do LTS and this is how it works", but rather something more along the lines of "these are the most important problems with the current release process faced by type-X users/operators, and here are some concrete proposals for trying to address them".

Based on that we could solicit people to work on one or more of those efforts, which might be more stable releases, the option to adopt releases into production less frequently, LTS or whatever other approaches come out of the working group.

tpepper commented 5 years ago

FYI: WG LTS PR

https://github.com/kubernetes/community/pull/2911

bgrant0607 commented 5 years ago

Regarding "Upgrade path across more distant releases":

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/release/versioning.md#supported-releases-and-component-skew
https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-parts-of-the-api
https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#backward-compatibility-gotchas
https://github.com/kubernetes/client-go#compatibility-client-go---kubernetes-clusters
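As a rough illustration of what the component-skew rules in those documents imply (the one-minor-version limit here is an illustrative default; the actual allowed skew differs per component, so consult the linked policies):

```python
def within_skew(server_minor: int, client_minor: int, max_skew: int = 1) -> bool:
    """Illustrative check: is a client within the allowed minor-version
    skew of the API server? max_skew=1 mirrors the common kubectl/client-go
    guidance; real limits vary by component."""
    return abs(server_minor - client_minor) <= max_skew

print(within_skew(12, 11))  # True: one minor version behind is in skew
print(within_skew(12, 9))   # False: three minor versions of skew
```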

neolit123 commented 5 years ago

Just some quick 2c.

I proposed to @tpepper that we expose all important related topics as proposals and available options in a list. Once the list is created, discussions have to occur in SIG Architecture and/or WG LTS meetings covering those topics.

Once the discussions are in place, a voting system has to be established where a list of SIG chairs, and possibly WG chairs, vote. Who is eligible can end up being a decision of the steering committee - e.g. bring in active contributors or tech leads even if they are not SIG chairs.

The project is lacking a voting mechanism to promote ideas and move away from endless discussions.

parispittman commented 5 years ago

@neolit123 re: voting. We've used the following:

- CIVS
- CNCF SurveyMonkey account that has voting-style questions, with a collaborative interface that allows for you to lead it vs CNCF
- discuss.kubernetes.io has capabilities to do polling, likes, etc. depending on how you set up the thread

quinton-hoole commented 5 years ago

I suggest that we give the working group the proposed bounded amount of time to come up with one or more recommendations, and prioritize them via our preferred consensus mechanism, including relevant SIG and working group leaders. In the unlikely event that consensus cannot be reached, we can fall back to voting, but I sincerely hope that that will not be necessary.


bgrant0607 commented 5 years ago

A WG is fine as a forum for collaboration, but it isn't a decision-making entity.

I don't understand what the purpose of the proposed vote would be.

tpepper commented 5 years ago

I can't speak for @neolit123 but I read his comment to be in a similar direction as: https://github.com/kubernetes/community/issues/2833

When I think about governance, I think about the Steering Committee. Steering's charter includes:

Similar perhaps for SIG Architecture and conformance definition.

In the end, though, I don't see this as an issue specific to WG LTS. At most, WG LTS will bubble up proposals. These will likely take the form of KEPs, and there is a process for KEP approval which takes stakeholders into account.

dElogics commented 5 years ago

LTS is a necessity because major updates bring regressions that you only discover weeks or months in (past the point at which you can roll back).

So for a stable, successful project, you need LTS. LTS is known to be a success across distros, for reasons of stability.

tpepper commented 5 years ago

We have WG LTS which is the focal point for this brainstorming.

/close

k8s-ci-robot commented 5 years ago

@tpepper: Closing this issue.

In response to [this](https://github.com/kubernetes/community/issues/2720#issuecomment-489671511):

> We have [WG LTS](https://git.k8s.io/community/wg-lts) which is the focal point for this brainstorming.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

gnydick commented 4 years ago

The long and the short of it is that less than .0000001% of engineering/ops teams have the bandwidth to screw around with deploying, supporting, and upgrading a moving target.

It's common sense 101: you don't make your livelihood dependent on the bleeding edge, yet anyone who adopts Kubernetes is doing exactly that.

The moment someone releases either a true competitor or a proper LTS, everyone will switch to it.

My $0.02 is that little things like renaming APIs, changing their version, etc. are conceptually necessary to evolve, but having to update all of your configs and Helm charts just to deal with deprecation, renaming, etc. is burdensome and ridiculous.

The k8s curators need to realize this and take a more backwards compatible approach. This doesn't mean that infinite versions need to be supported, but FFS why should something called an Ingress that changes from foo/v1beta1 to some bar/v1 address in the API tree have to be reflected in our config files?

Create an Ingress; if you try to use options that are incompatible with your cluster, you'll get an error. If how you configure an option changes, shame on the authors for not doing a better job and being more forward-looking.
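For context, the Ingress case being complained about is real: the resource kept its kind but moved API groups across releases (extensions/v1beta1, then networking.k8s.io/v1beta1, then networking.k8s.io/v1, GA in Kubernetes 1.19), and manifests had to follow. A minimal before/after sketch of such a manifest (resource names are illustrative):

```yaml
# Before: the original beta API (removed in newer clusters)
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example
spec:
  rules:
    - http:
        paths:
          - path: /
            backend:
              serviceName: web
              servicePort: 80
---
# After: the GA API; same intent, different schema
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix        # new required field in v1
            backend:
              service:              # backend schema also changed
                name: web
                port:
                  number: 80
```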

This is the same kind of lack of overall experience that led to Golang's horrible problems around libraries, forking, etc. You know, stuff that has been solved for 50 years, but people running too fast with too little experience get supported by unicorn companies that can afford to let them run amok. I speak from first-hand experience.

dims commented 4 years ago

@gnydick thanks! we are aware of the issues and talk about it all the time, we have a WG for LTS as well. Since you feel so strongly, please help us do better: https://github.com/kubernetes/community/tree/master/wg-lts

There's a slack channel, weekly meeting, notes/videos from previous meetings if you want to dig into what we have already looked at in the url above.

neolit123 commented 4 years ago

There is no right way to do API versioning. Backwards compatibility between v1 and beta is plausible, but not always the case out there.

The general response to this problem is that k8s needs to reach a more widely present v1 state. Non-v1 APIs are simply WIP, and most likely the k8s project needs help moving them to v1 faster, too.

loganmzz commented 3 years ago

@gnydick thanks! we are aware of the issues and talk about it all the time, we have a WG for LTS as well. Since you feel so strongly, please help us do better: https://github.com/kubernetes/community/tree/master/wg-lts

Link is broken

BenTheElder commented 3 years ago

It's now https://github.com/kubernetes/community/tree/master/archive/wg-lts, wg-lts was concluded 9 months ago.

vtrenton commented 1 year ago

I think there is a lot of confusion around what Kubernetes is and isn't. Kubernetes seems to be sold as a tool you can plug into your traditional slow-moving monolithic infrastructure to make you "modernized" and "cloud native", which is extremely wrong. Kubernetes is a tool for organizations that are dramatically restructuring IT operations and DevOps teams toward much more agile designs, where continuous integration and continuous delivery control the constant flow of change. Rather than being fearful of touching that server that's been running for the last 5 years, these organizations embrace small failures with quick feedback and recovery. They tend to prefer declarative and immutable infrastructure-as-code designs, as opposed to imperative models where each server is frequently logged into for regular maintenance and changes. Kubernetes needs to move at the speed of the code running on it - not the other way around...

This is a problem of marketing. If you can't keep up, honestly, you need to look internally and see what methodologies you can adopt to move more quickly within your organization. Is it easy? No, absolutely not. But I will be the first to say Kubernetes is not for everyone. If you don't have a solid process that welcomes constant change, don't adopt it. Use the right tool for the job; don't use a socket wrench to hammer in a nail and then try to change the socket wrench...

gnydick commented 1 year ago

It doesn't really matter what Kubernetes is or isn't. Pretend we're having this discussion about any other piece of software; all of the criticisms apply. The fact that it is bleeding edge in production means that it isn't a stable release. My infrastructure at work, all built from scratch, is not traditional slow-moving monolithic infrastructure. Yet multiple times a year we have to redeploy, replace, and sometimes reengineer systems that people rely on, because something changed out from under us that we have no control over. This is partially AWS's fault, because they force you to upgrade your EKS clusters, but it's also k8s's fault as well.

Just some thoughts

I know there's a lot of snobbery in the tech community, and I feel like I smell it here.

Frankly, it might be less work to re-write the small sub-set that is what 99% of people need from k8s into something much more easily supported, implemented, and extended.


BenTheElder commented 1 year ago

Folks: this issue was closed years ago, many of the people who previously commented here have moved on, and few will see these comments; this is not a good place for active discussion.

However, I see this, and there are many people in the community working to make this sort of thing smoother. Releases are also less frequent and supported for longer now.

  • Be more conservative with what features are encouraged to be used

Only GA APIs are on by default now; there's been progress since this issue was closed, and discussions moved elsewhere: https://github.com/kubernetes/enhancements/issues/3136

The deprecation policy has also been expanded https://kubernetes.io/docs/reference/using-api/deprecation-policy/

  • If you're going to completely rename API's, or anything else 100% incompatible, think how that impacts the people using this stuff: plan ahead?

There's a whole group dedicated just to reviewing API changes of any sort, with a strong push towards ensuring smooth upgrades and stability. All features must also go through a Production Readiness Review process now, which we're iterating on to help ensure platform stability improves.

  • For hard-breaking changes, make migration tools to convert our yaml

Kubernetes has made migration tools, including for example kubectl-convert

https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-kubectl-convert-plugin https://kubernetes.io/docs/reference/using-api/deprecation-guide/#migrate-to-non-deprecated-apis
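As a sketch of the migration workflow those links describe (assuming the kubectl-convert plugin is installed; the file names here are illustrative):

```shell
# Rewrite a manifest that uses a deprecated API version to a current one.
# kubectl-convert works on local files; no cluster access is needed.
kubectl convert -f old-ingress.yaml --output-version networking.k8s.io/v1 > new-ingress.yaml

# Sanity-check the converted manifest against a cluster without applying it.
kubectl apply --dry-run=server -f new-ingress.yaml
```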

I know there's a lot of snobbery in the tech community, and I feel like I

This comment seems less than helpful. Many people are working hard to ensure this free and open source project works well for everyone.

gnydick commented 1 year ago

Thank you for the updates, Benjamin. I truly appreciate all of them.

It's worth noting, though, that there isn't a mutual exclusivity between having many people working hard to ensure it works for everyone AND having a snobbish community. That explanation is akin to "the ends justify the means."

I'm just saying we all have good intentions, but every one of us has suffered the pitfalls of passion and reacted or communicated less than ideally. We should also be honest with each other and not be easily offended when someone feels they've had bad experiences and calls it out. Nobody's perception is invalid, and if one person feels that way, chances are someone else does as well.

Thanks again, Ben.

Gabe
