kubernetes / enhancements

Enhancements tracking repo for Kubernetes
Apache License 2.0
3.34k stars 1.44k forks source link

API Server tracing #647

Open dashpole opened 5 years ago

dashpole commented 5 years ago

Enhancement Description

Please to keep this description up to date. This will help the Enhancement Team track efficiently the evolution of the enhancement

ehashman commented 3 years ago

@dashpole great thanks - please submit a PR to update your KEP to target the new milestone.

/milestone v1.21

annajung commented 3 years ago

Hi @dashpole

Enhancements Freeze is 2 days away, Feb 9th EOD PST

Enhancements team is aware that KEP update is currently in progress (PR https://github.com/kubernetes/enhancements/pull/2428). Please make sure to work on PRR questionnaires and requirements and get it merged before the freeze. For PRR related questions or to boost the PR for PRR review, please reach out in slack #prod-readiness

Any enhancements that do not complete the following requirements by the freeze will require an exception.

annajung commented 3 years ago

With PR https://github.com/kubernetes/enhancements/pull/2428 merged in, this enhancement has met all the criteria for the enhancements freeze 👍

kendallroden commented 3 years ago

Hi @dashpole, Since your Enhancement is scheduled to be in 1.21, please keep in mind the important upcoming dates:

Thanks!

kendallroden commented 3 years ago

Hey @dashpole,

Checking in to confirm you are still in progress on the work for 1.21- saw the PR hasn't had much progress in the past 10 or so days!

As a side note looks like we should update the issue description above to remove 1.20 as the target for alpha.

Feel free to reach out :) Thanks!

ehashman commented 3 years ago

We're still blocked on some requested changes for the design, this is at risk but we're hoping it'll make it.

kendallroden commented 3 years ago

Totally understand @ehashman! Just a reminder that Code Freeze is on March 9th EOD PST (5 days from now)

Any enhancements that are not code complete by the freeze will be removed from the milestone and will require an exception to be added back.

Please also keep in mind that if this enhancement requires new docs or modification to existing docs, you'll need to follow the steps in the Open a placeholder PR doc to open a PR against k/website repo by March 16th EOD PST

Thanks!

annajung commented 3 years ago

Hi @dashpole, with code freeze now in effect, we are removing this enhancement from 1.21 release due to https://github.com/kubernetes/kubernetes/pull/94942 not being merged or approved.

Feel free to file an exception to add this back into the release. thanks!

ehashman commented 3 years ago

/milestone v1.22 :crossed_fingers:

salaxander commented 3 years ago

Hey @dashpole - 1.22 enhancements team here. Looks like we're on track so far, KEP looks up to date. Enhancements freeze is on 5/13, let us know if there's anything we can do to help before then!

salaxander commented 3 years ago

Hey @dashpole - 1.22 enhancements team checking in! We're just over 2 weeks away from code freeze, so I wanted to see if there were any open or merged k/k PRs we should be tracking for this. Thanks!

dashpole commented 3 years ago

Waiting on https://github.com/kubernetes/kubernetes/pull/94942, and there will be a follow-up to it as well that depends on the initial PR

dashpole commented 3 years ago

Follow-up tracing PRs are out and are ready for review:

In addition, I need to change kube-apiserver to apiserver as per this comment: https://github.com/kubernetes/kubernetes/pull/94942#pullrequestreview-692774419

We also need to figure out what to do with reentrant calls: https://github.com/kubernetes/kubernetes/issues/103186

salaxander commented 3 years ago

Hi @dashpole - One more ping since we're a week out from code freeze :) I see https://github.com/kubernetes/kubernetes/pull/94942 merged, but it looks like we're still waiting for the 3 follow-up PRs to merge?

Thanks!

dashpole commented 3 years ago

103218 also merged. We are waiting for #103216 and #103234. I also need to open a change to update the opentelemetry service name, and to add tracing to authn/z in the apiserver.

dashpole commented 3 years ago

I opened the last two PRs: #103435 and #103443

PI-Victor commented 3 years ago

Hello @dashpole 👋, 1.22 Docs release lead here. This enhancement is marked as ‘Needs Docs’ for 1.22 release.

Please follow the steps detailed in the documentation to open a PR against dev-1.22 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Fri July 9, 11:59 PM PDT.
 Also, take a look at Documenting for a release to familiarize yourself with the docs requirement for the release. 
Thank you!

p.s.: please note we are 3 days away from deadline for the docs placeholder PR.

dashpole commented 3 years ago

@PI-Victor Here it is: https://github.com/kubernetes/website/pull/28843

salaxander commented 3 years ago

Hey @dashpole - I'll keep an eye out over the next day. Those last 2 PR's will need to merge before code freeze tomorrow to be included in 1.22

JamesLaverack commented 3 years ago

Hi, v1.22 Enhancements Lead here. Unfortunately this enhancement has not met the requirements for code freeze as https://github.com/kubernetes/kubernetes/pull/103443 is unmerged and unapproved.

If you still wish to progress this enhancement in v1.22, then please file an exception request.

/milestone clear

JamesLaverack commented 3 years ago

Hi, this enhancement has been granted an exception for v1.22. As per Savitha's message, please ensure that your PRs are merged by 14th July, end of day Pacific time. For this we are tracking the following two PRs:

dashpole commented 3 years ago

@JamesLaverack everything is complete

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

dashpole commented 2 years ago

/remove-lifecycle stale

gracenng commented 2 years ago

Hi @dashpole ! 1.24 Enhancements team here. Just checking in as we approach enhancements freeze on 18:00pm PT on Thursday Feb 3rd. This enhancements is targeting beta for 1.24, is that correct?. Here’s where this enhancement currently stands:

The status of this enhancement is track as at risk. Please update this issue description as well Thanks!

gracenng commented 2 years ago

The status for this enhancements have been updated to tracked for the 1.24 Enhancements Freeze 🥳

chrisnegus commented 2 years ago

Hi @dashpole 👋 1.24 Docs shadow here.

This enhancement is marked as 'Needs Docs' for the 1.24 release.

Please follow the steps detailed in the documentation to open a PR against the dev-1.24 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thu March 31, 11:59 PM PDT.

Also, if needed take a look at Documenting for a release to familiarize yourself with the docs requirement for the release.

Thanks!

gracenng commented 2 years ago

Hi @dashpole 1.24 Enhancements Team here, With Code Freeze approaching on 18:00 PDT Tuesday March 29th 2022, the enhancement status is at risk as there is no linked PR. Kindly list them in this issue. Thanks!

dashpole commented 2 years ago

@gracenng Thanks! I think the otel dependency update is still in the works. I'm also still trying to collect user feedback.

ehashman commented 2 years ago

/milestone clear

We won't be able to graduate this to beta in 1.24 as the etcd dependency we need to bump won't be ready before code freeze.

valaparthvi commented 2 years ago

Hi @dashpole :wave: 1.24 Release Comms team here.

We have an opt-in process for the feature blog delivery. If you would like to publish a feature blog for this issue in this cycle, then please opt in on this tracking sheet.

The deadline for submissions and the feature blog freeze is scheduled for 01:00 UTC Wednesday 23rd March 2022 / 18:00 PDT Tuesday 22nd March 2022. Other important dates for delivery and review are listed here: https://github.com/kubernetes/sig-release/tree/master/releases/release-1.24#timeline.

For reference, here is the blog for 1.23.

Please feel free to reach out any time to me or on the #release-comms channel with questions or comments.

Thanks!

mborsz commented 2 years ago

Hi @dashpole,

As discussed offline, it would be useful to add more spans to improve observability of API request latency. Currently, we use quite naive approach with text-based Traces: https://pkg.go.dev/k8s.io/utils/trace#Trace which logs if Trace duration takes more than specified threshold.

I think we should achieve parity between the old text-based Traces and the API Server Tracing feature. Preferably by refactoring Trace util to use single library for both (text and OpenTelemetry) approaches. Eventually we may want to deprecate text based Traces, once the new approach is GA and widely used.

You can find existing "steps" using query like: https://cs.github.com/kubernetes/kubernetes?q=trace.Step

More specifically, we are missing the following spans:

CatherineF-dev commented 1 year ago

I saw two ways to trace webhooks, which one is more preferred? cc @dashpole

From my understanding, server side can cover all kinds of webhooks (mutating/validating), but it doesn't contain network lantency from client sent request to server received request.

If the network lantency from client sent request to server received request doesn't matter, then server side solution is more preferred?

dashpole commented 1 year ago

You should definitely at least do client-side, as that will also handle context propagation from the initial request to the APIServer to the admission controller. Server-side is an added bonus, but would be especially useful if the webhook itself makes outgoing requests (I don't think that is as common).

CatherineF-dev commented 1 year ago

QQ: do we need to be concerned about full tracing propagation probability with independent tracing sampling?

For example, what’s the full tracing propagation probability if enabling etcd tracing with sampling rate = 10%, apiserver tracing with sampling rate = 10% and kubelet tracing with sampling rate = 10%. Is it 10%10%10%?

cc @dashpole

dashpole commented 1 year ago

@CatherineF-dev if all sampling rates are 10%, then each span has a 10% chance of being sampled, regardless of the path the request takes (or if context is propagated correctly).

CatherineF-dev commented 1 year ago

For a trace which contains kubelet-span1, apiserver-span2 and etcd-span3, does it mean the probability of having these three spans in that trace is 10%10%10%? Or 10%.

dashpole commented 1 year ago

10%. The sampling decision would be made at the first span in the tree. The sampling decision at the beginning would be respected by other spans in the tree (so the entire tree is either sampled, or not sampled)

rhockenbury commented 1 year ago

/milestone v1.26

parul5sahoo commented 1 year ago

Hello @dashpole 👋, 1.26 Enhancements team here.

Just checking in as we approach enhancements freeze on 18:00 PDT on Thursday 6th October 2022.

This enhancement is targeting for stage beta for 1.26 (correct me, if otherwise)

Here's where this enhancement currently stands:

For this KEP, we would just need to update the following before the enhancements freeze:

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

dashpole commented 1 year ago

@parul5sahoo KEP has been updated to the latest template, including the latest test plan

parul5sahoo commented 1 year ago

Hello @dashpole 👋, 1.26 Enhancements team here.

Just checking in as we approach enhancements freeze on 18:00 PDT on Thursday 6th October 2022.

This enhancement is targeting for stage beta for 1.26 (correct me, if otherwise)

Here’s where this enhancement currently stands:

With all the KEP requirements in place and merged into k/enhancements, this enhancement is all good for the upcoming enhancements freeze. 🚀

The status of this enhancement is marked as tracked. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

marosset commented 1 year ago

Hi @dashpole 👋,

Checking in once more as we approach 1.26 code freeze at 17:00 PDT on Tuesday 8th November 2022.

Please ensure the following items are completed:

For this enhancement, it looks like the following PRs are open and need to be merged before code freeze:

Also please be sure to update the issue description with the relevant PRs for tracking.

As always, we are here to help should questions come up. Thanks!

marosset commented 1 year ago

Hi @dashpole 👋,

Checking in once more as we approach 1.26 code freeze at 17:00 PDT on Tuesday 8th November 2022.

Please ensure the following items are completed:

For this enhancement, it looks like the following PRs are open and need to be merged before code freeze:

Also please update the issue description to link to all of the existing PRs for this enhancement.

As always, we are here to help should questions come up. Thanks!

Rishit-dagli commented 1 year ago

Hello @dashpole 👋 1.26 Release Docs shadow here!

This enhancement is marked as ‘Needs Docs’ for 1.26 release. Please follow the steps detailed in the documentation to open a PR against dev-1.26 branch in the k/website repo. This PR can be just a placeholder at this time, and must be created by November 9. Also, take a look at Documenting for a release to familiarize yourself with the docs requirement for the release.

Thank you!

dashpole commented 1 year ago

@Rishit-dagli Placeholder draft PR: https://github.com/kubernetes/website/pull/37783

marosset commented 1 year ago

It looks like all of the code PRs have merged into k/k so I'm marking this enhancement as Tracked for v1.26. Thanks!

dashpole commented 1 year ago

@marosset The feature-gate change had to be reverted, and will not be completed for this release. APIServerTracing will not move to beta in 1.26

marosset commented 1 year ago

@marosset The feature-gate change had to be reverted, and will not be completed for this release. APIServerTracing will not move to beta in 1.26

Thanks for the update. @rhockenbury as FYI too

marosset commented 1 year ago

/milestone clear /label tracked/no /remove-label tracked/yes /remove-label lead-opted-in