kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Re-entrant apiserver calls are not part of the same trace #103186

Open dashpole opened 3 years ago

dashpole commented 3 years ago

Part of enhancement: kubernetes/enhancements#647

This is a continuation of the discussion in https://github.com/kubernetes/kubernetes/pull/94942#discussion_r657114027, which covered the abuse risks that might be possible. To summarize, the otelhttp.WithPublicEndpoint() option causes spans generated by the API Server handler to start a new trace (with a link to the incoming trace), rather than be children of the incoming request.
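
For reference, a minimal sketch of the wrapping being described; the operation name and helper function are illustrative rather than the actual apiserver filter code:

```go
// Sketch: wrapping a handler with otelhttp.WithPublicEndpoint makes each
// incoming traceparent start a new trace that only *links* to the caller's
// span instead of becoming its child.
package tracingsketch

import (
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

func wrapAPIServerHandler(apiHandler http.Handler) http.Handler {
	return otelhttp.NewHandler(
		apiHandler,
		"KubernetesAPI", // operation name; illustrative
		// With this option the incoming trace context is recorded as a span
		// link rather than used as the parent, so reentrant apiserver calls
		// end up in separate traces (the problem described in this issue).
		otelhttp.WithPublicEndpoint(),
	)
}
```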

Note: I think there is a bug in OpenTelemetry in which parent contexts are still used for sampling decisions: https://github.com/open-telemetry/opentelemetry-go/issues/2031. I'll make sure that is fixed separately.

However, one of the use-cases outlined in the KEP is tracing reentrant API calls. Using WithPublicEndpoint makes reentrant calls difficult to visualize, because each call is now its own trace. We should probably keep WithPublicEndpoint as the default behavior, but expose an option to make the endpoint non-public. But first, we should discuss and document the abuse implications for an actor with the ability to send requests to the kubernetes API with a span context. The implications discussed thus far are:

One unanswered question from @liggitt in that thread:

@lavalamp, you've thought about the abuse vectors more w.r.t. priority and fairness... do you have thoughts on this?

dashpole commented 3 years ago

/sig instrumentation
/sig api-machinery

cc @logicalhan @lilic

k8s-ci-robot commented 3 years ago

@dashpole: The label(s) sig/apimachinery cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes/kubernetes/issues/103186#issuecomment-868641693):

> /sig instrumentation
> /sig apimachinery
>
> cc @logicalhan @lilic

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

dashpole commented 3 years ago

/sig api-machinery

logicalhan commented 3 years ago

/assign

lavalamp commented 3 years ago

We should never be adding to user-controlled traces, IMO.

apiserver should add baggage to its trace headers. Reentrant calls should repeat the baggage. Then the 2nd apiserver can use this to verify that the trace was originated by some apiserver in the cluster. In that case, it should append to the trace. Otherwise, it should start a new one. This should not be an option for the user. There are multiple ways of constructing such a baggage value.

If you have a strong use case for a user starting a trace, let me know, but I don't think it's going to outweigh the anti-abuse perspective.
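
A minimal sketch of the baggage-verification idea above, assuming a shared in-cluster secret and a hypothetical baggage key; the names and HMAC scheme are illustrative, not an agreed design:

```go
// Sketch: the apiserver adds a verifiable baggage entry to outgoing trace
// headers; a receiving apiserver only continues the incoming trace if that
// entry checks out. The key name and HMAC scheme are hypothetical.
package tracingsketch

import (
	"context"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"

	"go.opentelemetry.io/otel/baggage"
	"go.opentelemetry.io/otel/trace"
)

// clusterSecret would be shared by all apiservers in the cluster (assumption).
var clusterSecret = []byte("example-only-secret")

const baggageKey = "k8s-apiserver-trace-auth" // hypothetical key name

func sign(traceID trace.TraceID) string {
	mac := hmac.New(sha256.New, clusterSecret)
	mac.Write(traceID[:])
	return hex.EncodeToString(mac.Sum(nil))
}

// attachBaggage would run before the apiserver makes a reentrant call.
func attachBaggage(ctx context.Context) context.Context {
	sc := trace.SpanContextFromContext(ctx)
	member, _ := baggage.NewMember(baggageKey, sign(sc.TraceID()))
	bag, _ := baggage.New(member)
	return baggage.ContextWithBaggage(ctx, bag)
}

// trustIncomingTrace decides whether an incoming trace context was originated
// by an apiserver in this cluster and may therefore be continued, not linked.
func trustIncomingTrace(ctx context.Context) bool {
	sc := trace.SpanContextFromContext(ctx)
	got := baggage.FromContext(ctx).Member(baggageKey).Value()
	return got != "" && hmac.Equal([]byte(got), []byte(sign(sc.TraceID())))
}
```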

logicalhan commented 3 years ago

> If you have a strong use case for a user starting a trace, let me know, but I don't think it's going to outweigh the anti-abuse perspective.

I'd like to be able to run `kubectl apply -f blah.yaml --trace`, but I am indifferent as to whether that actually attaches a trace directly or runs through some webhook which validates who I am and then attaches a trace to the request.

lavalamp commented 3 years ago

I don't see how webhooks could attach a trace.

Letting users request that the apiserver initiate a trace is different from attaching a request to an already-created trace, no?

If we let users request extra resource usage (trace allocation), then IMO we have to authz that somehow.

logicalhan commented 3 years ago

Maybe I'm phrasing this incorrectly. We can have some mechanism by which either kubectl attaches a trace header directly, or attaches some sort of proxy pseudo trace header which the apiserver can then, after authz-ing (via a webhook or whatever), decide to actually attach a trace to the request.

lavalamp commented 3 years ago

Maybe a preexisting trace makes sense if folks are using api calls.

I guess my concern is just bounding the resource usage. A non-authz solution might look like a cap on the %age of requests that get traced, and user-requested traces count against that cap (and are ignored if it would put it over the cap).
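
A rough sketch of what such a cap could look like as an OpenTelemetry SDK sampler wrapper; the rate-limiter wiring is an assumption for illustration, not a proposed implementation:

```go
// Sketch: a sampler wrapper that caps how many traces are actually recorded,
// so caller-requested traces cannot exceed an operator-configured budget.
// The per-second rate cap is an assumption; any quota mechanism would do.
package tracingsketch

import (
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	"golang.org/x/time/rate"
)

type cappedSampler struct {
	inner   sdktrace.Sampler
	limiter *rate.Limiter
}

func newCappedSampler(inner sdktrace.Sampler, maxPerSecond float64) sdktrace.Sampler {
	return &cappedSampler{inner: inner, limiter: rate.NewLimiter(rate.Limit(maxPerSecond), 1)}
}

func (s *cappedSampler) ShouldSample(p sdktrace.SamplingParameters) sdktrace.SamplingResult {
	res := s.inner.ShouldSample(p)
	if res.Decision == sdktrace.RecordAndSample && !s.limiter.Allow() {
		// Over budget: downgrade to an unsampled decision instead of exporting.
		res.Decision = sdktrace.Drop
	}
	return res
}

func (s *cappedSampler) Description() string {
	return "cappedSampler{" + s.inner.Description() + "}"
}
```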

I don't think it works this way, but I definitely don't want apiserver dialing out to some user-controlled address to report on a trace.

logicalhan commented 3 years ago

> I don't think it works this way, but I definitely don't want apiserver dialing out to some user-controlled address to report on a trace.

Yeah that's not really what I was saying.

> I guess my concern is just bounding the resource usage. A non-authz solution might look like a cap on the %age of requests that get traced, and user-requested traces count against that cap (and are ignored if it would put it over the cap).

You would probably require separate quotas; generally trace sampling is super super low (think 1 in a hundred thousand requests), but yeah, I'm a fan of quotas in general.

fedebongio commented 3 years ago

/triage accepted

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

dashpole commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/103186#issuecomment-1058181298):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues and PRs according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue or PR with `/reopen`
> - Mark this issue or PR as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

dashpole commented 2 years ago

/reopen
/remove-lifecycle rotten

k8s-ci-robot commented 2 years ago

@dashpole: Reopened this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/103186#issuecomment-1058199783):

> /reopen
> /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

dashpole commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

logicalhan commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

dashpole commented 1 year ago

/remove-lifecycle rotten

dashpole commented 1 year ago

cc @Richabanker

mterhar commented 1 year ago

Would an acceptable solution be to have an option for internal API endpoints? If I have a network boundary around my apiserver, then I trust incoming traceparent headers and would want them propagated.

Including that as a command-line argument or as part of the TracingConfiguration seems reasonable. If someone suggests a path that would be accepted, I'll happily take on implementing it.
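
A hedged sketch of what that could look like as a field on the apiserver's TracingConfiguration type; the new field name is hypothetical:

```go
// Hypothetical extension of the apiserver TracingConfiguration type
// (apiserver.config.k8s.io); the new field name is made up for illustration.
package tracingsketch

type TracingConfiguration struct {
	Endpoint               *string `json:"endpoint,omitempty"`
	SamplingRatePerMillion *int32  `json:"samplingRatePerMillion,omitempty"`

	// TrustIncomingTraceContext (hypothetical) would tell the apiserver to
	// continue traces from incoming traceparent headers instead of starting
	// new, linked traces via otelhttp.WithPublicEndpoint.
	TrustIncomingTraceContext bool `json:"trustIncomingTraceContext,omitempty"`
}
```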

dashpole commented 1 year ago

That is a reasonable approach. We need to think about:

  • Should this only exist in the apiserver config? Or is it applicable to all kubernetes components, and belong in component-base?
  • Do we need per-http-server configuration, or is a single option enough? IIRC, there was more than one http server instrumented for apiserver tracing.

dprotaso commented 11 months ago

A related use case I have is debugging issues that appear in tests when my webhooks are invoked: they return a 200 response, but the API server fails and returns an EOF / context-timeout error.

To debug this I would like the trace to start when the test makes the call to the API server. For this to be useful I would expect the API server to attach the trace context to outgoing requests.

If I could enable this cluster wide for testing/debugging purposes that would be great.
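
A sketch of the test-side of that use case, assuming the test injects a W3C traceparent header that the apiserver would have to be configured to trust; the URL and span names are placeholders:

```go
// Sketch: a test starts a span and injects the W3C trace context into its
// request to the apiserver; this only helps if the apiserver is configured
// to honor the incoming traceparent.
package tracingsketch

import (
	"context"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

func callAPIServerTraced(ctx context.Context) (*http.Response, error) {
	ctx, span := otel.Tracer("e2e-test").Start(ctx, "apply-manifest")
	defer span.End()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		"https://kubernetes.default.svc/api/v1/namespaces", nil) // placeholder URL
	if err != nil {
		return nil, err
	}
	// Write the traceparent header so the apiserver could (if it honored it)
	// parent its own spans, e.g. for webhook calls, under the test's trace.
	propagation.TraceContext{}.Inject(ctx, propagation.HeaderCarrier(req.Header))
	return http.DefaultClient.Do(req)
}
```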

KubeKyrie commented 7 months ago

> That is a reasonable approach. We need to think about:
>
> • Should this only exist in the apiserver config? Or is it applicable to all kubernetes components, and belong in component-base?
> • Do we need per-http-server configuration, or is a single option enough? IIRC, there was more than one http server instrumented for apiserver tracing.

I think @mterhar proposes a good solution. Having it exist only in the apiserver config seems OK at this stage, and both a single option and per-http-server configuration would work; maybe a single option is more generic?

dashpole commented 6 months ago

I have another option I'd like to consider, since I would like to avoid an additional config parameter:

If we move the tracing filter to after the authentication filter, we can ensure only authenticated requests are traced. This has the downside of not capturing latency incurred by authentication, but it will fix this issue. I plan to work on a fix for that in OpenTelemetry: https://github.com/open-telemetry/opentelemetry-go/issues/4993. But moving the filter to after authentication will unblock us for now, without the bad UX implications of not respecting the incoming trace context.
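
A simplified sketch of the proposed ordering; the filter names are illustrative stand-ins for the real handler-chain code, not the actual DefaultBuildHandlerChain:

```go
// Sketch: apiserver filters wrap each other and the outermost filter runs
// first. Moving tracing "inside" authentication means spans are only created
// for requests that passed authn.
package tracingsketch

import "net/http"

func withAuthentication(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "" { // stand-in for real authn
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func withTracing(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// In the real apiserver this is the otelhttp-based span creation.
		next.ServeHTTP(w, r)
	})
}

func buildChain(apiHandler http.Handler) http.Handler {
	// Before: withTracing(withAuthentication(apiHandler)) -> tracing runs
	// first, so even unauthenticated requests create spans.
	//
	// Proposed: authentication outermost, tracing inside -> only requests
	// that pass authn reach span creation, at the cost of not measuring
	// authn latency.
	handler := withTracing(apiHandler)
	return withAuthentication(handler)
}
```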

liggitt commented 6 months ago

> If we move the tracing filter to after the authentication filter, we can ensure only authenticated requests are traced. This has the downside of not capturing latency incurred by authentication, but will fix this issue

does it? does that require treating all authenticated users as trustworthy and assuming they won't pollute each other's traces?

dashpole commented 6 months ago

> does it? does that require treating all authenticated users as trustworthy and assuming they won't pollute each other's traces?

It assumes all authenticated users are not malicious (e.g. they aren't trying to DDoS the apiserver). Someone could, in theory, start traces with the same trace ID as another request to try to make the trace from the other request unusable. But they would need access to the trace ID in the other request's header, which doesn't seem likely.

From https://www.w3.org/TR/trace-context-2/#security-considerations:

> There are two types of potential security risks associated with this specification: information exposure and denial-of-service attacks against the vendor.

Information exposure applies to outgoing requests, so our primary concern is around denial-of-service, which is covered here: https://www.w3.org/TR/trace-context-2/#denial-of-service

> When distributed tracing is enabled on a service with a public API and naively continues any trace with the sampled flag set, a malicious attacker could overwhelm an application with tracing overhead, forge trace-id collisions that make monitoring data unusable, or run up your tracing bill with your SaaS tracing vendor.
>
> Tracing vendors and platforms should account for these situations and make sure that checks and balances are in place to protect denial of monitoring by malicious or badly authored callers.
>
> One example of such protection may be different tracing behavior for authenticated and unauthenticated requests. Various rate limiters for data recording can also be implemented.

I am hoping to take the example approach, and do tracing only for authenticated requests. You mention "authenticated but untrusted" users here. We are able to restrict the ability to influence tracing based on the incoming request. Is there a way to tell if a request would fall into the "authenticated but untrusted" category?