knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

Support multiple ports in a revision #8471

Open skonto opened 4 years ago

skonto commented 4 years ago

In what area(s)?

/area API

/area autoscale /area build /area monitoring /area networking /area test-and-release

Other classifications:

/kind good-first-issue /kind process

/kind spec

Describe the feature

A developer using Knative Serving to create applications that consist of one or more containers should be able to specify the ports they need, rather than being restricted to one port for the serving container and no ports at all for the other, non-serving containers (e.g. sidecars). Note that there is no intention to add support for multiple ports managed by Knative, but users should be able to specify other ports freely. This has been discussed previously here, and there were requests here and here.
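
For illustration, a minimal sketch of the kind of spec this request is about (the image names and the sidecar are hypothetical; today Knative accepts only a single port on the serving container and none on the others):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: myapp
spec:
  template:
    spec:
      containers:
      - image: example.com/myapp:latest
        ports:
        - containerPort: 8080            # the single port Knative manages (the serving port)
      - image: example.com/metrics-sidecar:latest
        ports:
        - name: metrics
          containerPort: 9090            # desired: declare an extra port that Knative leaves alone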

I have started a proposal here

vagababov commented 4 years ago

So what's the point of creating this issue if we have #7140?

skonto commented 4 years ago

Hi @vagababov. Afaik #7140 is a question, I am making it a feature request and I am adding a proposal. Any reason not to? I don't see an overlap unless I am missing something here. cc @markusthoemmes

vagababov commented 4 years ago

:shrug: It seems redundant to have several issues about the same thing. We need to close one of them.

skonto commented 4 years ago

@vagababov ok, consolidation is good as long as we can summarize all the other discussions; then we can close the other issues and point to this one. So let's close whatever is related to this one.

astefanutti commented 4 years ago

Specifically for the monitoring use case with the Prometheus operator, the current limitation prevents us from using the PodMonitor API. As an example, we would need this for apache/camel-k#1555.
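
For reference, a hedged sketch of the kind of PodMonitor this would unblock (names are illustrative; podMetricsEndpoints needs a named container port on the Pod, which is exactly what cannot be declared today):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-pods
spec:
  selector:
    matchLabels:
      serving.knative.dev/service: myapp   # label Knative puts on the Service's Pods
  podMetricsEndpoints:
  - port: metrics        # must match a *named* containerPort in the Pod spec
    path: /metrics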

mattmoor commented 4 years ago

Regarding the comment about Prometheus keying off of the port name, my understanding was that it could also key off of annotations: https://github.com/projectcontour/contour/blob/bfbdb982dd932ab02a4fdec9a7e2aeb1760400b5/examples/contour/03-envoy.yaml#L19-L22

I wonder if for scraping this is sufficient (if the push option @julz mentioned is not)
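
If a scrape config keyed off those annotations is in place (they are a widely used convention, not a Prometheus built-in), a sketch of setting them on the Revision's pod template could look like this; the name and port value are illustrative and assume a metrics endpoint on 9090:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: myapp
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - image: example.com/myapp:latest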

dprotaso commented 4 years ago

This was chatted about at the WG meeting today

~Will post the recording when it's available~ - it's up

mattmoor commented 4 years ago

I consolidated the issues so that we can track the request in terms of this issue.

What's the current status of this work? I forget where we left things in the WG meeting, it was long enough ago 😬

evankanderson commented 4 years ago

I think our forward-looking guidance for metrics is that Knative isn't particularly doing anything "special" Kubernetes-wise, and metrics observation should be cluster-wide infrastructure (except that inbound scraping via Prometheus is a bit tricky, and users probably want to use the Prometheus push gateway or the OpenCensus Collector for application metrics).

evankanderson commented 4 years ago

Note that the inbound scraping via Prometheus has the following challenges:

  1. It may be hard to get the scrape port exposed/authorized (this issue)
  2. It may be hard to get an initial scrape at pod startup time, so some initial metric events might not be collected depending on how the metric is expressed.
  3. It may be hard to coordinate final Prometheus metrics fetch with pod shutdown, which may cause some metrics to be dropped.
  4. Depending on the Service usage, it's possible that you might end up with only a single scrape on a given Service's container before it's shut down.

A push protocol (either PushGateway or OpenCensus Collector) avoids these problems.
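
As a sketch of the push path, here is a minimal OpenTelemetry Collector configuration that accepts pushed metrics over OTLP and re-exposes them as a single, long-lived scrape target (this assumes a collector distribution that bundles the otlp receiver and the prometheus exporter, e.g. the contrib build; ports shown are the usual defaults):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # applications push metrics here, even if short-lived
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # Prometheus scrapes the collector instead of the Pods
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]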

evankanderson commented 4 years ago

Amusingly, from the Prometheus blog post about pull vs push:

But now my monitoring needs to know about my service instances!

With a pull-based approach, your monitoring system needs to know which service instances exist and how to connect to them. Some people are worried about the extra configuration this requires on the part of the monitoring system and see this as an operational scalability problem.

We would argue that you cannot escape this configuration effort for serious monitoring setups in any case: if your monitoring system doesn't know what the world should look like and which monitored service instances should be there, how would it be able to tell when an instance just never reports in, is down due to an outage, or really is no longer meant to exist? This is only acceptable if you never care about the health of individual instances at all, like when you only run ephemeral workers where it is sufficient for a large-enough number of them to report in some result. Most environments are not exclusively like that.

(emphasis mine -- this is exactly the Knative Service case)

jeffgriffith commented 4 years ago

hi @evankanderson, not sure I followed this. Activation and scale-to-zero aside, are Knative Services considered ephemeral in a way that most services are not? Whether or not my ksvc comes and goes, it contains valuable (in the aggregate) internal information about its performance, so I can't quite see why its metrics are somehow less valuable by virtue of being a ksvc. In my case the service is long-lived and I'd like to discover it with a ServiceMonitor/annotation, etc., per our standard infra practices.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

skonto commented 3 years ago

@mattmoor @evankanderson should we finalize the design doc review or abandon it? I suspect that not all current users are on OpenTelemetry and that kind of collector-based architecture. A push model is a great fit for short-lived containers (especially in a FaaS context), and I completely agree with it as the safe option (e.g. when the scrape interval is not small enough). I am not sure about the users, though, and what people out there use to integrate with Serving. If they know the limitations and they know what they are doing, then they need some flexibility.

evankanderson commented 3 years ago

/assign

/triage accepted

skonto commented 3 years ago

@evankanderson should I finalize the doc? What is the plan? How can I help with the implementation :)

evankanderson commented 3 years ago

/unassign

I put some comments in the doc a while ago -- it seems like a lot of the doc focuses on "how to expose Knative-created Pods for direct network access outside the queue-proxy lifecycle" without covering how to prevent users from building applications where "Knative doesn't work" because their network-attached Pod got scaled down due to inactivity.

We might be able to make progress on this as a feature track if we focus on the monitoring/data collection cases. It would be good if the proposed solution worked for both the mesh and no-mesh cases; I seem to recall that mesh-mode autoscaling had to jump through some hoops.

evankanderson commented 3 years ago

In terms of review, we should probably get a reviewer from Serving, Networking, and Autoscaling to flag specific concerns; I've tried to proxy those groups, but I'm not intimately familiar with some of the code details.

evankanderson commented 3 years ago

For the actual feature proposal, it's probably also worth specifying a particular mechanism for identifying the primary port and container; ideally a mechanism which works for both single and multi-container workloads.

evankanderson commented 2 years ago

Some further musings, having seen a second monitoring use-case for this which didn't support the Prometheus annotations -- it's also possible (meaning there's nothing we do to prevent it) to define a Kubernetes Service which references the pods of a Knative Service. If you're aiming to do monitoring (i.e. scrape each Pod) and not provide a central service for managing them, you can create a headless service as follows:

apiVersion: v1
kind: Service
metadata:
  name: knative-monitor
spec:
  clusterIP: "None"   # headless: DNS resolves directly to the Pod IPs, no virtual IP in between
  ports:
  - protocol: TCP
    port: 8181        # reaches container port 8181 on each Pod (targetPort defaults to port)
  selector:
    serving.knative.dev/service: sockeye   # label Knative applies to the Pods of this Knative Service

Note that you need one such Service for each Knative Service, because the Service selector is a map[string]string, and not a LabelSelector.

This has the semi-nice property that none of this traffic is tracked or managed by the queue-proxy, so it doesn't participate in the request lifecycle at all. For a monitoring / all-endpoints scrape this is reasonably appropriate, though it does introduce the risk of terminating a container halfway through answering a monitoring request.

evankanderson commented 2 years ago

I only post the above work-around because I worry that the following UX work is substantial:

For the actual feature proposal, it's probably also worth specifying a particular mechanism for identifying the primary port and container; ideally a mechanism which works for both single and multi-container workloads.

Additional requirements might include:

  1. It shouldn't introduce extra overhead for the single-container, single-port case
  2. It should be clear and obvious which container port is the one which is used for determining concurrency for scaling (and which gets the rest of the lifecycle management perks, like draining connections / graceful termination)
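
Purely as an illustration of requirement 2 (a hypothetical shape, not the proposal's accepted design), one way to make this explicit would be to keep today's single managed port and let additional ports ride along unmanaged:

containers:
- image: example.com/myapp:latest    # image name is illustrative
  ports:
  - name: http1              # serving port: drives concurrency-based scaling and lifecycle handling
    containerPort: 8080
  - name: metrics            # hypothetical extra port: exposed, but ignored by the autoscaler
    containerPort: 9090
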
evankanderson commented 2 years ago

Extracting from an offline conversation:

There are a bunch of differences between the “serving port” and some of the other ports, and it’s not yet clear exactly how to codify the differences:

serving port:

non-serving ports:

I have concerns about trying to mix the two sets of very different semantics in the same list next to each other.

One disadvantage of the approach in https://github.com/knative/serving/issues/8471#issuecomment-1021560433 is that it does not allow different Revisions to publish the same application port on different container port numbers (which seems desirable from a flexibility point of view, even if it's not required). Having this metadata attached to a Revision does feel useful, but it also feels like there may be additional functionality needed/desired -- for example, exposing a kubernetes headless Service which provides Endpoints / EndpointsSlice access to the Pods in an equivalency group (Service? Configuration? Revision?).

skonto commented 2 years ago

Hi @evankanderson

This has the semi-nice property that none of this traffic is tracked or managed by the queue-proxy, so it doesn't participate in the request lifecycle at all. For a monitoring / all-endpoints scrape this is reasonably appropriate, though it does introduce the risk of terminating a container halfway through answering a monitoring request.

This is something we have been doing downstream for quite some time to avoid interference with the app requests. Creating a Service with a ServiceMonitor is one standard way; having a headless one is perfectly fine.

Having this metadata attached to a Revision does feel useful, but it also feels like there may be additional functionality needed/desired -- for example, exposing a kubernetes headless Service which provides Endpoints / EndpointsSlice access to the Pods in an equivalency group (Service? Configuration? Revision?).

For Prometheus specifically there are alternatives. In general the Prometheus annotations have to be part of the pods, not the Service etc. However, using a ServiceMonitor as described above lets you define the metrics HTTP path and other config there. The question I see is whether we want to extend the immutable config of a Revision to allow more without affecting Knative. This, I think, boils down to allowing the full K8s pod spec, but in a way that does not interfere with Knative, e.g. still supporting one port for autoscaling etc. and using a convention to ignore other containers and ports. Allowing the full spec has come up many times before, afaik.
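
For completeness, a hedged sketch of that downstream pattern against the headless Service from the earlier comment (it assumes the Service is labeled and its port is given a name; both are illustrative additions):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: knative-monitor
spec:
  selector:
    matchLabels:
      app: knative-monitor    # assumes the headless Service carries this label
  endpoints:
  - port: metrics             # assumes the Service port is named "metrics"
    path: /metrics            # metrics HTTP path, configured here rather than on the Pod
    interval: 30s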

ashrafguitoni commented 2 years ago

Looking forward to seeing this feature!