vinay-g opened 5 years ago
Hi Guys,
We would really like to have this feature and would like to know how to escalate it and, if possible, how to contribute.
Is there any way you could let us know what the procedure is? We really appreciate your help in advance.
@suchisubhra In general, adding a 👍 to the top-level issue is a good start. Then if you have other information to add about your particular use case, you would leave that in a comment.
We are using Nomad with the java driver and fat jars. Currently we need to start an Envoy sidecar for each service deployed on a node. Being able to register all services (sidecars) in the local agent to a single Envoy instance would simplify our setup drastically. It would also help with other topics like collecting statistics via Prometheus, etc.
@msuarezd one issue with the model you propose is that any workload that Nomad might schedule on any host becomes "trusted". In other words, none of the security benefits of Connect Intentions would really hold between any workloads that are allowed to be scheduled behind the same "sidecar".
In a typical Nomad cluster where all jobs can be scheduled on any machine, that basically means that Connect Intentions can't really enforce anything in your proposed setup, unless you explicitly constrain jobs so that only those in the same logical "trusted unit" are scheduled together.
So while we don't rule out supporting that, it would come with a lot of caveats as it basically relies on the scheduler configuration to ensure any of the security properties of Connect.
One of the main reasons for this issue is that currently Consul only supports a single port per service so to model a pod that exposes different protocol/APIs on different ports, you have to register each as a separate service to be individually discovered but then need a way to configure a single sidecar for all of the services. The problem now is that as far as Consul Connect is concerned, the three services are different "identities" when they might all be the same process.
It's ambiguous whether they should be different "identities" - in some cases they might just be alternate protocols giving access to the same stuff, so they should be the same "identity" (and have the same Intention rules applied); in other cases they might be legitimately different, e.g. a public port and an internal port that should have different access control.
So the ultimate solution here probably involves making Consul Services more flexible so that you can expose multiple ports for the same logical service. We've not made a firm decision on that, but it seems the cleanest way to model @vinay-g's use case. The question then is whether you also need to be able to change Intentions based on which port, as well as just service name. It's much less obvious whether that's the right thing to do.
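For readers newer to this thread, a sketch of the workaround being discussed may help (all names are illustrative, not from Consul's docs): the same process is registered twice, once per port, so each registration gets its own sidecar and its own identity.

```hcl
# Sketch of today's per-port workaround: one process exposing two ports is
# registered as two separate Consul services, each with its own sidecar.
# All names are illustrative; depending on agent version, these blocks may
# need to live in separate config files or under a plural `services` key.
service {
  name = "api-http"   # identity #1: the HTTP port
  port = 8080
  connect {
    sidecar_service {}
  }
}

service {
  name = "api-grpc"   # identity #2: the gRPC port of the same process
  port = 9090
  connect {
    sidecar_service {}
  }
}
```

As the comments above note, Intentions then treat `api-http` and `api-grpc` as unrelated identities even though they are the same process.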
I think we'll certainly have to come back and consider this more before too long but any additional info people can share on their use-cases would be really helpful. Specifically:
- Do your multiple ports need different intentions defined?
- Is it OK that all the services on a shared Envoy can access all of each other's upstreams? e.g. if web is allowed to access db but billing is not, running both web and billing behind the same Envoy would allow billing (if compromised by malicious user/code exploit etc) to access db, defeating the intention graph.

Hi @banks, thank you for the feedback and sorry for the late answer.
As you say, every service would be able to access the upstreams of other services running on the same host. But this is the case anyway if we have one sidecar per job. If I have service A and B on the same Nomad node, nothing keeps service B from accessing the upstreams of service A, which are listening on localhost on the same node (even if they are handled by different Envoy instances). We would have some benefits from Connect however: TLS encryption for all inter-node communication, upstream load balancing, an easier configuration standard, observability, etc.
To your specific questions:
Do your multiple ports need different intentions defined?
No. What I mean is a way for all services registered with a Consul agent to be exposed over the same Envoy. A way to point Envoy at the Consul agent on localhost and receive the configuration for all sidecar proxies registered on that agent. Envoy would act like an edge router on every node for each service running there.
Is it OK that all the services on a shared Envoy can access all of each other's upstreams? e.g. if web is allowed to access db but billing is not, running both web and billing behind the same Envoy would allow billing (if compromised by malicious user/code exploit etc) to access db, defeating the intention graph.
Yes. It would be lovely to restrict that, but to my understanding one would have to isolate those services in different network namespaces, each with its own sidecar proxy (like a Kubernetes pod). And that is not possible with the java driver at the moment. But as there is no way to enforce intentions between services running on the same node for the java or exec driver anyway (correct me if I am wrong!), we could still benefit from some of the other features of Connect.
I hope this helps.
Thanks. That's useful - we certainly will need a solution to this, and as you said, for the "multiple ports and services in one pod" case we're considering the best option there.
This would also benefit us, as we're not so interested in the security side of Connect but are using it in an environment where a legacy app can't be changed to talk to Consul directly, and the environment is locked down so that we can't use DNS either! We are looking to use Connect proxies to enable these services to do service discovery, but it means a lot of proxies having to be spun up manually!
Hi Folks,
Please track #6357 as the top-level ticket for this question. This current ticket, #5388, is marked as duplicate and will be closed.
Thank you!
Hi Folks - Going to reopen this and provide some context for what this issue is going to track.
The Problem: Currently, there is a one-to-one mapping of service and exposed port back to the sidecar proxy. This causes issues for pods that expose multiple service ports, such as for different functionality. An example of this is when a service has a public HTTP port and a more privileged admin port. Currently, the proxy is unable to determine which port the traffic should end up being routed to.
The Clarification: Neither Ingress nor Terminating gateways solve the problem described here.
Both Connect mesh gateways and Ingress gateways run into this problem where a single proxy does not accurately reflect multiple exposed listeners for the service, and can't route traffic in those cases. However, today, the functionality exists to define each exposed listener as a unique service, and route traffic using those unique service names.
We definitely want to solve this in the near future, but we recognize that any solution will likely have larger consequences on ACLs, intentions, and the catalog & addressing models.
Renaming this to: Feature Request - Support exposing multiple ports in a sidecar proxy.
For additional context: for inbound, this can be solved by SNI headers, where a different cert is used based on the SNI (Server Name Indication) header provided by the client. This is what is being done today by Ingress and Terminating gateways.
For outbound, this becomes more difficult, as the proxy doesn't have a way to differentiate and define a certificate for the originating service's exposed subsystem.
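A rough sketch of the SNI-based inbound approach mentioned above, as hand-written Envoy config (not what Consul generates; the SNI names and cluster names are made up): the TLS inspector reads the SNI from the client hello, and each filter chain matches a different server name, so one listener can front multiple logical services.

```yaml
# Hand-written Envoy sketch (illustrative only, not Consul-generated):
# one inbound listener, with the filter chain chosen by the client's SNI.
static_resources:
  listeners:
    - name: inbound
      address:
        socket_address: { address: 0.0.0.0, port_value: 20000 }
      listener_filters:
        # Reads SNI from the TLS client hello without terminating TLS.
        - name: envoy.filters.listener.tls_inspector
      filter_chains:
        - filter_chain_match:
            server_names: ["web.svc.example"]        # made-up SNI name
          filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: web
                cluster: local_web_8080              # forwards to app port 8080
        - filter_chain_match:
            server_names: ["web-admin.svc.example"]  # made-up SNI name
          filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: web_admin
                cluster: local_web_8089              # forwards to admin port 8089
```

Outbound is harder, per the comment above, because the client side has no equivalent signal for which of the originating service's identities should present its certificate.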
Ingress gateways solve the use case of external (non-Connect) traffic entering the Connect mesh. Terminating gateways solve the reverse use case, where Connect-enabled services communicate out of the service mesh to external (non-Connect) services.
I have the same issue. Currently I have a service which serves on multiple ports, but the sidecar proxy only supports binding ports one-to-one. I have to register multiple services for the multiple ports and end up with one service - multiple ports - multiple proxies.
Does the Consul Connect sidecar actually support multiple injections on the same pod? How can I workaround this currently in a kubernetes environment?
@whiskeysierra Consul's Kubernetes integration/helm chart currently doesn't support injecting multiple proxies for you, no.
It's possible to do it manually by adding your own Envoy containers, and appropriate init containers that set up the Envoy bootstrap for them, to your pod spec (or alternatively writing a custom mutating webhook that will do that for you - it could even be based on the hashicorp/consul-k8s code).
I don't mean to share that as a reasonable workaround, just mentioning what it would take to get a solution in the current state!
Hope this is helpful.
Hi folks, Consul Kubernetes PM here doing some research on this feature request related to multiple ports and/or multiple services per Envoy proxy. I would love to chat with you about your use case and understand the requirements for your application to participate in the mesh.
Please see this link here to schedule a 30 minute time to chat with me about this issue: https://calendly.com/dyu-hashicorp/30min. I'm based in PST time but I can adjust my time to meet as well on a case by case basis!
@david-yu My use case is pretty straightforward. I have a gRPC service that I am trying to expose via the mesh. That gRPC service is running on 8081, and the health check, metrics, and logging endpoints are running on port 8080.
If I set consul.hashicorp.com/connect-service-port: "8080", then my k8s liveness and readiness probes will succeed but my gRPC server is not part of the mesh; and if I set consul.hashicorp.com/connect-service-port: "8081", then I have to remove my liveness and readiness probes in order for things to work.
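To illustrate the dilemma in that comment, here is roughly what the pod metadata looks like (the image and service names are placeholders): the `connect-service-port` annotation accepts only a single port, so either the gRPC port or the health/metrics port joins the mesh, never both.

```yaml
# Illustrative pod (names/images are placeholders): only ONE of the two
# containerPorts can be named in connect-service-port today.
apiVersion: v1
kind: Pod
metadata:
  name: my-grpc-service
  annotations:
    "consul.hashicorp.com/connect-inject": "true"
    # 8081 puts gRPC in the mesh but breaks the 8080 probes;
    # 8080 keeps the probes working but leaves gRPC out of the mesh.
    "consul.hashicorp.com/connect-service-port": "8081"
spec:
  containers:
    - name: app
      image: example/my-grpc-service:latest  # placeholder image
      ports:
        - containerPort: 8080   # health checks, metrics, logging
        - containerPort: 8081   # gRPC traffic
```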
We are also very interested in this for our ECS Fargate workloads.
Our ECS service/task is composed of (say using jaeger-collector as an example):
One of these will have the "expose" flag set for the admin traffic. In Fargate, all containers in the task share the same network namespace, so this is just overhead for us at this point.
Each engineering service exposes several different Consul services. Since Consul's concept of a service is more or less intrinsically linked with its port, we register multiple services in Consul. Each connect Envoy instance is tightly tied to its respective Consul service. Which means N connect Envoy sidecars. And that makes it hard to write TF around, because I need to know in advance how many sidecars I'll need to add to my task definition.
This is a bit of a blocker for our ability to move to Consul Connect as a service mesh. We have at least a few dependencies that expect to be able to connect over different ports in different contexts (RabbitMQ, NATS) as well as some internal sidecars that handle delegated responsibilities for the primary container in the pod. Is there a timeline or prioritization on this being implemented?
@banks We did manually create multiple proxies for each service port we opened on the pod. The issue is when we turn on ACLs. ACL token acquisition for a particular service uses the k8s auth method, which relies on the service account name, of which there can only be one per pod. Moreover, the service account name must match the service name. This is where we are stuck. We're considering using an ACL token powerful enough to create individual ACL tokens for each of the services we are registering.
We've spent the past 18 months doing creative things to work around this deficiency. When will HashiCorp step forward, recognize this gap, and provide a supported solution?
Hi everyone, thank you for your feedback and details around your use case. We acknowledge and understand that folks are blocked here due to the multi-port requirement for Service Mesh.
The Consul team is very much committed to finding a great solution to this and understand how many people need this support from our mesh. Our current position is that a great solution requires invasive changes to our catalog model and/or concept of service identity which is a lot of work and also likely backward incompatible. This makes it challenging to prioritize with other large changes also currently in flight. We are considering possible improvements that might be easier to make though still with UX tradeoffs.
I'd be happy if I could proxy multiple services with different names and ports using a single sidecar.
We actually got around it by creating service accounts associated with each service (port) and then mounting their associated service account secrets into the corresponding sidecar container.
What's the status of this in the light of the recently released transparent proxy feature?
/cc @lamadome
It's common for services to run a primary HTTP port or service port for traffic and an alternate port for management endpoints. For example, a Spring Boot service running HTTP traffic over 8080 and servicing management endpoints under 8089. At the moment there does not seem to be a good way to expose the management port internally with the Consul mesh, and it would be good to support this.
Any further status updates on this?
Hi everyone, although we have wanted to get to this sooner, this is currently blocked due to some large architecture changes we are looking to make in our next major release. We will need to revisit this again after Consul 1.12 is released.
@fouadsemaan Mind posting a gist somewhere as an example? Thanks.
Hi @mister2d and everyone following along. We just released our initial support for multi-port Kubernetes services on Consul K8s 0.41.1. We have some documentation here that provides an example and also outlines caveats with this solution: https://www.consul.io/docs/k8s/connect#kubernetes-pods-with-multiple-ports.
As far as full multi-port support is concerned (to address the caveats as well as multi-port for other platforms), this is still an outstanding feature request and one that we'll need to consider for a future major release of Consul.
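For those skimming, the linked docs describe comma-separated annotations along these lines (the service names and ports here are illustrative; check the linked page for the exact syntax, required ServiceAccounts/Secrets, and the list of caveats):

```yaml
# Rough shape of the consul-k8s multi-port annotations, per the linked docs.
# Names and ports are illustrative; verify against the documentation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
      annotations:
        "consul.hashicorp.com/connect-inject": "true"
        # Comma-separated, positionally paired: web -> 8080, web-admin -> 9090.
        "consul.hashicorp.com/connect-service": "web,web-admin"
        "consul.hashicorp.com/connect-service-port": "8080,9090"
    spec:
      containers:
        - name: web
          image: example/web:latest   # placeholder image
          ports:
            - containerPort: 8080
            - containerPort: 9090
```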
@david-yu Thanks a lot for the info.
The doc states that transparent proxy is not supported for a multi port setup. Might that change any time soon or will this be part of the future major release of Consul?
Context: I'd like to use consul connect for services that are part of a ring setup.
Thanks for adding support for multi-port pods, however I find it a bit odd to use in certain ways. Would you help me clarify if I'm wrong about them:
Appreciate the consul team making connect better and looking forward to seeing more improvements and using more features in the future. Thanks.
@david-yu This is a serious blocker for @dgraph-io, a distributed graph database that communicates through both 8080 (HTTP) and 9080 (gRPC). Administration and GraphQL go through 8080 exclusively, while DQL (Dgraph Query Language) can go through either 8080 (HTTP) or 9080 (gRPC), with gRPC being more popular for large databases, e.g. billions of predicates. Both ports are needed for core functionality of the database. I have successfully tested Dgraph with Linkerd, Istio, and NGINX Service Mesh. Consul with this limitation will be a non-starter for Dgraph customers.
Curious why adding support for multiple ports changes the concept of service identity here? Outside of maybe the k8s pod use case, the same service is just registering multiple ports. It is still a single service, and the concept of intentions should still apply. Adding the ability to further filter which ports an upstream service has access to (e.g. not the admin port) would be a nice addition at some point as well, but maybe not an urgent requirement, and if the service doesn't have multiple ports then the syntax stays the same as it does right now. This might solve the k8s requirement as well, since the concept of a service is really up to the implementer. If they want the pod/container to run multiple processes and call it Service A, then that should work similarly to running a single process with multiple ports.
+1
This is kind of a big deal. Many COTS services expose multiple ports, including HashiCorp's own Vault product. Others include RabbitMQ, Kafka, OpenSearch, Elasticsearch, Graylog, CockroachDB, the HAProxy stats page, JupyterHub, pods with metrics scrapers, etc... So a big +1 here too.
Thanks everyone for the continued feedback. We are prioritizing this work for this year and are looking to make changes within Consul to support multi-port traffic, as well as catalog changes to handle multi-port traffic for services within the Consul service mesh. This is going to be a large effort, so it will likely take some time before this is shipped in our releases. We are actively in the design stage to get this work underway.
@david-yu This limitation is also causing some issues for us. Thanks for providing the partial solution support; however, I followed the documentation and it still doesn't work for us. In our setup we have a pod with a single container that needs to expose 3 ports. Originally they were all defined as part of one Kubernetes service, but we split them up, as per the documentation.
First of all, in the documentation it says you need to create a secret for each service; I am pretty sure there is a typo:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: web
  annotations:
    kubernetes.io/service-account.name: web
  type: kubernetes.io/service-account-token
```
The `type: kubernetes.io/service-account-token` line should be indented all the way to the left, like this:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: web
  annotations:
    kubernetes.io/service-account.name: web
type: kubernetes.io/service-account-token
```
Otherwise the init containers fail.
After fixing that, it still doesn't work. I tracked down the issue to be in the consul-dataplane sidecar containers.
Only the first one of them starts successfully; the other two fail with the following error:

```
[ERROR] consul-dataplane: failed to start the dns proxy: error="failed to run the dns proxy: error listening for udp: listen udp 127.0.0.1:8600: bind: address already in use"
```
I figured out that the reason is that all 3 sidecars are trying to start a DNS proxy on port 8600. It probably works in the example case because there each service in the pod has its own container, but in our case it fails because the services share the same container.
The port assignment should also be dynamic, to allow for such cases.
I know there is an argument for the DNS port; do you know how I can change it? I am not creating those containers myself, they are being injected by consul inject, so I would need to somehow tap into that?
+1
Hi everyone. Consul 1.17 now provides a preview of the multi-port service mesh experience, which is documented here: https://developer.hashicorp.com/consul/docs/v1.17.x/k8s/multiport/configure. We currently support only K8s at the moment, but more runtimes will follow in future releases. Please take a look and provide feedback by filing new issues.
@david-yu great! Do you also plan to support consul connect metrics and metrics merging with this change?
@david-yu Any progress on supporting this beyond k8s?
Hi @david-yu, any update on transparent proxy?
Feature Description
Add a feature so that the Consul proxy can serve multiple services via a single proxy. Currently, via Envoy or Consul, only a single service can be served per proxy.
Use Case(s)
Consider a pod that consists of multiple containers, where each container listens on a specific port. Currently, via Connect, only one service per pod can be served. Instead of creating one proxy per service, if a single proxy could handle multiple services, it would reduce resource consumption and simplify the design.
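To make the request concrete, one purely hypothetical shape for such a registration (this is NOT valid Consul syntax, past or present; it only illustrates what "multiple services via a single proxy" could mean):

```hcl
# Purely hypothetical pseudo-config -- NOT valid Consul syntax.
# One logical service, several named ports, one sidecar fronting all of them.
service {
  name = "pod-frontend"
  ports {
    http = 8080
    grpc = 9090
  }
  connect {
    sidecar_service {
      # a single proxy with one listener (or filter chain) per named port
    }
  }
}
```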