traefik / mesh

Traefik Mesh - Simpler Service Mesh
https://traefik.io/traefik-mesh
Apache License 2.0

Support for Custom Plugins with Mesh #841

Open alehechka opened 1 year ago

alehechka commented 1 year ago

Proposal

Currently, the base Traefik application supports plugins for both providers and middleware, allowing external developers to add custom functionality for their own use cases. I am proposing a similar plugin mechanism specifically for Traefik Mesh. Ideally, it would follow a similar format, so that developers can work with the context of a given request and apply custom rules or routing to it before transmission. The Background section describes the use case that a custom service mesh plugin could solve for me.

Background

In my Kubernetes setup, I use feature-branch namespaces and ExternalName Services to create easily deployable, small-footprint environments that can access the entire microservice architecture. The general pattern is that when a pull request is opened, a namespace is created with specific labels, ExternalName Services are created that point back to the Services in the core namespace, and the IngressRoutes are duplicated into the new namespace with updated hosts for rule matching.

As an example, the core of the app's traffic goes to example.com, hosted in the default namespace. A feature branch is spun up in a namespace named testing, and all duplicated IngressRoutes now accept traffic for the testing.example.com hostname. The application that triggered the creation of this namespace is deployed into it; all other microservices continue to serve traffic from the original default namespace. External requests to testing.example.com hit the IngressRoute and then the ExternalName Service, which points to the default Service for each application. Internal traffic is likewise routed via these ExternalName Services.

The problem I run into, and would like to solve with Mesh plugins, is the case where an external request to testing.example.com for service-one is routed via ExternalName back to the default namespace, but service-one then needs to call service-two, the application for which the testing namespace was spun up. In this scenario, service-one calls service-two via Kubernetes DNS, but since both exist in the default namespace, the call is served by the default version of service-two. Instead, I would like a custom plugin to check the request's hostname against rules configured during installation, see that testing.example.com matches the wildcard pattern *.example.com, infer that the request was intended for the testing namespace, and attempt to make the request there instead. This way I can test how other services interact with the changes I am making.

Workarounds

The workaround for my long-winded problem above is simply to deploy service-one into the testing namespace as well and hit it directly via testing.example.com. Since both applications would then exist in the same namespace, the request from service-one would reach service-two in the testing namespace. For this reason, I am not suggesting that my use case be added as a general feature of the mesh; instead, I am requesting plugin support so that I can build a custom plugin for it. I can imagine that other developers would also make great use of custom plugins to build on top of the already great service mesh that Traefik has provided.

tspearconquest commented 9 months ago

We have another use-case for this.

We currently use the Traefik Ingress Controller's modsecurity plugin to route client-to-service requests through our WAF after they pass through Traefik and before they are sent on to the service. In the future, we would like our WAF to also inspect service-to-service traffic within our cluster.

We could do this with ExternalName Services and IngressRoutes, as OP has done for their own separate use case, but it would be simpler if we could address this transparently by loading the same plugin, or a similar one designed for the mesh, into the mesh itself. The goal is for all requests routed through the service mesh to pass transparently through the WAF, where we can audit them for indicators of compromise or attack, OWASP Top Ten risks, etc., and put enforcing-mode policies in place to block malicious inter-service requests in case a service is compromised and the attacker tries to move laterally to other services in the cluster.