traefik / traefik

The Cloud Native Application Proxy
https://traefik.io
MIT License

enable custom plugins/middlewares for Traefik #1336

Closed migueleliasweb closed 4 years ago

migueleliasweb commented 7 years ago

After seeing the new plugin feature in Go 1.8, I thought that this could help a lot of people add specific functionality to Traefik.

Instead of building, compiling, and shipping a custom-made version of Traefik to enable custom functionality, it would be possible to write much simpler custom middlewares with this approach, wouldn't it?

Try to imagine creating a package that receives the request as a parameter, without having to recompile the whole Traefik repository just to add a small change. Does that sound like a middleware? Because to me it is! It's just a Go 1.8 plugin-based middleware!

What do you guys think?

timoreimann commented 7 years ago

@migueleliasweb thanks for the suggestion. That does sound like a pretty interesting idea.

I haven't worked with Go's new plugin framework yet, so I can't judge what it's like to work with from a practical point of view. One concern I'm seeing is that right now, plugins are only supported on Linux. While this should probably not be too much of an issue when running Traefik in production (most people supposedly use Linux or Docker anyway), it's going to make development on any other platform much harder. You'd have to resort to tricks like running your builds in a Docker container with the source code bind-mounted in.

I'm not saying that we shouldn't do it because of these restrictions, but that we have to make sure the development workflow is acceptably usable for people not running Linux natively.

migueleliasweb commented 7 years ago

Hey @timoreimann! I'm aware of the possible limitation, since the Go plugin feature is still in its early phase of development, but I hope this feature will soon be released for all platforms and architectures.

Just keep in mind this is a cool feature, and it could lead to much faster development of modules and extra features for Traefik from the community ;D.

timoreimann commented 7 years ago

Full ACK, the potential is pretty huge.

sandstrom commented 6 years ago

Plugins to transform request/response would be great! The ability to adjust headers (add/remove/edit) alone would go a long way.
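To make the header use case concrete, here is a minimal sketch of a header-adjusting middleware in plain Go. It assumes the common `func(http.Handler) http.Handler` convention; the `Middleware` and `HeaderRules` types, the `AdjustHeaders` function, and the header names are made up for illustration and are not taken from Traefik's code base.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps one http.Handler in another. This is the usual Go
// convention, not a type taken from Traefik (assumption).
type Middleware func(http.Handler) http.Handler

// HeaderRules is a hypothetical description of the edits to apply.
type HeaderRules struct {
	Set    map[string]string // headers to add or overwrite
	Remove []string          // headers to delete
}

// AdjustHeaders returns a middleware that edits the request headers and
// then passes the request on to the next handler.
func AdjustHeaders(rules HeaderRules) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			for _, name := range rules.Remove {
				r.Header.Del(name)
			}
			for name, value := range rules.Set {
				r.Header.Set(name, value)
			}
			next.ServeHTTP(w, r)
		})
	}
}

// demo sends one request through the middleware and returns what the
// backend handler saw.
func demo() string {
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "env=%q cookie=%q", r.Header.Get("X-Env"), r.Header.Get("Cookie"))
	})
	h := AdjustHeaders(HeaderRules{
		Set:    map[string]string{"X-Env": "staging"},
		Remove: []string{"Cookie"},
	})(backend)

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("Cookie", "session=abc")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(demo()) // env="staging" cookie=""
}
```

Editing response headers would work the same way, by setting them on `w.Header()` before calling the next handler.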

prologic commented 6 years ago

Is anybody working on this? Do we have a set of requirements?

jntakpe commented 6 years ago

At my company we are considering several API gateways, and this feature is a must-have.

isontheline commented 6 years ago

I agree with everybody: it will be a killer feature!

tscibilia commented 5 years ago

I would love to see some development of a WAF, fail2ban, or some sort of DoS protection against attackers.

SuperSandro2000 commented 5 years ago

@tscibilia you can always install fail2ban on the host machine

letian0805 commented 5 years ago

I am using Caddy because it supports custom plugins. But I really like some of the features provided by Traefik, so it would be even better if Traefik could support custom plugins.

canselcik commented 5 years ago

This would be huge. I am thinking something between Apache Traffic Server and Proxygen but in Go.

zlepper commented 5 years ago

I'm willing to take a shot at a transform request/response middleware, if there are some requirements for it. I assume that with the new middleware infrastructure it should be relatively easy to do.

Mostly I would assume that what can be done in Microsoft IIS URL Rewrite would be enough: replacing using regexes and their match groups. And considering that Go already has pretty good support for regexes, that should be relatively trivial.
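As an illustration of regex-based rewriting with match groups, here is a small sketch using Go's standard `regexp` package. The `rewriteRule` type, the `rewritePath` function, and the example pattern are hypothetical and only show the mechanism.

```go
package main

import (
	"fmt"
	"regexp"
)

// rewriteRule is a hypothetical rule: a pattern plus a replacement that may
// refer to capture groups as ${1}, ${2}, ...
type rewriteRule struct {
	pattern     *regexp.Regexp
	replacement string
}

// rewritePath applies the first matching rule and returns the new path.
// Paths that match no rule are returned unchanged.
func rewritePath(rules []rewriteRule, path string) string {
	for _, rule := range rules {
		if rule.pattern.MatchString(path) {
			return rule.pattern.ReplaceAllString(path, rule.replacement)
		}
	}
	return path
}

// rules moves the API version from the path into a query parameter.
var rules = []rewriteRule{
	{regexp.MustCompile(`^/api/v(\d+)/(.*)$`), "/${2}?version=${1}"},
}

func main() {
	fmt.Println(rewritePath(rules, "/api/v2/users/42")) // /users/42?version=2
	fmt.Println(rewritePath(rules, "/healthz"))         // /healthz
}
```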

prologic commented 5 years ago

> I'm willing to take a shot at a transform request/response middleware, if there are some requirements for it. I assume that with the new middleware infrastructure it should be relatively easy to do.
>
> Mostly I would assume that what can be done in Microsoft IIS URL Rewrite would be enough: replacing using regexes and their match groups. And considering that Go already has pretty good support for regexes, that should be relatively trivial.

As far as I'm concerned, one just needs to use the Go plugin support that was introduced in the compiler some time ago. Plugins should then implement the Middleware interface that already exists in Traefik.

The hard part is actually loading the plugins, discovering them and calling them.
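For reference, loading and calling a plugin with the standard `plugin` package could look roughly like the sketch below. The exported symbol name `Middleware` and its signature are assumptions made for this example, not an interface defined by Traefik, and discovery (finding the `.so` files) is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"plugin"
)

// loadMiddleware opens a Go plugin (a .so file built with
// `go build -buildmode=plugin`) and looks up an exported function named
// "Middleware". The symbol name and signature are assumptions of this sketch.
func loadMiddleware(path string) (func(http.Handler) http.Handler, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open plugin %s: %w", path, err)
	}
	sym, err := p.Lookup("Middleware")
	if err != nil {
		return nil, fmt.Errorf("lookup Middleware in %s: %w", path, err)
	}
	// For an exported function, the symbol holds the function value itself.
	mw, ok := sym.(func(http.Handler) http.Handler)
	if !ok {
		return nil, fmt.Errorf("%s: Middleware has unexpected type %T", path, sym)
	}
	return mw, nil
}

func main() {
	// The path is hypothetical; without a real plugin file this reports an error.
	if _, err := loadMiddleware("./does-not-exist.so"); err != nil {
		fmt.Println("plugin not loaded:", err)
	}
}
```

The loading code itself is short; the build constraints between host and plugin are not visible in it.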

emilevauge commented 5 years ago

@prologic FYI, we have been exploring this approach for a long time (https://github.com/containous/traefik/pull/1370, then https://github.com/containous/traefik/pull/1865), and we concluded that the plugin support introduced in Go was not ready yet. And sadly, it probably never will be. But we have been working on something else in the meantime. Stay tuned ;)

e-nikolov commented 5 years ago

Native Go plugins have a lot of gotchas at the moment. We use them at work and had to come up with a whole lot of workarounds and our own toolchain around them.

Just to list a few issues:

  1. Plugins don't work on Windows (perhaps that doesn't matter, since you can use them in a Docker container)
  2. The base application and all plugins need to be compiled with the same version of Go
  3. If the plugins and the base have shared dependencies, they need to be compiled with the exact same version of said dependency.
  4. https://github.com/golang/go/issues/18827 - if both the base and the plugin vendor a shared dependency, those will be considered as different packages because for Go, those two packages are different: $GOPATH/path-to-base/vendor/path-to-package $GOPATH/path-to-plugin/vendor/path-to-package

Points 3. and 4. have two implications:

Additionally, it is still unclear to me if the introduction of go modules would fix some of those issues or make them worse (mostly because I've been too scared to test).

thewilli commented 5 years ago

@e-nikolov so what's the problem? There are a few constraints when building. Just create a Dockerfile and build the plugins alongside Traefik. This worked well for nginx for a really long time (modules needed to be built alongside nginx), and a Go build is way easier than doing the same for nginx.

Another option could be to use binaries as plugins / middlewares, and communicate over stdin/stdout. Would even work on Windows, but I am not sure about performance, especially when it comes to concurrent invocations.

emilevauge commented 5 years ago

@thewilli It's not that simple, trust me, we tried everything with Go plugins. That's exactly what we started in https://github.com/containous/traefik/pull/1865. But at the end of the day, there were too many restrictions and really, you had to be lucky to make it work 😂 Using multiple binaries has also been explored, but we found in our tests that there was too much performance overhead. In middleware plugins, you cannot afford inter-process communication on each LB request.

thewilli commented 5 years ago

@emilevauge thanks for the answer, it's really a shame...

I was thinking about the (early) nginx module system again. What about this minimal approach: developers may create custom plugins / middlewares as Go packages, specified by a pre-defined interface. There is a configuration file where users may provide a list of Go package names corresponding to these plugin packages. A generator invoked by go generate or a Makefile generates some glue code that imports and registers those plugins.

This would be far from perfect, but since you seem to lean against external plugins (latency), it might be a first step to enable plugins for those who are not afraid to build their own fat Traefik binary but don't want to modify forked Traefik code either. And if you come up with a brilliant solution to support dynamic, external plugins, you'd just replace this.
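A rough sketch of the generator step described above: given the list of plugin packages from the configuration, emit a Go file that blank-imports each of them. It assumes that every plugin package registers itself in its `init()` function; that convention, the `generateGlue` name, and the example package paths are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// generateGlue returns the source of a Go file that blank-imports every
// plugin package, so that each package's init() can register its middleware.
// Registration via init() is an assumption of this sketch.
func generateGlue(pluginPackages []string) string {
	pkgs := append([]string(nil), pluginPackages...)
	sort.Strings(pkgs) // stable output regardless of the order in the config

	var b strings.Builder
	b.WriteString("// Code generated by the plugin glue generator. DO NOT EDIT.\n\n")
	b.WriteString("package main\n\n")
	b.WriteString("import (\n")
	for _, p := range pkgs {
		fmt.Fprintf(&b, "\t_ %q\n", p)
	}
	b.WriteString(")\n")
	return b.String()
}

func main() {
	// Hypothetical plugin packages, as they might be listed in a config file.
	fmt.Print(generateGlue([]string{
		"example.com/acme/ratelimit",
		"example.com/acme/jwtauth",
	}))
}
```

The generated file would be compiled into the binary together with the rest of the code, so plugins and host share one build and one set of dependency versions.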

prologic commented 5 years ago

> @thewilli It's not that simple, trust me, we tried everything with Go plugins. That's exactly what we started in #1865. But at the end of the day, there were too many restrictions and really, you had to be lucky to make it work 😂

Would you be so kind as to enumerate all the problems you encountered? It might help to understand what they are exactly and whether there are technical solutions for them. I find it hard to believe the Go plugin support is this bad, honestly (but I've only had a tiny bit of experience with this so far in Go).

> Using multiple binaries has also been explored, but we found in our tests that there was too much performance overhead. In middleware plugins, you cannot afford inter-process communication on each LB request.

Yes! Any plugin system that crosses process boundaries and uses serialization of any kind will not perform well here. Out of the question :)

migueleliasweb commented 5 years ago

Hey guys/gals, it's so interesting to see this is still such a big topic! After all this time! More precisely, two years last week! 🎉

Lately I've been working quite a bit with Kubernetes and I found an interesting pattern that Traefik could potentially implement instead of supporting Go plugins.

In Kubernetes you can define mutating admission webhooks ("MutationWebhooks"). They work similarly to middlewares in most HTTP frameworks, but over HTTP instead of internally in the language. I wonder if Traefik could implement something similar...

It's basically an HTTP API that accepts the initial request parameters sent to the K8s API server and modifies them if needed. The API server then continues processing the request as normal. It's also possible to cascade multiple mutation webhooks.

With this pattern, it would be possible to extend Traefik to add many features that can be easily shared between multiple servers at the same time.
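To illustrate the webhook pattern, here is a minimal sketch in plain Go: a middleware posts the request path and headers to an external HTTP endpoint and applies the headers that come back before continuing. The JSON shape (`mutation`), the `webhookMiddleware` function, and the `X-Tenant` header are invented for this example and are not an existing Traefik or Kubernetes API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// mutation is a hypothetical wire format: what the proxy sends to the
// webhook and what it receives back (possibly modified).
type mutation struct {
	Path    string            `json:"path"`
	Headers map[string]string `json:"headers"`
}

// webhookMiddleware posts the path and headers to webhookURL and applies
// the returned headers to the request before calling next.
func webhookMiddleware(webhookURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		in := mutation{Path: r.URL.Path, Headers: map[string]string{}}
		for name := range r.Header {
			in.Headers[name] = r.Header.Get(name)
		}
		body, err := json.Marshal(in)
		if err != nil {
			http.Error(w, "cannot encode webhook request", http.StatusInternalServerError)
			return
		}
		resp, err := http.Post(webhookURL, "application/json", bytes.NewReader(body))
		if err != nil {
			http.Error(w, "webhook unavailable", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()

		var out mutation
		if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
			http.Error(w, "bad webhook response", http.StatusBadGateway)
			return
		}
		for name, value := range out.Headers {
			r.Header.Set(name, value)
		}
		next.ServeHTTP(w, r)
	})
}

// demo starts a local webhook that tags every request with X-Tenant and
// returns the value the backend handler received.
func demo() string {
	hook := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var m mutation
		if err := json.NewDecoder(r.Body).Decode(&m); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if m.Headers == nil {
			m.Headers = map[string]string{}
		}
		m.Headers["X-Tenant"] = "acme"
		json.NewEncoder(w).Encode(m)
	}))
	defer hook.Close()

	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("X-Tenant"))
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/orders", nil)
	webhookMiddleware(hook.URL, backend).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(demo()) // acme
}
```

Every proxied request pays for one extra HTTP round trip in this design, which is the latency cost raised elsewhere in this thread.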

Any thoughts? Cheers!

emilevauge commented 5 years ago

@prologic I'm not sure I understand your "thumbs down" reaction:

[Screenshot 2019-03-28 at 00 12 14]

And the following message:

> Would you be so kind as to enumerate all the problems you encountered? It might help to understand what they are exactly and whether there are technical solutions for them. I find it hard to believe the Go plugin support is this bad, honestly (but I've only had a tiny bit of experience with this so far in Go).

that seems to imply that either:

  1. there were no good technical reasons not to use Go plugins
  2. I'm hiding things deliberately

It's not appropriate.

To clarify, this was something I worked on almost 2 years ago. Of course, I don't remember everything. I really wanted to make it work, I spent a lot of time on this, and I was really disappointed to see that Go plugins were not ready at that time. And to my knowledge, no real work has been done on plugins since then. And I'm still waiting to see another project using Go plugins.

If you have a good reason to argue, you could probably help us by submitting a pull request instead.

prologic commented 5 years ago

@emilevauge My words were not chosen tactfully enough. I apologise (having a bad time elsewhere!) -- I just wanted to understand what the problems are and whether any of them have been resolved by Go's compiler toolchain or still continue to be problems. If you don't remember, that's fine, but unfortunately it just means the next person who tries to code up the same or a similar solution will run into the same problems.

emilevauge commented 5 years ago

@prologic thanks 🙂 ! I will try to find more details on this.

prologic commented 5 years ago

> @prologic thanks 🙂 ! I will try to find more details on this.

Please! That would be super awesome! I love solving problems and, as I said earlier (although badly), I've only had a small taste of using Go plugins so far. Understanding even some of the challenges you faced ~2 years ago would help with either a) not wasting time on it again or b) seeing if some of them can be solved now (maybe Go has improved since?) -- hard to say either way.

L3o-pold commented 5 years ago

It's maybe a little out of scope, but this is a feature that IMO a lot of people need. I have literally no experience with Go, and I have successfully used your add-plugin-support branch so far. I feel bad about using a (very?) old branch, but so far so good: it seems to be working as expected.

I hope we will not face weird issues with it and that the community will find a solution to add this awesome feature to Traefik.

e-nikolov commented 5 years ago

@prologic I can't speak for the issues the Traefik maintainers have encountered, but I listed some of the ones I've experienced in here https://github.com/containous/traefik/issues/1336#issuecomment-474888814

Using Go plugins is definitely possible, because we do it at work, but we had to find workarounds for those issues and the end result is still quite hacky.

@thewilli

Short answer:

It isn't as simple as using a docker container for consistent builds, because you need to take into account the dependencies that are shared between both the plugins and the application that loads them.

We need to distinguish between 2 types of shared dependencies - accidentally shared and intentionally shared.

Due to the implications I mentioned in my previous post, in order for Go plugins to be loaded into Traefik, all accidentally shared dependencies need to be vendored, while all intentionally shared dependencies need to be put into the GOPATH at compile time.

Long answer:

  1. Accidentally shared dependencies:

If the accidentally shared dependencies are NOT vendored, Go will require them to be the exact same version (actually the same hash of the package contents). If they aren't the exact same version, you get the following when loading the plugin: panic: plugin was built with a different version of package gopkg.in/yaml.v2. Having the same versions of the accidentally shared dependencies is not something that can easily be enforced, since

This "bug/feature" makes it so that Go plugins don't recognize that two copies of a package are the same if they are vendored in different plugins. This can be ab/used to allow loading plugins with a different version of an accidentally shared dependency because Go will not recognize that they are supposed to be the same package and will therefore not check if they are the same version.

This solution works in most cases, except when a package does something in its init() function that panics if it's done twice (e.g. https://github.com/golang/go/issues/24137). This will be a problem because the init() will be executed once for each plugin that has a vendored copy of it. In our project we solved this by replacing the http.DefaultServeMux before we load each plugin, which is super hacky and this approach isn't guaranteed to work for every instance of this problem.
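The double-`init()` problem and the `http.DefaultServeMux` workaround can be reproduced without real plugins. In the sketch below, `fakePluginInit` stands in for a vendored package's `init()` that registers a handler on the default mux, and `loadIsolated` mimics swapping the mux before each plugin load. All names are hypothetical; this is not the code from our project.

```go
package main

import (
	"fmt"
	"net/http"
)

// fakePluginInit stands in for the init() of a vendored package that
// registers a handler on the default mux (hypothetical example).
func fakePluginInit() {
	http.HandleFunc("/debug/fake", func(http.ResponseWriter, *http.Request) {})
}

// panics reports whether f panicked.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

// loadIsolated gives the "plugin" a fresh default mux while it initializes
// and restores the original mux afterwards.
func loadIsolated(initFn func()) {
	saved := http.DefaultServeMux
	http.DefaultServeMux = http.NewServeMux()
	defer func() { http.DefaultServeMux = saved }()
	initFn()
}

// demo shows that a second plain init panics on the duplicate pattern,
// while an isolated init does not.
func demo() (plainPanics, isolatedPanics bool) {
	http.DefaultServeMux = http.NewServeMux() // start from a clean slate
	fakePluginInit()                          // first copy registers fine
	plainPanics = panics(fakePluginInit)      // second copy hits the same pattern
	isolatedPanics = panics(func() { loadIsolated(fakePluginInit) })
	return plainPanics, isolatedPanics
}

func main() {
	plain, isolated := demo()
	fmt.Println("second plain init panics:", plain)  // true
	fmt.Println("isolated init panics:", isolated)   // false
}
```

Handlers registered during an isolated load end up on the temporary mux and are discarded afterwards, which is one reason this workaround is hacky.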

  2. Intentionally shared dependencies:

Traefik is currently vendoring its dependencies, so if there needs to be some intentionally shared package in there, it would need to be pulled out into the GOPATH. In our project we tried several approaches to solve this one

Final words

There could be other and better solutions that we've missed and I'd be happy if they are pointed out, because right now plugins seem to be made only for the bravehearted.

thewilli commented 5 years ago

Okay, so it seems there is no perfect solution yet, and it takes time to find a solid one the majority agrees with. So let's try to find the least bad solution as a temporary workaround!

My proposal would be to have a built-in bridge Middleware that allows invoking external Middlewares, maybe by using something similar to #2362 (gRPC). It was discarded because of latency, but (according to the comments) some seem to prefer additional latency over having no support for custom Middleware at all. This approach would not affect users who are not interested in external Middlewares, while still enabling it for those who are.

What do you think?

negasus commented 5 years ago

Hi, guys) What about 'LuaScript'?

bellmsk commented 5 years ago

It'll be a powerful feature with Lua support.

aantono commented 5 years ago

I've done a number of tests on various embedded options, and unfortunately the vast majority of them have very high latency. One that I found very promising (and am actually working on a PR for) is the Tengo interpreter. It is MUCH faster because it is actually compiled (internally, like JIT in Java), so it could be a viable option.

thewilli commented 5 years ago

> One that I found very promising (and actually working on a PR) is the Tengo interpreter

That would be just another technology, and in addition it lacks a standard library for common tasks. Just consider something as simple as JWT validation. It would be hell to write this from scratch with such limited out-of-the-box functionality.

I'd propose to choose a technology with an existing ecosystem and implementations for common middleware tasks.

aantono commented 5 years ago

Perhaps... though in all honesty I have a hard time imagining someone actually doing real JWT validation inside the Lua code itself rather than calling out to a real side service for this.

It is also important to remember that an embedded Lua interpreter (for Go) would also be limited in support, so not all libs will be available.

In your JWT scenario, which exact (built-in) libraries are you missing?

joejulian commented 5 years ago

It seems to me that most interpreted languages are likely too slow for any significant scale and often present concurrency problems.

negasus commented 5 years ago

But it is the user's choice whether or not to include a Lua script. If the middleware is not included, performance does not change.

negasus commented 5 years ago

Tests with the Lua middleware enabled and disabled (on my MacBook)

RUN TEST COMMAND WITH VEGETA:

echo "GET http://localhost/" | vegeta attack -rate 1000 -duration=10s | tee results.bin | vegeta report

LUA MIDDLEWARE ENABLED

negasus@negabook ~ ./vg.sh
Requests      [total, rate]            10000, 1000.07
Duration      [total, attack, wait]    9.999516285s, 9.999309s, 207.285µs
Latencies     [mean, 50, 95, 99, max]  178.173µs, 168.034µs, 217.445µs, 384.782µs, 7.507275ms
Bytes In      [total, mean]            220000, 22.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  0.00%
Status Codes  [code:count]             422:10000
Error Set:
422 Unprocessable Entity

LUA MIDDLEWARE DISABLED

negasus@negabook ~ ./vg.sh
Requests      [total, rate]            10000, 1000.07
Duration      [total, attack, wait]    9.99952535s, 9.999282s, 243.35µs
Latencies     [mean, 50, 95, 99, max]  246.75µs, 234.658µs, 275.63µs, 401.723µs, 7.323593ms
Bytes In      [total, mean]            530000, 53.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:10000
Error Set:

For information: for Lua I use github.com/yuin/gopher-lua

aantono commented 5 years ago

What exactly did your middleware do? It looks like the performance with the middleware is better than without it! I don't think that can be right, simply because a middleware is an add-on, so times should be either on par or higher, not 50-60µs lower. ;)

negasus commented 5 years ago

I can put the full code somewhere so everyone can check)

I think it is fluctuation. The result can be taken to mean that Lua JIT is quite fast.

aantono commented 5 years ago

I would not call it a fluke, as all the buckets tell the same story (with the plugin it runs faster). Does your plugin terminate the chain, or allow it to proceed to the backend?

I think a more indicative test would be to have a "blank" plugin that maybe only adds an extra header (like "X-Plugin-Applied:PluginName") and lets the rest of the chain proceed as it was... This way we can truly measure the performance cost.

It is also important to ensure that multiple concurrent executions do not step on each other. From what I've seen of Gopher-Lua, it is not concurrency-safe, so to make it safe one would have to instantiate and evaluate/interpret the code for EACH execution instead of doing it once, which would most likely kill the performance.

negasus commented 5 years ago

Oh yeah, oops, in this example the request is interrupted and 422 is returned. I use a pool of LuaState instances for concurrency safety.

New results. Lua script:

local http = require("http")
http.setRequestHeader("X-Token-Validate", "42")
http.setResponseHeader("X-Token-Validate-Result", "42")

WITHOUT MIDDLEWARE

Requests      [total, rate]            10000, 1000.09
Duration      [total, attack, wait]    9.999335833s, 9.99909s, 245.833µs
Latencies     [mean, 50, 95, 99, max]  246.914µs, 234.051µs, 304.517µs, 438.429µs, 6.876799ms
Bytes In      [total, mean]            530000, 53.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:10000
Error Set:

WITH MIDDLEWARE

Requests      [total, rate]            10000, 1000.09
Duration      [total, attack, wait]    9.999368718s, 9.999148s, 220.718µs
Latencies     [mean, 50, 95, 99, max]  263.39µs, 245.086µs, 298.501µs, 500.663µs, 10.133925ms
Bytes In      [total, mean]            550000, 55.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:10000
Error Set:

migueleliasweb commented 5 years ago

Hey everyone! I just wanted to brain-dump a few things here.

The best part I see in using something like gRPC instead of internal & language-specific implementations like go plugins or embedded solutions is that many other languages could be used to extend Traefik. I can see people using for example Rust, Erlang or even JVM based languages to extend Traefik in ways we will never be able to foresee.

I went through the PR that @thewilli mentioned and apparently latency was a really big topic there. It's interesting to notice that was the first try for this feature. I'm sure we will be able to achieve better results.

Some key things:

  1. As mentioned quite a few times in that thread, it's a "YMMV" kind of approach. Not having this feature is far worse than having an extra ~50ms in the majority of cases. IMHO, if my application is so time-sensitive that I can't bear an extra 50ms, I would be running a custom-built, highly optimized Traefik instance anyway. For all other cases, I'd rather have a small extra latency for the sake of having more features available in my arsenal.

  2. Having support for gRPC plugins would potentially open many new avenues: dynamic configuration of plugins, multi-language support, and so on.

  3. For me, the most important thing about supporting external plugins is enabling people to create just the piece of software they need in order to use Traefik, instead of creating a custom Traefik-based solution by changing the source code. This is especially good when upgrading Traefik!

canselcik commented 5 years ago

Personally I am not a fan of the external middleware support approach at all. At that point, why wouldn't one build a pipeline of proxies to achieve the same result with less back and forth?

That being said, Lua middleware with a good set of APIs would be quite awesome and versatile. Something like what Apache Traffic Server provides would be very desirable:

https://docs.trafficserver.apache.org/en/latest/admin-guide/plugins/lua.en.html

thewilli commented 5 years ago

> why wouldn't one build a pipeline of proxies to achieve the same result with less back and forth?

Well, isn't that exactly what a Middleware is about? IMHO the benefit of a Middleware compared to some external solution is that the surrounding solution provides some glue. Have you tried to configure Traefik (or any similar software) to route your traffic through some proxies? If you are able to do so easily, then you might as well close #878.

Again, what do we lose by choosing a generic solution? Provide a platform and let the users decide whether they want to write their Middlewares in Go, Lua, Java or Brainfuck. As for Java, I do know some who would want to integrate their existing Java-based codebase if they had a chance to do so.

negasus commented 5 years ago

Sorry, I could not resist and made an example for LuaScript) in this repo

santiagopoli commented 5 years ago

The Lua solution seems OK. Even if it adds a little latency, it is an opt-in feature. I also think that it is better to have a working solution than no solution at all, since a lot of people are asking for this feature. I have yet to test @negasus's solution, but it looks great!

The best-case scenario would be to allow creating middlewares as both scripts (Lua) and webhooks (gRPC, HTTP).

bitsofinfo commented 5 years ago

def would like this for integrating w./ something like modsecurity

guilhermeaiolfi commented 5 years ago

When we talk about plugins/middlewares, will it be possible for me to create a middleware that queries a database and, based on that information, decides which backend to use?

I would like to use Traefik to determine which Docker container to use depending on the client accessing it. For example, client-1 would access application container v1.0 and client-2 would access v2.0, with the choice depending on data stored in the database.
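To make the scenario concrete, this is roughly what such a middleware could look like in plain Go. It is not an existing Traefik feature: the `versionStore` map stands in for the database lookup, `X-Client-Id` is a made-up header for identifying the client, and the backends are simple handlers instead of real containers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// versionStore stands in for the database lookup (client ID -> app version).
// A map keeps the sketch self-contained.
type versionStore map[string]string

// routeByClient picks a backend based on the version stored for the client
// named in the X-Client-Id header (a hypothetical header), and falls back
// to a default backend for unknown clients.
func routeByClient(store versionStore, backends map[string]http.Handler, fallback http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		version := store[r.Header.Get("X-Client-Id")]
		if backend, ok := backends[version]; ok {
			backend.ServeHTTP(w, r)
			return
		}
		fallback.ServeHTTP(w, r)
	})
}

// named returns a fake backend that answers with its own name.
func named(name string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, name)
	})
}

// demo sends one request for the given client and returns the name of the
// backend that answered.
func demo(clientID string) string {
	router := routeByClient(
		versionStore{"client-1": "v1.0", "client-2": "v2.0"},
		map[string]http.Handler{"v1.0": named("app v1.0"), "v2.0": named("app v2.0")},
		named("default"),
	)
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Client-Id", clientID)
	rec := httptest.NewRecorder()
	router.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(demo("client-1")) // app v1.0
	fmt.Println(demo("client-2")) // app v2.0
	fmt.Println(demo("unknown"))  // default
}
```

In a real proxy the backends would be reverse proxies (for example from `httputil.NewSingleHostReverseProxy`), and the lookup would probably be cached rather than hitting the database on every request.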

Is there a workaround for that scenario currently?

torarnv commented 5 years ago

Here's one use case for a middleware that has a Lua solution for nginx: working around Safari's lack of client certificate support for websockets. Hopefully it would be possible to do this with Traefik's future plugin system:

local HMAC_SECRET = "hunter2"
local crypto = require "crypto"

-- Sign the client ID together with its expiry time.
local function ComputeHmac(msg, expires)
  return crypto.hmac.digest("sha256", string.format("%s%d", msg, expires), HMAC_SECRET)
end

local verify_status = ngx.var.ssl_client_verify

if verify_status == "SUCCESS" then
  -- A valid client certificate was presented: issue signed cookies valid for one hour.
  local client = crypto.digest("sha256", ngx.var.ssl_client_cert)
  local expires = ngx.time() + 3600

  ngx.header["Set-Cookie"] = {
    string.format("AccessToken=%s; path=/", ComputeHmac(client, expires)),
    string.format("ClientId=%s; path=/", client),
    string.format("AccessExpires=%d; path=/", expires)
  }
  return
elseif verify_status == "NONE" then
  -- No certificate (e.g. a websocket request from Safari): fall back to the signed cookies.
  local client = ngx.var.cookie_ClientId
  local client_hmac = ngx.var.cookie_AccessToken
  -- tonumber returns nil for a missing or non-numeric cookie, which is rejected below.
  local access_expires = tonumber(ngx.var.cookie_AccessExpires)

  if client ~= nil and client_hmac ~= nil and access_expires ~= nil then
    local hmac = ComputeHmac(client, access_expires)

    if hmac ~= "" and hmac == client_hmac and access_expires > ngx.time() then
      return
    end
  end
end

-- Everything else is rejected.
ngx.exit(ngx.HTTP_FORBIDDEN)

https://blog.christophermullins.com/2017/04/30/securing-homeassistant-with-client-certificates

torarnv commented 5 years ago

> But we have been working on something else in the meantime. Stay tuned ;)

@emilevauge Can you expand a bit on the plan you hinted at earlier? -^

Seems there are a few options:

andrewsav-bt commented 5 years ago

There is HashiCorp's gRPC-based plugin system here

ldez commented 5 years ago

@andrewsav-datacom https://github.com/containous/traefik/pull/2362