ldarren closed this issue 3 years ago
Thanks for the feature request; this sounds like a good fit for a plugin, but I see no reason why it needs to be in the main repository. (After 6 years, this is the first request we've had for it.)
Anyone is of course welcome to implement this as a plugin and register it on our website.
Honestly @mholt this is dead-easy to implement, I don't see why we don't just add it. Basically it's just an additional `go h.Transport.RoundTrip(req)` for each mirror, ignoring the results.
Other servers have this built-in, so I don't see the harm in adding it in Caddy as well.
I think it would be much more complex to implement in a plugin, because the plugin would have to re-implement parts of the reverseproxy module just to insert it in the right spot in the handler.
I can think of two ways to do this:
1. Like everyone else does, which is a prime candidate for a transport module. I think it would only need to wrap other transports and call their RoundTrip methods.
2. As an HTTP handler, which tees/splits the request handling chain into concurrent branches. One branch would continue down the request's original goroutine, while the others deal with copies of the request concurrently. This is probably more powerful than what other servers do, as it allows you to do more concurrently than simply proxy to different backends.
I also do not think doing this correctly is as "dead-easy to implement" as you suggest, since concurrency is involved. We would have to be careful to avoid leaking resources, for example.
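To make option 1 concrete, here is a minimal sketch of a wrapping transport, assuming a plain `http.RoundTripper` rather than Caddy's actual module API; `mirrorTransport`, `runDemo`, and the field names are all hypothetical, and body handling is deliberately elided (it comes up later in the thread):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// mirrorTransport wraps a primary transport and fires a copy of each request
// at every mirror host, discarding the mirror responses. Hypothetical names;
// this is an illustration of "option 1", not Caddy code.
type mirrorTransport struct {
	primary http.RoundTripper
	mirrors []string // host:port of each mirror backend
}

func (t *mirrorTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	for _, m := range t.mirrors {
		cp := req.Clone(req.Context())
		u := *req.URL
		u.Host = m
		cp.URL = &u
		go func(r *http.Request) {
			if resp, err := t.primary.RoundTrip(r); err == nil {
				io.Copy(io.Discard, resp.Body) // drain and ignore the mirror's reply
				resp.Body.Close()
			}
		}(cp)
	}
	// Only the primary's response is returned to the client.
	return t.primary.RoundTrip(req)
}

// runDemo wires up one primary and one mirror test server and shows that the
// client sees only the primary's response while the mirror still gets a copy.
func runDemo() string {
	hits := make(chan string, 1)
	primary := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "primary")
	}))
	defer primary.Close()
	mirror := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		hits <- "mirror got " + r.URL.Path
	}))
	defer mirror.Close()

	mt := &mirrorTransport{
		primary: http.DefaultTransport,
		mirrors: []string{mirror.Listener.Addr().String()},
	}
	req, _ := http.NewRequest("GET", primary.URL+"/ping", nil)
	resp, err := mt.RoundTrip(req)
	if err != nil {
		return err.Error()
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	return string(body) + "|" + <-hits
}

func main() {
	fmt.Println(runDemo()) // primary|mirror got /ping
}
```

Note that this naive version spawns one goroutine per mirror per request with no bound, which is exactly the leak concern raised below.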
I was thinking just like this, on the handler:

```
reverse_proxy <primary_upstream> {
	mirror <mirror_upstreams...>
}
```

Then in reverseproxy.go, inside h.reverseProxy():

```go
res, err := h.Transport.RoundTrip(req)
for _, mirror := range h.Mirror {
	reqCopy := req.Copy() // however this is done
	go h.Transport.RoundTrip(reqCopy)
}
```
But you have a better sense for the concurrent parts :man_shrugging:
Yikes, that'll leak goroutines like crazy, especially if a backend is slow or the network is having trouble or something.
Suppose requests come in at 1,000 per second, and each one spawns 2 goroutines for mirroring; with the request's own goroutine that's 3,000 goroutines per second. Ok, but what if those goroutines each take 5 seconds to run? Now you've got, what, 5x as many goroutines, so like 15,000? :grimacing: (This assumes they terminate at all.)
FWIW, that code is basically option 1 I suggested above.
> Ok, but what if those goroutines each take 5 seconds to run. Now you've got, what, 5x as many goroutines, so like 15,000?

I'd put an aggressive default 2s timeout on them. We don't really care to wait for a response; just fire it off and forget it.
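The timeout idea can be sketched with a context deadline, assuming plain `net/http`; `fireAndForget` and the `done` channel are hypothetical (the channel exists only so the demo is observable, a real fire-and-forget would report nothing):

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// fireAndForget runs a mirror request in its own goroutine under a context
// deadline, so a slow or dead mirror can only hold a goroutine for at most
// `timeout`. Hypothetical helper illustrating the aggressive-timeout idea.
func fireAndForget(req *http.Request, timeout time.Duration, done chan<- error) {
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		resp, err := http.DefaultTransport.RoundTrip(req.Clone(ctx))
		if err == nil {
			resp.Body.Close()
		}
		done <- err // observable only for the demo; a real impl would drop this
	}()
}

// runDemo shows that a mirror taking 500ms cannot pin a goroutine past a
// 50ms budget: the request is abandoned at the deadline.
func runDemo() bool {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(500 * time.Millisecond)
	}))
	defer slow.Close()

	done := make(chan error, 1)
	req, _ := http.NewRequest("GET", slow.URL, nil)
	fireAndForget(req, 50*time.Millisecond, done)
	return <-done != nil // true: context deadline exceeded
}

func main() {
	fmt.Println("abandoned on timeout:", runDemo())
}
```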
> FWIW, that code is basically option 1 I suggested above.

Not really: the code I wrote above builds it into the handler layer, not the transport layer. But it could live in a wrapped transport, sure.
Just for the sake of reference, Traefik's impl: https://github.com/traefik/traefik/blob/master/pkg/server/service/loadbalancer/mirror/mirror.go. They use a `sync.Pool` for this, which would obviously help cap it to more reasonable amounts. Requests can simply be dropped if they come in too fast for the pool.
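The "drop when it's going too fast" behavior can be sketched with a bounded worker pool; this is a hypothetical illustration using a channel as a semaphore (names `mirrorPool`/`tryMirror` and the slot count are made up), not Traefik's actual `sync.Pool`-based code:

```go
package main

import "fmt"

// mirrorPool bounds how many mirror jobs may run at once. When every slot is
// busy, new jobs are dropped instead of spawning unbounded goroutines.
// Hypothetical sketch; the slot count is arbitrary.
type mirrorPool struct {
	slots chan struct{}
}

func newMirrorPool(size int) *mirrorPool {
	return &mirrorPool{slots: make(chan struct{}, size)}
}

// tryMirror returns false (dropping the job) rather than blocking when the
// pool is saturated.
func (p *mirrorPool) tryMirror(job func()) bool {
	select {
	case p.slots <- struct{}{}:
		go func() {
			defer func() { <-p.slots }() // free the slot when the job ends
			job()
		}()
		return true
	default:
		return false
	}
}

// runDemo saturates a 2-slot pool and shows the third job being dropped.
func runDemo() bool {
	p := newMirrorPool(2)
	block := make(chan struct{})
	defer close(block)
	p.tryMirror(func() { <-block })
	p.tryMirror(func() { <-block })
	return p.tryMirror(func() {}) // false: both slots busy
}

func main() {
	fmt.Println("third accepted:", runDemo())
}
```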
But sure, I realize now this can probably be done in a transport plugin, probably without too much difficulty.
I do like the split idea though; we could revisit that at some point.
Related discussions from https://caddy.community/
(Edited yes, removed that comment, I'm sorry for being the asshole of the day)
@joshxyzhimself consider sponsoring Caddy, and Matt may be able to prioritize implementing this (probably as a plugin initially).
Yeah, I can probably work on this with a sufficiently-tiered sponsorship (let me know if you're interested in that) -- it's just not a super high priority right now. But I could review a PR!
Plus, I don't feel particularly motivated to implement a feature if people are just going to use another product instead :upside_down_face:
Updated my comment above, sorry for being the asshole of the day. Totally unnecessary, my bad.
@joshxyzhimself Oh no worries, I don't view it like that, I just wasn't sure whether the effort would actually be used or not.
But yeah, this'll take some effort -- and I'm happy to do it -- just not a priority right now unless a company wants to sponsor it. :smiley: (Or I could review a PR for free!)
👋 hello! Sorry to bring up an old (and closed) issue, but I find myself in need of this feature.
First off, I want to say thanks so much for caddy - it's so simple to get started and the performance is outstanding. Additionally, it's the only reverse proxy I could find which properly load balances across multiple IPs returned from a single A record (dynamic upstreams) while also supporting keepalives ⭐
I am thinking of taking a stab at implementing a mirroring feature but wanted to check if you still think it's better off as a plugin or if it would be a good contribution to the main repo?
Depending on your needs, you could use `forward_auth`, which works somewhat similarly: it sends the request URL & headers as-is to an upstream as a GET with no body. (In fact, it's just a shortcut for a longer `reverse_proxy` config.)
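For reference, a minimal `forward_auth` snippet looks something like this (the upstream address and `uri` path are placeholders, not anything from this thread):

```caddyfile
forward_auth localhost:9000 {
	uri /validate
}
```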
If you need to send the body to all servers though, that's harder because the body is a stream, so you'd need to either buffer the request body or tee it somehow.
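The "buffer the request body" option can be sketched like this; `cloneWithBufferedBody` and `runDemo` are hypothetical helpers, and a real tee would stream rather than reading everything into memory:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// cloneWithBufferedBody reads the request body into memory once, then gives
// the original request and each copy its own reader over the same bytes.
// This trades memory for simplicity; a streaming tee would avoid the buffer.
func cloneWithBufferedBody(req *http.Request, n int) ([]*http.Request, error) {
	buf, err := io.ReadAll(req.Body)
	if err != nil {
		return nil, err
	}
	req.Body.Close()
	req.Body = io.NopCloser(bytes.NewReader(buf)) // restore for the primary
	copies := make([]*http.Request, n)
	for i := range copies {
		c := req.Clone(req.Context())
		c.Body = io.NopCloser(bytes.NewReader(buf)) // independent reader
		copies[i] = c
	}
	return copies, nil
}

// runDemo shows the primary and both mirror copies each reading the full body.
func runDemo() string {
	req, _ := http.NewRequest("POST", "http://example.invalid/", strings.NewReader("hello"))
	copies, err := cloneWithBufferedBody(req, 2)
	if err != nil {
		return err.Error()
	}
	var parts []string
	for _, r := range append([]*http.Request{req}, copies...) {
		b, _ := io.ReadAll(r.Body)
		parts = append(parts, string(b))
	}
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(runDemo()) // hello,hello,hello
}
```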
It probably doesn't need to be part of `reverse_proxy`; I don't think it needs to be tightly integrated with it, unless I misunderstand the purpose. So yeah, a plugin probably makes sense; we'd rather not maintain more features than we have to.
Thanks for the background, @pfrybar!
I agree a plugin approach would be a good start.
I'd implement it as a transport plugin; it could embed/use the HTTP transport but add a hopefully-thin layer that mirrors/shadows as configured.
Cool, sounds great. Thanks for the quick response! In my case the body needs to be sent to all servers as well.
I'll give it a try as a plugin. Makes sense to try and limit the number of features you need to maintain :)
Happy to be an alpha tester; I'm in need of this feature as well.
A mirroring/shadowing feature for caddyserver would be very useful for testing new code and troubleshooting.
References:
- http://nginx.org/en/docs/http/ngx_http_mirror_module.html
- https://doc.traefik.io/traefik/routing/services/#mirroring-service
- https://istio.io/latest/docs/tasks/traffic-management/mirroring/