caddyserver / replace-response

Caddy module that performs replacements in response bodies
Apache License 2.0
98 stars 27 forks source link

Support placeholders #3

Open mheyman opened 3 years ago

mheyman commented 3 years ago

Currently, placeholders are not supported in the search or replacement text. It would be really nice to be able to use them.

francislavoie commented 3 years ago

The problem with placeholders is that they're dynamic. The regexp are compiled ahead of time for performance. If they weren't, then it would be much slower.

While the replacements don't need to be compiled, they are prepared ahead of time to be passed to https://pkg.go.dev/golang.org/x/text/transform which performs the actual replacements (via https://github.com/icholy/replace which implements the regexp part). It might be possible to use replace.RegexpStringFunc instead of replace.RegexpString, but the tricky bit is getting the request into this function, @mholt might have ideas.

mholt commented 3 years ago

Yeah, so, we can add placeholders BUT, as Francis said, the values are compiled ahead of time (so we could only perform replacements on things like env variables and other, non-HTTP-request-specific variables), for performance reasons. Is that sufficient? What's your actual use case?

mheyman commented 3 years ago

My use case has to do with, among other things modifying javascript returned through a reverse proxy. It does require HTTP-request specific variables but no regex - although regex would help :-)

My case, there will probably never be more than 30 versions of a regex so lazy creation of the compiled regex and keeping a lru capped at 100 or 1000 of them would be more than enough for our code.

mholt commented 3 years ago

@mheyman To clarify, all replacements are processed at config load (startup), not at every request, whether using regular expressions or not:

https://github.com/caddyserver/replace-response/blob/9d5652c0256308fddaef1453d463d2a281498cb6/handler.go#L88-L99

Unless there's a clever solution I haven't thought of, it's likely that implementing this feature will cause a performance regression.

mheyman commented 3 years ago

I would hope there would be no performance regression because in the case where the information is available at config load, no change in ServeHTTP() is required. But, if using placeholders, a different (slower-running) ServeHTTPWithPlacehoders() stage could do the transform in a more flexible way - perhaps with an lru of compiled-as-needed regex (I'm not a go coder so I wouldn't presume to know the best way to go about this).

In our use case, we probably wouldn't notice the slowdown this alternative middleware would cause over the fast path because we only have a few users hitting the pages over an already-slow link. Also, like I mentioned earlier, there would only be 30ish different possible searches in our case so a reasonably sized LRU cache would easily hold all our possible searches. We can code these 30ish search strings manually in the current implementation but it means rewriting the config and restarting the reverse proxy when any of the 30ish proxied services change (happens often). Additionally, we may not always have insight into when things change except when things break. And we'd rather not run with that method of operation :-)

mholt commented 3 years ago

I guess that's technically possible, but it adds a lot of complexity.