Closed kiplingw closed 8 months ago
This is reasonable, so long as it can be disabled at runtime as well.
To disable at runtime user would do nothing. If they hadn't configured ResponseWriter
to do anything different, then its default behaviour would be to simply not compress what it's provided.
This sounds like a more then reasonable request, and I support its implementation. Still, I have heard multiple times that TLS + compression can create significant security issues, even though I am not sure how or why. We should definitely investigate - or, in case you already know why this can happen, please write it here so that I can better understand the issue :)
@Ne02ptzero, since you've implemented TLS support for pistache in the past, do you know what issues compression with encryption can create?
I did a bit of research this morning and it looks like the security issue arose when using integrated compression with TLS. But I don't think this is a problem for us since we're proposing to handle it at the Pistache library layer (although I'll do more research).
It also looks like we already disable compression in OpenSSL, even if the user requested it to mitigate BREACH and CRIME vulnerabilities.
Alright, so this shouldn't create issues for us since we'd be using compression on top of TLS, right?
We'd be compressing the data before it was fed into OpenSSL, so yes.
My PR is only a partial fix. But I realized belatedly that it's also partially broken.
If the client requests multiple different compression schemes in Accept-Encoding, like gzip, br, deflate, and perhaps even with q-factors, the patch as is expects but one compression scheme (at most) and no q-factor.
I was looking at what we could recycle that parses those kinds of headers. The closest I could find is the MediaType
class but it's specific to media types even though it can almost parse the Accept-Encoding format correctly.
I'm a bit surprised that the only header that implements multiple values with q-factors is that header, despite there being many that a client could submit.
My PR is only a partial fix. But I realized belatedly that it's also partially broken.
If the client requests multiple different compression schemes in Accept-Encoding, like gzip, br, deflate, and perhaps even with q-factors, the patch as is expects but one compression scheme (at most) and no q-factor.
Thanks for noticing. This isn't an issue in practice yet since we only support one encoding (deflate/zlib), but we should definitely fix this.
I was looking at what we could recycle that parses those kinds of headers. The closest I could find is the
MediaType
class but it's specific to media types even though it can almost parse the Accept-Encoding format correctly.
I've submitted a patch before about qvalues, so I've already had a look at that part of the codebase. I'll look into this while adding Brotli support.
I'm a bit surprised that the only header that implements multiple values with q-factors is that header, despite there being many that a client could submit.
I'm not sure I get what you mean, but qvalues are used with the Accept
header, among others.
Thanks for noticing. This isn't an issue in practice yet since we only support one encoding (deflate/zlib), but we should definitely fix this.
Well true that we only support deflate, but the point is the client could have listed any number of encodings we don't support - in addition to deflate. As it is right now Pistache won't be able to select the only one that it does.
I've submitted a patch before about qvalues, so I've already had a look at that part of the codebase. I'll look into this while adding Brotli support.
Good idea.
I'm a bit surprised that the only header that implements multiple values with q-factors is that header, despite there being many that a client could submit.
I'm not sure I get what you mean, but qvalues are used with the
Accept
header, among others.
Oh I agree. What I'm saying is that I'm surprised that in Pistache Accept is the only one that supports it despite, as you pointed out, many headers use that format.
I've been looking a bit at this, and I think that your patch itself is "OK". The newly added code doesn't have issues with q-values and multiple compression schemes, because it simply doesn't check for them: it leaves the job to the user.
Since you've only added a setCompression()
method it is the user who decides if and when to enable compression (like showed in the ContentEncodingHandler::onRequest()
function used in the new test), so, with the current API, it is up to the user to parse q-values in the onRequest()
function and decide which compression algorithm to enable.
What I think would make sense is to add a helper function similar to this:
Http::Header::Encoding getBestEncoding(const Http::Request& request) {
/* get client's preferred encodings, sorted by preference, accounting for q-values */
const std::array<Http::Header::Encoding, N> preferredEncodings = getPreferredEncodings(request);
/* return the first encoding that we support */
for (const auto encoding : preferredEncodings) {
if (supported(encoding)) {
return encoding;
}
}
}
User code would then call writer.setCompression(getBestEncoding(request))
in onRequest()
, which would set the proper content encoding.
Yes, I think that's almost the way to implement it @Tachi107. But I would suggest renaming getBestEncoding
to getBestAcceptEncoding
and getPreferredEncodings
to getAllAcceptEncodings
so that it's more clear that we're talking about the client's requested encodings.
Do you want to give it a go?
I'm also thinking that this should be made more generic because the q-value mechanism is used in HTTP headers in more places than just the Accept-Encoding
header.
And one other thing that occurred to me this morning, to ensure we don't break user code. If none of the user's requested encodings are available (e.g. such as all codecs disabled at configure time), then it should fall back safely to the identity
encoding scheme as explained here.
Hello everyone,
I'd like to put a feature request in as a starting point for discussion before I spend time implementing it.
Currently many clients, including Python's request library, by default send an Accept-Encoding header requesting at least
gzip
anddeflate
(the two most common codecs). The idea of clients expecting the server to be capable of sending a compressed response is so common it would not make sense for us to make every Pistache application go and re-implement the necessary changes when this could be implemented directly in the library.What I was thinking was the following:
ResponseWriter
to ensure whatever user passes itssend()
method is automatically compressed;Thoughts?