zalando / logbook

An extensible Java library for HTTP request and response logging
MIT License
1.85k stars 260 forks source link

Response decompression in Apache Http Client 5 #1112

Open aleksey-petrov-github opened 3 years ago

aleksey-petrov-github commented 3 years ago

There is known issue (#135) : The Logbook HTTP Client integration is handling gzip-compressed response entities incorrectly if the interceptor runs before a decompressing interceptor. Since logging compressed contents is not really helpful it's advised to register the logbook interceptor as the last interceptor in the chain.

This advice is not suitable for Apache Http Client 5, because ResponseContentEncoding interceptor now is implemented as ContentCompressionExec handler which is called after interceptors. So the logged content is compressed and also body filters are broken because they test compressed content.

For my project workaround is to implement strategy, that wraps response with decompressing logic:

package a.b.c;

import org.apache.hc.client5.http.entity.DeflateInputStreamFactory;
import org.apache.hc.client5.http.entity.GZIPInputStreamFactory;
import org.apache.hc.client5.http.entity.InputStreamFactory;
import org.zalando.logbook.HttpRequest;
import org.zalando.logbook.HttpResponse;
import org.zalando.logbook.Strategy;

import java.io.IOException;
import java.util.Map;

public class ResponseDecompressingStrategy implements Strategy {

    private static final Map<String, InputStreamFactory> DECODER_REGISTRY = Map.of(
            "gzip", GZIPInputStreamFactory.getInstance(),
            "x-gzip", GZIPInputStreamFactory.getInstance(),
            "deflate", DeflateInputStreamFactory.getInstance());

    @Override
    public HttpResponse process(HttpRequest request, HttpResponse response) throws IOException {
        return new DecompressingHttpResponse(response.withBody(), DECODER_REGISTRY);
    }
}
package a.b.c;

import lombok.AllArgsConstructor;
import lombok.NonNull;
import org.apache.hc.client5.http.entity.InputStreamFactory;
import org.apache.hc.core5.http.HttpHeaders;
import org.zalando.logbook.ForwardingHttpResponse;
import org.zalando.logbook.HttpResponse;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

@AllArgsConstructor
public class DecompressingHttpResponse implements ForwardingHttpResponse {

    @NonNull
    private final HttpResponse response;

    @NonNull
    private final Map<String, InputStreamFactory> decoderRegistry;

    @Override
    public HttpResponse delegate() {
        return response;
    }

    @Override
    public byte[] getBody() throws IOException {
        return getDecompressedBody();
    }

    @Override
    public String getBodyAsString() throws IOException {
        return new String(getDecompressedBody(), getCharset());
    }

    private byte[] getDecompressedBody() throws IOException {
        byte[] bytes = response.getBody();

        if (bytes.length <= 0) {
            return bytes;
        }

        Optional<InputStreamFactory> inputStreamFactory = Optional.ofNullable(response.getHeaders().get(HttpHeaders.CONTENT_ENCODING))
                .orElseGet(Collections::emptyList)
                .stream()
                .filter(decoderRegistry::containsKey)
                .map(decoderRegistry::get)
                .findFirst();

        if (inputStreamFactory.isEmpty()) {
            return bytes;
        }

        return inputStreamFactory.get().create(new ByteArrayInputStream(bytes)).readAllBytes();
    }
}

It might be not suitable for any reason for others. And also it is not optimal because there is no information about response that helps to define that decompression is needed inside strategy - no access to body length and content encoding header at this step, so decision is made in DecompressingHttpResponse. But strategy is the only place to wrap before FilteredHttpResponse.

In my opinion there are 2 ways to handle decompression:

  1. Reimplement interceptor in logbook-httpclient5 as ExecChainHandler and add it last to exec chain, but seems it is not suitable place for logging, but affects only logbook-httpclient5 module.
  2. Allow Logbook creation with arbitrary decoder registry, which will be applied before response filters in DefaultLogbook and will wrap response with DecompressingHttpResponse passing appropriate decoder (concrete entry value of registry map). But it affects logbook-api and logbook-core modules.

Inline old ResponseContentEncoding interceptor and explicitly include it in common chain for Http Client 5 (like it done for OkHttp* #386) seems not proper way also.

Interesting to know your opinion first before any pull requests.

whiskeysierra commented 3 years ago

@aleksey-petrov-github Thank you very much for your detailed investigation! 📣

I'd be in favor of option 1, just because it wouldn't affect api and core modules.

SimplicialCycle commented 2 years ago

@whiskeysierra I hope you like the fix.

github-actions[bot] commented 1 year ago

In order to prioritize the support for Logbook, we would like to check whether the old issues are still relevant. This issue has not been updated for over six months.

mikhail-putilov commented 1 year ago

Hi :slightly_smiling_face:

I am currently using the latest Logbook and HttpClient5. I noticed that the fix from above works well but has one critical undocumented downside: An additional exec chain .addExecInterceptorFirst("Logbook", new LogbookHttpExecHandler(logbook)) indeed fixes the problem with compression but it does not log anything that is mutated by .addRequestInterceptorFirst(<any interceptor>).

This is important downside that must be at least documented in README, in my opinion. The usual way of building an HTTP client instance is to use HttpClientBuilder the following way:

HttpClients.custom()
                .setDefaultHeaders(List.of(new BasicHeader("name", "value")))
                .setUserAgent("asd")
                .setConnectionManager(...)
...

And in this case, neither user agent, nor default headers are logged by Logbook because of exactly that. setDefaultHeaders and setUserAgent invocations add addRequestInterceptorFirst to PROTOCOL chain inside of HttpClient.