Open jberryman opened 2 years ago
:thinking: Indeed, the time spent decompressing the body will be measured, simply because the body is decompressed on-the-fly while it's being read from the http.Response's Body (which is an io.ReadCloser and likely a stream) by this function.
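To illustrate the mechanics, here is a minimal sketch in plain Go (not the actual k6 code; the URL is just an example endpoint that serves gzip): once the response body is wrapped in a gzip reader, every read pulls compressed bytes off the wire and decompresses them in the same call, so any timer around the body read inevitably includes decompression.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Ask for a gzip body and disable the transport's transparent
	// decompression so we handle (and can observe) it ourselves.
	req, err := http.NewRequest("GET", "https://httpbin.org/gzip", nil) // example endpoint
	if err != nil {
		panic(err)
	}
	req.Header.Set("Accept-Encoding", "gzip")

	client := &http.Client{Transport: &http.Transport{DisableCompression: true}}
	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	start := time.Now()

	// Wrapping resp.Body: every Read on gz pulls compressed bytes off the
	// wire and decompresses them in the same call.
	gz, err := gzip.NewReader(resp.Body)
	if err != nil {
		panic(err)
	}
	body, err := io.ReadAll(gz)
	if err != nil {
		panic(err)
	}

	// Network read time and decompression time are interleaved in one stream,
	// so this single duration necessarily contains both.
	fmt.Printf("read+decompress took %v for %d bytes\n", time.Since(start), len(body))
}
```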
I am wondering, though: this decompression should happen pretty quickly on modern machines :thinking: I see we are using a mixture of github.com/klauspost/compress, github.com/andybalholm/brotli, and the stdlib's compress/gzip and compress/zlib. So maybe we should switch everything we can to the very optimized klauspost/compress library and run some benchmarks :thinking:
https://github.com/grafana/k6/blob/3e88e18166d90ddf5ca395a6e07b47cb221255c0/lib/netext/httpext/compression.go#L151-L159
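For context, klauspost/compress mirrors the stdlib API, so the gzip/deflate paths could likely be swapped with little more than an import change. A rough benchmark sketch (the payload and package layout are made up, this is not actual k6 code) to compare the two backends could look like this, in a *_test.go file:

```go
package compression

import (
	"bytes"
	stdgzip "compress/gzip"
	"io"
	"testing"

	kpgzip "github.com/klauspost/compress/gzip"
)

// makePayload builds a gzip-compressed blob for the decompression benchmarks.
func makePayload(b *testing.B) []byte {
	b.Helper()
	var buf bytes.Buffer
	w := stdgzip.NewWriter(&buf)
	w.Write(bytes.Repeat([]byte("a moderately repetitive response body "), 50000))
	w.Close()
	return buf.Bytes()
}

func BenchmarkStdlibGzip(b *testing.B) {
	payload := makePayload(b)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		r, err := stdgzip.NewReader(bytes.NewReader(payload))
		if err != nil {
			b.Fatal(err)
		}
		io.Copy(io.Discard, r)
		r.Close()
	}
}

func BenchmarkKlauspostGzip(b *testing.B) {
	payload := makePayload(b)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Same API shape as compress/gzip, so it can act as a drop-in replacement.
		r, err := kpgzip.NewReader(bytes.NewReader(payload))
		if err != nil {
			b.Fatal(err)
		}
		io.Copy(io.Discard, r)
		r.Close()
	}
}
```

Running `go test -bench . -benchmem` on something like that, with bodies representative of real responses, would show whether the swap actually moves the needle.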
I also see you are using a pretty old k6 version (v0.34.0, released ~1y ago). I suggest updating to the latest one (v0.40.0, released a couple of weeks ago) and seeing if you notice any improvements. In general, on-the-fly decompression should be pretty unnoticeable if you haven't undersized the machine k6 will run on.
In any case, to get back on topic - changing how k6 does the http_req_duration measurement would be both a breaking change (a new k6 version would start measuring the same things a bit differently) and have negative performance implications. We'll need more RAM for every compressed response, since we'll first need to read the compressed body from the wire, store it, and then decompress it separately. Given both of these negative consequences (breaking change + performance regression), I don't think we'll ever merge something like this in the current k6/http API.
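To make the RAM point concrete, here is a hedged sketch (plain Go, a hypothetical helper, not k6 code) of the buffer-then-decompress approach that separate measurements would require; note that the compressed buffer and the decompressed body coexist in memory:

```go
// Package sketch is illustrative only; none of this is existing k6 code.
package sketch

import (
	"bytes"
	"compress/gzip"
	"io"
	"net/http"
	"time"
)

// fetchThenDecompress shows the non-streaming approach: buffer the whole
// compressed body first, stop the "request duration" clock, then decompress
// from the in-memory copy. Both the compressed buffer and the decompressed
// result are alive at the same time, which is where the extra RAM per
// response comes from.
func fetchThenDecompress(resp *http.Response) (body []byte, readDur, decompDur time.Duration, err error) {
	readStart := time.Now()
	compressed, err := io.ReadAll(resp.Body) // full compressed copy held in memory
	readDur = time.Since(readStart)
	if err != nil {
		return nil, readDur, 0, err
	}

	decompStart := time.Now()
	gz, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, readDur, 0, err
	}
	defer gz.Close()
	body, err = io.ReadAll(gz) // second, decompressed copy
	decompDur = time.Since(decompStart)
	return body, readDur, decompDur, err
}
```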
That said, we plan to create a new k6 HTTP API (https://github.com/grafana/k6/issues/2461) that addresses some of the many issues with the current one, hopefully soon... :crossed_fingers: That's why I won't close this issue outright - we might figure out some way to address it and satisfy your use case without making performance or UX compromises.
The big problem I can see here is that k6 automatically decompresses the response bodies and there is no way to disable that behavior. If that part is configurable and can be disabled, then the measurement of http_req_duration will be what you want. But, I assume you still need to access the uncompressed bodies, otherwise you'd have just used discardResponseBodies? So, if we add a way to disable the automatic body decompression in the new HTTP API, then we'll also need to add an optimized API (i.e. Go code, not pure JS) for manually decompressing them afterwards. Which might not be a bad idea for other reasons, but it's still one more API we'll need to maintain, so more complex than just adding an option somewhere... :thinking:
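If that route is taken, the Go side of such an API could plausibly be a small dispatch on the Content-Encoding value; the package, name, and signature below are hypothetical, just to show the shape:

```go
// Package decompress sketches what a Go-backed manual decompression helper for
// the new HTTP API could look like. Nothing like this exists in k6 today.
package decompress

import (
	"bytes"
	"fmt"
	"io"

	"github.com/andybalholm/brotli"
	"github.com/klauspost/compress/gzip"
	"github.com/klauspost/compress/zlib"
)

// Body decompresses a previously buffered response body according to the
// response's Content-Encoding value.
func Body(encoding string, data []byte) ([]byte, error) {
	src := bytes.NewReader(data)
	var r io.Reader
	switch encoding {
	case "":
		return data, nil // not compressed, nothing to do
	case "gzip":
		gz, err := gzip.NewReader(src)
		if err != nil {
			return nil, err
		}
		defer gz.Close()
		r = gz
	case "deflate":
		zr, err := zlib.NewReader(src)
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		r = zr
	case "br":
		r = brotli.NewReader(src)
	default:
		return nil, fmt.Errorf("unsupported Content-Encoding %q", encoding)
	}
	return io.ReadAll(r)
}
```

Something along those lines would keep the heavy lifting in Go while letting scripts opt in to decompression only when they actually need the body.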
Still, all of this is worth considering, so thank you for opening this issue! :bow:
on-the-fly decompression should be pretty unnoticeable
I'll just reiterate that this isn't the case for our API. We saw some test cases get what looked like 15% slower with Accept-Encoding: gzip; microbenchmarks showed libdeflate ought to be twice as fast as zlib for all our response bodies, yet the k6 results showed only a 2-3% improvement when migrating to libdeflate.
Brief summary
It appears that k6 includes the time it takes to decompress a gzip'ed response body in http_req_duration when not using discardResponseBodies. This cost me a lot of debugging time, as I was attempting to optimize our compressed code path and wasn't seeing the results from k6 align with my microbenchmarks.
The time to decompress a response body is useful information, though (one might be trying to optimize for page load time if serving HTML, for example), but otherwise I think this is a surprising default. Can I suggest reporting it separately?
loosely related: #2586
k6 version
0.34.0
OS
linux
Docker version and image (if applicable)
No response
Steps to reproduce the problem
Make a large request with Accept-Encoding: gzip, then try with discardResponseBodies.
Expected behaviour
Don't time body decompression or validation, or include it separately.
Actual behaviour
Decompression is included in reported latency.