frigus02 / opentelemetry-application-insights

OpenTelemetry exporter for Azure Application Insights
MIT License

gzip compression for requests #47

Closed. frigus02 closed this 2 years ago.

frigus02 commented 2 years ago

See #46

Seems to work correctly. Tested using:

Todo:

frigus02 commented 2 years ago

I tried to see if this improves things. Here are some numbers.

First observation: even tiny requests take a few hundred milliseconds, so it seems to help a lot to increase the max export batch size. For these tests I used a very large max batch size to make the effect of the compression clearly visible:

```sh
export INSTRUMENTATION_KEY=???
export NUM_ROOT_SPANS=10000
export OTEL_BSP_MAX_QUEUE_SIZE=100000
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=20000
cargo run --example stress_test
```
| Test | Serialize (JSON + GZIP) | Send | Sent body size | Total runtime |
| --- | --- | --- | --- | --- |
| With compression (level 1) | 2.677s | 6.908s | 1089.28 KiB | 12.000s |
| With compression (level 6) | 3.454s | 7.655s | 784.73 KiB | 12.525s |
| Without compression | 0.604s | 8.130s | 13158.34 KiB | 10.102s |

Compression clearly reduces the body size, by roughly 16x at level 6. However, the time spent compressing is larger than the network time saved, at least for me.

This doesn't seem like a clear win to me, so maybe compression has to be opt-in for now.
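
For context, the compression path I'm measuring looks roughly like this. This is a simplified sketch rather than the exporter's actual code, and the `level` parameter just stands in for a possible opt-in setting:

```rust
use std::io::Write;

use flate2::{write::GzEncoder, Compression};

/// Gzip-compress an already-serialized request body.
/// `level` ranges from 0 to 9; levels 1 and 6 match the table above.
fn compress_body(json: &[u8], level: u32) -> std::io::Result<Vec<u8>> {
    let mut encoder = GzEncoder::new(Vec::new(), Compression::new(level));
    encoder.write_all(json)?;
    // finish() writes the gzip trailer and returns the inner Vec<u8>.
    encoder.finish()
}
```

The compressed body would then be sent with a `Content-Encoding: gzip` header.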

Side note: there is another possible improvement. Currently, serialization and send happen sequentially. It could help to use streams, so that the HTTP request can already start while the data is still being serialized. To make this work, opentelemetry-http would have to support that.
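
A rough sketch of that idea, assuming a hypothetical channel-backed writer (opentelemetry-http currently only accepts a fully built body, so this is illustration only):

```rust
use std::io::{self, Write};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Hands each serialized chunk to a channel. The receiving end could feed
/// an HTTP request body while serialization is still in progress.
struct ChannelWriter(SyncSender<Vec<u8>>);

impl Write for ChannelWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0
            .send(buf.to_vec())
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "receiver dropped"))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Serialize the batch on a background thread; the caller drains the
/// receiver while writing the HTTP request body.
fn serialize_in_background<T>(batch: T) -> Receiver<Vec<u8>>
where
    T: serde::Serialize + Send + 'static,
{
    let (tx, rx) = sync_channel(16);
    thread::spawn(move || {
        // Errors are dropped for brevity; a real implementation would
        // surface them to the exporter.
        let _ = serde_json::to_writer(ChannelWriter(tx), &batch);
    });
    rx
}
```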

birkmose commented 2 years ago

Hmm, this is interesting. Using the compress_file.rs example from the flate2 repo, compressing 2MB of JSON data takes 39ms on my machine. I also ran a compression test with 13MB of data, which took 238ms. Your example at the lowest compression level appears to take an order of magnitude longer. For reference, my tests ran on an AMD 3950X CPU.
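
Roughly, a timing harness along the lines of compress_file.rs looks like this (a sketch; the actual flate2 example differs slightly):

```rust
use std::env;
use std::fs::File;
use std::io::{self, BufReader};
use std::time::Instant;

use flate2::{write::GzEncoder, Compression};

fn main() -> io::Result<()> {
    // Pass the JSON payload to compress as the first argument.
    let path = env::args().nth(1).expect("usage: compress_timing <file>");
    let mut input = BufReader::new(File::open(&path)?);

    let start = Instant::now();
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    let source_len = io::copy(&mut input, &mut encoder)?;
    let compressed = encoder.finish()?;

    println!(
        "source: {} bytes, target: {} bytes, elapsed: {:?}",
        source_len,
        compressed.len(),
        start.elapsed()
    );
    Ok(())
}
```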

frigus02 commented 2 years ago

That is interesting. I'm going to test the flate2 example directly later to compare. Maybe I'm doing something stupid here.

frigus02 commented 2 years ago

Using compress_file.rs I get:

| Mode | Source len | Target len | Elapsed |
| --- | --- | --- | --- |
| debug | 13474859 | 785183 | 1.270050744s |
| release | 13474859 | 785183 | 104.022545ms |

I realised I hadn't run the tests with --release before. I also noticed that it's faster to serialize the JSON into a Vec first and then pass the bytes through the gzip encoder, rather than serializing directly into the encoder. Not sure why that is; both variants are sketched below. Either way, here are new numbers:

| Test | Serialize (JSON + GZIP) | Send | Total runtime |
| --- | --- | --- | --- |
| With compression (level 6) | 153ms | 6.986s | 7.832s |
| Without compression | 41ms | 8.916s | 9.636s |

This looks much more like what I expected.
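
For reference, the two serialization orders I compared look roughly like this (the helper names are made up for the sketch):

```rust
use std::io::{self, Write};

use flate2::{write::GzEncoder, Compression};

/// Variant A: serialize the batch into a Vec first, then gzip the bytes
/// in one go. This was the faster variant in my measurements.
fn serialize_then_compress<T: serde::Serialize>(batch: &T) -> io::Result<Vec<u8>> {
    let json = serde_json::to_vec(batch)?;
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(&json)?;
    encoder.finish()
}

/// Variant B: serialize directly through the encoder. serde issues many
/// small writes into the encoder here, and this was slower for me.
fn serialize_through_encoder<T: serde::Serialize>(batch: &T) -> io::Result<Vec<u8>> {
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    serde_json::to_writer(&mut encoder, batch)?;
    encoder.finish()
}
```

One plausible explanation is the many small writes in variant B; putting a std::io::BufWriter between serde and the encoder might close the gap, though I haven't measured that.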

frigus02 commented 2 years ago

That looks good to me. I'm going to make a new release with this.