akka / akka-http

The Streaming-first HTTP server/module of Akka
https://doc.akka.io/docs/akka-http
Other
1.34k stars 594 forks source link

Remove Deflate/GZip stream stages and use ones from akka-stream #2916

Open jrudolph opened 4 years ago

jrudolph commented 4 years ago

We might be able to remove the decompressor side but the compressor side still is public API right now.

jrudolph commented 4 years ago

At least make sure that all non-intended to be public classes be marked as deprecated / internal etc.

krestenkrab commented 4 years ago

FYI, we (humio) have a rewritten GZip stream stage for http using com.jcraft.jzlib because the JVM native zip implementation ends up holding up the garbage collector all the time (it uses the JNIEnv:: GetPrimitiveArrayCritical that ends up blocking global GC). Let us know if you want to incorporate it.

jrudolph commented 4 years ago

Thanks for the great comment, @krestenkrab. I didn't know about it. I guess it would mean that if you have code that uses Gzip a lot, your parallelism might suffer when you are also allocation bottlenecked. I guess that's not an unlikely situation as you will use Gzip often when you have lots of data (e.g. parsing big files). In fact, in some of my personal projects I saw a few cases of inexplicably low CPU saturation during streaming parsing that might be related to this.

Another thing that would be enabled by using jzlib would be the WebSocket compression extension support (#659) where some gzip limits need to be configured that aren't exposed by the JDK.

So, initial support for this would be very welcome and would ideally go into akka-stream directly. It seems jzlib has a Deflate-like interface, so it might almost be a drop-in replacement for the JDK? If there's no good reason to keep the JDK implementation it would then be easy. Otherwise, we'd have to build an interface to support both JDK and jzlib at the same time.

Do you know of any performance benchmarks? (I would expect similar performance, low-level JVM translation might be slightly slower (in some cases) but JNI has it's own overhead).

jrudolph commented 4 years ago

@raboof did you get anywhere with this already? I also started a branch a few months ago. The basic blocker I found, is that the implementation classes Compressor and Decompressor classes are public right now. The other thing is that Encoder has strict methods that cannot be implemented by a streams version without blocking.

One suggestion would be:

The more conservative approach to the first step would be to deprecate and keep the implementation classes for another version.

jrudolph commented 4 years ago

Another feature we have in the Deflate decompressor which might be missing in upstream is that it can probe for the zlib wrapping. We would have to port that feature to the akka-streams version.

raboof commented 4 years ago

@raboof did you get anywhere with this already?

I don't think so ;)

raboof commented 4 years ago

The more conservative approach to the first step would be to deprecate and keep the implementation classes for another version.

I think I'd be fine with that...

jrudolph commented 4 years ago

3262 deprecates / internalizes more parts of the infrastructure. As we cannot remove most of the infrastructure any way and there's no suitable replacement for deflate, maybe that's as much as we can do for 10.2.x.

jrudolph commented 4 years ago

3294 deprecates even more, so that no implementations should be exposed from public non-deprecated API any more.

This will allow us to move on in the next release. #3293 is an example how we could already use Gzip from akka-stream. Deflate from akka-stream is missing the feature to detect the wrapping method, so that would have to be ported to akka-stream first.

It probably doesn't make sense to pursue this goal further with 10.2.0 as it would be confusing to use akka-stream implementations by default but still had to keep the old implementations around for compatibility reasons. That would only make sense when it would enable new features.