lz4 / lz4-java

LZ4 compression for Java
Apache License 2.0
1.1k stars 253 forks source link

Allows changing lz4hc compression level #43

Closed linnaea closed 9 years ago

linnaea commented 10 years ago

This pull request fixes issue #37.

The modified test case only checks for compression level 1, 5, 9 and 13, which I believe should be sufficient. Testing all levels up to 17 may take a long time. (Adding level 17 alone adds 1 minute to the overall test time on my machine, which has a Core i5-4200U CPU.)

To clarify, two macros in the lz4hc.c, LZ4HC_DEFAULT_COMPRESSIONLEVEL and MAX_COMPRESSION_LEVEL, are defined as 8 and 16 respectively, but when using these to determine the maximum search attempts, these two macros are used directly, while the level supplied by the user is first subtracted by one. So I wrote 8+1 and 16+1 in LZ4Constants.java.

DariusSki commented 10 years ago

I'd really love to see this merged into main :)

jpountz commented 9 years ago

I just left one comment about the caching of instances but other than that it looks good to me!

linnaea commented 9 years ago

I did some benchmark and it turns out that creating them on demand and then cache the result is slightly faster than creating all of them when constructing.

The result on my machine (E3-1230v2 with 20GB RAM, running Java 8 x84 on Windows 7) is:

On demand cache with HashMap: 2857 ms

On demand cache with AtomicReferenceArray(linnaea/lz4-java@4e77dbe): 2733 ms

Creating all of them beforehand(linnaea/lz4-java@4cb7243): 3441 ms

jpountz commented 9 years ago

I am not sure to understand what you are measuring here? Is it the time it takes to initialize everything? If so I think it's ok as it is a one-time cost? Something I like about this last option is that if there are issues with instantiating the compressor with some compression levels (I'd rather be careful with reflection), it would fail immediately as opposed as when the compressor is requested.

linnaea commented 9 years ago

The benchmarking code calls highCompressor with various compression level 100,000 times and measures the time taken to complete.

It may be better to make sure the calls never fail anyway.

jpountz commented 9 years ago

This looks good to me now, the only thing I would like to remove is the new LZ4HCCompressor abstract class in order to keep the API minimal. What do you think?

linnaea commented 9 years ago

LZ4HCCompressor removed.

It was there so that one can get the compression level actually in use and probably display it to the user. Now that the call to highCompressor is now guaranteed to never fail, it's probably redundant.

jpountz commented 9 years ago

I'm trying to merge the change but am getting some test failures. Looking into it...

jpountz commented 9 years ago

Found it, all is good now. Thanks @linnaea !