lz4 / lz4-java

LZ4 compression for Java
Apache License 2.0
1.1k stars 253 forks source link

Add basic LZ4 Frame v1.5.0 support #61

Closed drcrallen closed 7 years ago

drcrallen commented 9 years ago

This PR is to add support for most of the features of Frames 1.5.0 as found at https://docs.google.com/document/d/1cl8N1bmkTdIpPLtnlzbBSFAdUeyNo5fwfHbHU7VRNWY

This is a port of some Kafka code

This is technically ASF Apache license 2.0 code.

jpountz commented 9 years ago

Thanks! I only skimmed through the patch but it looks great.

jpountz commented 9 years ago

Out of curiosity, do you know if the Kafka people are aware of your efforts to contribute this framing format support upstream?

drcrallen commented 9 years ago

@jpountz Yes and no. I let the original contributor know about the patch and where to find it. But I do not have the capacity to test rolling upgrades from the prior version to this version, so I cannot immediately suggest a backport of the code to the main branch. I'm also only tangentially familiar with the way that the prior version was used in kafka in the wild.

drcrallen commented 9 years ago

There are still a few potential issues with this PR: like the lack of capacity to specify the compressor and decompressor. But I wasn't sure what your preferred way of handling it would be.

ederrf commented 8 years ago

Hi @jpountz, I started using today the code provided by @drcrallen and, for my needs, it is working as expected. I'll be running more tests along the following days. Are you planning on merging this PR anytime soon ?

jeroenvuurens commented 8 years ago

Thanks for providing an extension. I've also started using the code provided here, it works, and the output is compatible with the unix command line lz4 util.

hochgi commented 8 years ago

:+1: for merging this PR

drcrallen commented 8 years ago

@jpountz anything in particular you would like to see here before merging?

jjhart commented 8 years ago

@drcrallen - I just made a couple comments on the pull request that might explain jpountz's hesitancy to merge this PR (not to put words in his mouth, I have no idea. I just know they were problematic when I tried to test this PR on my machine).

drcrallen commented 8 years ago

for what its worth I fixed the mentioned items. Still not sure the best way to handle the change in https://github.com/Cyan4973/lz4/commit/f02adc79389732177dca6fa21a3e716249aa63dd since there's not a multiple-version-friendly flag

FergusNelson commented 8 years ago

I've used the LZ4FrameInputStream code to process some large (~150GB) files and can confirm that it worked very well.

ScalaWilliam commented 8 years ago

Would be very useful to have this. I was lost as to why I couldn't decompress a file I compressed with the reference implementation.

odaira commented 7 years ago

Thanks for the contribution!

I found a few problems, so I'll fix them in the next commit. I'll also revise the coding style and will add a license note.

odaira commented 7 years ago

@drcrallen Is there any reason LZ4FrameOutputStream.FrameInfo is a public class? As far as I read the code, it is used only internally in LZ4FrameOutputStream and LZ4FrameInputStream.

ScalaWilliam commented 7 years ago

Excellent!