Closed bratfizyk closed 3 years ago
So, definitely, such a thing does not exist (a stream which can read both formats).
I was working on the assumption that you (the user) can tell them apart yourself (e.g. a ProtocolVersion field in a database, or a different filename extension).
The first 4 bytes of a NEW stream are the magic number 0x184D2204; maybe this helps? (see: https://github.com/MiloszKrajewski/K4os.Compression.LZ4/blob/f9e70f19d46ce5cec2ef858475129c648f704680/src/K4os.Compression.LZ4.Streams/LZ4DecoderStream.async.cs#L51)
If the magic number is not there, then an InvalidDataException is thrown (see: https://github.com/MiloszKrajewski/K4os.Compression.LZ4/blob/f9e70f19d46ce5cec2ef858475129c648f704680/src/K4os.Compression.LZ4.Streams/LZ4DecoderStream.cs#L93)
I know this is not ideal, as you would still need to open the stream, read 4 bytes, and open the stream again, but that's all I can offer you at the moment.
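The magic-number check described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the stream is seekable (so it can be rewound instead of reopened), the magic number is stored little-endian as in the LZ4 frame format, and the names Lz4FormatProbe and StartsWithFrameMagic are hypothetical helpers, not part of the library.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper; not part of K4os.Compression.LZ4.
public static class Lz4FormatProbe
{
    // Magic number of the new stream format, as quoted in the thread.
    public const uint FrameMagic = 0x184D2204;

    // Returns true when a seekable stream starts with the magic number.
    // The stream position is restored before returning, so the same
    // stream can be handed to a decoder afterwards.
    public static bool StartsWithFrameMagic(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable", nameof(stream));

        long start = stream.Position;
        Span<byte> header = stackalloc byte[4];

        // Read may return fewer bytes than requested, so loop until
        // 4 bytes are read or the stream ends.
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header.Slice(read));
            if (n == 0) break;
            read += n;
        }

        // Rewind so the caller sees the stream untouched.
        stream.Position = start;

        return read == header.Length
            && BinaryPrimitives.ReadUInt32LittleEndian(header) == FrameMagic;
    }
}
```

If the probe returns true, the stream would presumably go to LZ4Stream.Decode, otherwise to LZ4Legacy.Decode. Note that this is a heuristic: nothing here rules out an old-format file that happens to begin with the same four bytes.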
Ok, thanks for the feedback. Closing the issue since, as you said:
"I was working on the assumption that you (the user) can tell them apart yourself"
Actually, @MiloszKrajewski there's one more thing I'd like to know.
In the old LZ4.NET library we used to have Wrap and Unwrap functions that accept byte arrays and compress/decompress them, returning another byte array.
I see these methods in this repository as well, in the Legacy module. Is there any other method that does the same thing outside of Legacy? I found the LZ4Pickler class, which has functions with signatures byte[] -> byte[]. However, when using the Pickle method, the output byte array doesn't begin with the MagicNumber, which suggests it doesn't do the same thing as LZ4Stream.
Most likely I'm missing a single piece here in order to understand everything :).
So Pickle/Unpickle has the same purpose as Wrap/Unwrap (thus the same signature) but is not compatible with it. I was not planning backwards compatibility, so Pickle/Unpickle does not use any magic number. It is forward compatible (I reserved 3 bits for the version) but not backwards compatible.
You can use the Legacy assembly to read old blobs (Unwrap) and write them in the new format (Pickle), but knowing which ones are old and which are new is on you.
For example, I had a cache with lots of blobs packed with the old LZ4. On migration I just added a new column (let's call it CompressionAlgorithm) and set it to 0 top to bottom.
Now every time a new entry is written I use Pickle, and CompressionAlgorithm is stored as 1. On read, I read CompressionAlgorithm first and use Unwrap if it is 0 or Unpickle if it is 1.
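A minimal sketch of that read-side dispatch might look like this. The enum and the BlobReader class are hypothetical names made up for illustration, and the two decoders are passed in as delegates so the sketch does not depend on any particular library overload.

```csharp
using System;

// Hypothetical values mirroring the CompressionAlgorithm column described above.
public enum CompressionAlgorithm : byte
{
    LegacyWrap = 0, // rows written before the migration
    Pickle = 1,     // rows written after the migration
}

// Hypothetical helper that picks a decoder based on the stored column value.
public sealed class BlobReader
{
    private readonly Func<byte[], byte[]> _unwrapLegacy;
    private readonly Func<byte[], byte[]> _unpickle;

    public BlobReader(
        Func<byte[], byte[]> unwrapLegacy, Func<byte[], byte[]> unpickle)
    {
        _unwrapLegacy = unwrapLegacy
            ?? throw new ArgumentNullException(nameof(unwrapLegacy));
        _unpickle = unpickle
            ?? throw new ArgumentNullException(nameof(unpickle));
    }

    // Dispatches to the old or new decoder; unknown values are rejected
    // rather than guessed.
    public byte[] Decode(CompressionAlgorithm algorithm, byte[] blob)
    {
        switch (algorithm)
        {
            case CompressionAlgorithm.LegacyWrap:
                return _unwrapLegacy(blob);
            case CompressionAlgorithm.Pickle:
                return _unpickle(blob);
            default:
                throw new ArgumentOutOfRangeException(
                    nameof(algorithm), algorithm, "Unknown compression algorithm");
        }
    }
}
```

In practice the delegates would presumably wrap the Legacy Unwrap and the new Unpickle, e.g. `new BlobReader(b => LZ4Legacy.Unwrap(b), b => LZ4Pickler.Unpickle(b))`, assuming byte[] overloads of both are available in the versions you reference.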
You could also use Pickle with the IBufferWriter overload to do your own prefixing with a magic number.
I do understand this is suboptimal and it would be much better if it were backwards compatible, but it isn't... Legacy support was not even planned at first and was added much later (see: #20)
"Most likely I'm missing a single piece here in order to understand everything :)."
Unfortunately, you are asking very legitimate questions, and I'm sorry that the answer is, most of the time: "you have to work around it yourself".
Thanks for responding once again. This all sounds reasonable and saves me a lot of time guessing. Much appreciated!
I found an easy way to implement Wrap and Unwrap using Streams, so if I need them, I know what to do, no worries.
Streams come with quite a large overhead. That is fine if we are talking about megabytes of data, but for short messages Pickle is much better.
Try the code below. It will, of course, depend on the size of your messages, but if they are below 64k it will be much, much (much) quicker than a stream:
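using System;
using System.Buffers;
using K4os.Compression.LZ4;

// Custom 4-byte prefix used to recognize blobs written by MyPickle.
public static byte[] MyPickleMagic =
    BitConverter.GetBytes(0x13371234);

public static void MyPickle(
    ReadOnlySpan<byte> source, IBufferWriter<byte> target)
{
    // Write the magic number first, then the pickled payload.
    target.Write(MyPickleMagic);
    LZ4Pickler.Pickle(source, target);
}

public static void MyUnpickle(
    ReadOnlySpan<byte> source, IBufferWriter<byte> target)
{
    // Compare only the prefix; the pickled payload follows it.
    if (!source.StartsWith(MyPickleMagic))
        throw new ArgumentException(
            "Pickle magic does not match");
    LZ4Pickler.Unpickle(
        source.Slice(MyPickleMagic.Length), target);
}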
Background
In my current system I use the deprecated lz4.net library. I'm planning to migrate to K4os.Compression.LZ4, but I already have hundreds of thousands of files compressed using the old LZ4Stream. The files are scattered across many locations and I don't want to migrate them all at once to the new LZ4 Stream format. In an ideal world I'd like newly created files in my system to use the new Stream format, i.e. K4os.Compression.LZ4.Streams.LZ4Stream.Encode.
Question
Is it possible to decode data in the following way: if the data was written with K4os.Compression.LZ4.Streams.LZ4Stream.Encode, use K4os.Compression.LZ4.Streams.LZ4Stream.Decode; otherwise, fall back to K4os.Compression.LZ4.Legacy.LZ4Legacy.Decode?