jzebedee / deltaq

Fast and portable delta encoding for .NET in 100% safe, managed code.

Is this a replacement for BsDiff.net? #1

Closed by nbevans 9 years ago

nbevans commented 9 years ago

Looks like you gave up on those PRs on the BsDiff project and decided to fork it into deltaq under your own control. Fair play :)

So is this assumed to be the successor to BsDiff, with those annoying sorting bugs fixed, and other subtle improvements?

I am hooking this thing up to my Xamarin project right now, looking forward to cutting down my app's server<->client bandwidth usage very significantly.

jzebedee commented 9 years ago

Pretty much! deltaq sprang out of my problems trying to work with bsdiff and vcdiff format patches in C#, especially the lack of library support and buggy, non-performant implementations.

The bsdiff methods will work just like bsdiff.net if that's what you've been using. The deltaq implementation is a performance-focused rewrite that also fixes bugs in both the original bsdiff and bsdiff.net.

deltaq also supports streaming patching (if you know a way to do streaming patch creation in bsdiff without making a whole new algorithm, let me know!) so that you can work with memory-constrained platforms, very large files, remote patches, etc.
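
To illustrate what streaming patch application means here, below is a minimal Python sketch (not C#, and not deltaq's actual API or patch format). It assumes a hypothetical, uncompressed layout in which each record is a header of three signed 64-bit little-endian integers (`add_len`, `copy_len`, `seek`) followed by `add_len` diff bytes and `copy_len` extra bytes, which mirrors the add/copy/seek control triples of bsdiff. The names `HEADER`, `read_exact`, and `apply_patch_streaming` are invented for this example. The patch is read strictly sequentially and the output is written in bounded chunks, so neither needs to fit in memory; only the old file has to be seekable.

```python
import io
import struct

# Hypothetical record header: add_len, copy_len, seek (signed 64-bit, little-endian).
HEADER = struct.Struct("<qqq")


def read_exact(stream, n):
    """Read exactly n bytes, looping because some streams return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise ValueError("unexpected end of stream")
        buf.extend(chunk)
    return bytes(buf)


def apply_patch_streaming(old, patch, out, chunk_size=64 * 1024):
    """Apply a simplified bsdiff-style patch without buffering it whole.

    old   -- seekable binary file object holding the original data
    patch -- binary stream read front to back (file, socket wrapper, ...)
    out   -- writable binary file object receiving the new data
    """
    while True:
        first = patch.read(1)
        if not first:
            return  # clean end of patch
        header = first + read_exact(patch, HEADER.size - 1)
        add_len, copy_len, seek = HEADER.unpack(header)
        if add_len < 0 or copy_len < 0:
            raise ValueError("negative length in patch record")

        # "Add" step: new byte = old byte + diff byte (mod 256), chunk by chunk.
        remaining = add_len
        while remaining:
            n = min(chunk_size, remaining)
            diff = read_exact(patch, n)
            base = read_exact(old, n)
            out.write(bytes((a + b) & 0xFF for a, b in zip(base, diff)))
            remaining -= n

        # "Copy" step: bytes that exist only in the new file.
        remaining = copy_len
        while remaining:
            n = min(chunk_size, remaining)
            out.write(read_exact(patch, n))
            remaining -= n

        # "Seek" step: move the read position in the old file (may be negative).
        old.seek(seek, io.SEEK_CUR)
```

Because `patch` is only ever read forwards, it could just as well be a decompressing wrapper or a network response body as a local file. Creating the patch is a different matter, since the suffix-sort step needs random access to the old data.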

nbevans commented 9 years ago

Good stuff.

Out of interest, do you know why BsDiff seems to be inextricably tied to BZip2? And whether it would be possible to decouple (or at least disable) the compression?

I'm only asking because it seems inefficient if you have a large batch of diffs for individual records that you want to compress as a single unit. Some of the biggest gains in compression ratio come from placing multiple "files" in the same container, which allows cross-file dictionaries and suchlike.

jzebedee commented 9 years ago

Bz2 is what Colin Percival chose to use, so it's tied to the format. There's no reason you couldn't use your own choice of compression algorithm if you're okay with breaking compatibility -- that's exactly what I did here.

It sounds like you'd want to create deltas (with no compression) and then package them all into a single archive. I made a quick branch here that shows how to tear out compression; it's very easy.
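
As a rough illustration of the "uncompressed deltas, one compressed container" idea, here is a minimal Python sketch using the standard-library `bz2` module. It is not taken from deltaq or from the branch mentioned above. The 4-byte length-prefix framing and the names `pack_records`, `unpack_records`, and `size_individually` are assumptions made up for this example, and each `diff` is treated as an opaque byte string.

```python
import bz2
import struct


def pack_records(diffs):
    """Length-prefix each uncompressed diff, then compress the batch as one bz2 stream."""
    framed = b"".join(struct.pack("<I", len(d)) + d for d in diffs)
    return bz2.compress(framed)


def unpack_records(blob):
    """Reverse pack_records: decompress once, then split on the length prefixes."""
    framed = bz2.decompress(blob)
    diffs, pos = [], 0
    while pos < len(framed):
        (n,) = struct.unpack_from("<I", framed, pos)
        pos += 4
        diffs.append(framed[pos:pos + n])
        pos += n
    return diffs


def size_individually(diffs):
    """Total size if every diff is compressed as its own bz2 stream, for comparison."""
    return sum(len(bz2.compress(d)) for d in diffs)
```

Comparing `len(pack_records(diffs))` with `size_individually(diffs)` on a batch of small, similar record diffs should show the effect described in this thread: each standalone bz2 stream pays its own header overhead and cannot share redundancy with its neighbours, while the single stream can.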

nbevans commented 9 years ago

Sorted it, cheers. Pulled out the BZip2 to be applied only after all my record diffs have been generated. Far better compression ratio this way.

I had wanted to use LZMA, but it seems all the LZMA libraries for .NET are awful! The official one from the 7-Zip SDK doesn't even have proper Stream implementations.

jzebedee commented 9 years ago

Great!

Purely managed LZMA implementations are in a pretty sorry state, and they can't compete with the performance of the 7-Zip library itself. SevenZipSharp was the best one I found, but it's only a wrapper around the native 7-Zip DLL. :cry: