sisong / HDiffPatch

A C/C++ library and command-line tools for diff & patch between binary files or directories (folders); cross-platform; runs fast; creates small deltas/differentials; supports large files and limits memory requirements during diff & patch.

Merge multiple dir diffs to patch faster #381

Closed Zeblote closed 3 months ago

Zeblote commented 3 months ago

Hi, is there any mechanism for merging dir diffs or applying multiple of them together?

For example, imagine we have these diffs

V1 -> V2
V2 -> V3
...
V9 -> V10

And now a user has V1, but the most recent version is V10.

We can of course apply all 9 diffs in order, re-writing all of the files every time. But that's much slower than it would be to apply a single diff from V1 -> V10.

Is there a way to merge all those diffs, or apply them together somehow, and only re-write the files once?

sisong commented 3 months ago

The current HDiffPatch does not support this feature (it was previously supported in the company's internal version), because I think it's better to do it like this:

V1 -> V10
V2 -> V10
...
V9 -> V10

Provide as many specialized patches as possible to give the best experience to all users.

Zeblote commented 3 months ago

I see, that is unfortunate, as creating multiple patches takes more time (we are using this in a CI/CD environment to bring the user up to date with the latest version from whatever they have). But if it doesn't exist yet it's probably not worth the effort to add it just for such an edge case.

Might look into making patches going back 2, 4, 8, etc. versions, so there's a "tree" you can follow to update in fewer steps.
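A rough sketch of how that "tree" could be walked, assuming versions are numbered consecutively and every release publishes patches from 1, 2, 4, 8, ... versions back. The function names `sources_for_release` and `patch_chain` are hypothetical and not part of HDiffPatch; the sketch only plans which patches to build and apply, it does not call hdiffz or hpatchz.

```python
def sources_for_release(new_version: int, oldest: int = 1) -> list[int]:
    """Old versions to diff against when publishing new_version.

    Assumption: patches go back 1, 2, 4, 8, ... versions, stopping at `oldest`.
    """
    sources = []
    step = 1
    while new_version - step >= oldest:
        sources.append(new_version - step)
        step *= 2
    return sources


def patch_chain(current: int, latest: int) -> list[tuple[int, int]]:
    """Return the (old, new) patches to apply, in order, to reach `latest`.

    Greedy: always take the largest power-of-two jump that does not
    overshoot, so the number of steps equals the number of set bits
    in (latest - current).
    """
    if current > latest:
        raise ValueError("current version is newer than latest")
    chain = []
    while current < latest:
        # Largest power of two that is <= the remaining distance.
        step = 1 << ((latest - current).bit_length() - 1)
        chain.append((current, current + step))
        current += step
    return chain


if __name__ == "__main__":
    print(sources_for_release(10))  # versions to diff against for V10
    print(patch_chain(1, 10))       # patches a V1 user would apply
```

With this scheme a user on V1 would apply two patches (V1 -> V9, then V9 -> V10) instead of nine, while each release only needs roughly log2(N) diffs instead of N - 1.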

sisong commented 3 months ago

For very old versions, could you use hsynz? (It can quickly create all the diff files between every old version and the current version, or the client can do the diff & patch itself.)

Zeblote commented 3 months ago

I have not heard of that before. Sounds interesting! Will check it out.

sisong commented 3 months ago

Have you tested using "-s-?" to control hdiffz speed? (this is a minor change)

Zeblote commented 3 months ago

Still using -s-48 like you had suggested a few years ago. I'm not entirely sure what the parameter actually does behind the scenes.