Open GoogleCodeExporter opened 9 years ago
If I understand correctly, you're "now" situation is using what we call
"forward" deltas.
The "incremental" solution you're looking for would be to use "reverse" deltas,
in
which you always store a recent copy of the data along with a delta per day
going
backwards in time (i.e., reverse), where each daily delta produces the full
data for the
day before.
The challenge is to efficiently compute several days ago, without applying
deltas one
at a time. Is that right? That's what the "xdelta3 merge" command is for.
I'd say it's
somewhat experimental. I know there are users, but there is at least one
unconfirmed bug report.
Original comment by josh.mac...@gmail.com
on 23 Feb 2010 at 4:45
Hello Josh,
Thanks for replying.
I knew this was kind of tough to explain. I'll try to be more specific:
If i wasn't using xdelta3 at all, the following files would get produced:
day 1: report1.rpt (50MB)
day 2: report2.rpt (50MB)
day 3: report3.rpt (50MB)
day 4: report4.rpt (50MB)
So in order to reduce the amount of data that is stored on disk I use
report1.rpt as
a base file and "compress" all reports day 2+ by replacing them with a diff
file
(using xdelta) based on report1.rpt.
So:
day 1: report1.rpt (50MB)
day 2: report1.rpt.diff1 (2MB) based on report1.rpt.
day 3: report1.rpt.diff2 (2.2MB) based on report1.rpt
day 4: report1.rpt.diff3 (2.4MB) based on report1.rpt
Where:
xdelta3 -d -s report1.rpt.diff1 report1.rpt will give original report2.rpt
xdelta3 -d -s report1.rpt.diff2 report1.rpt will give original report3.rpt
xdelta3 -d -s report1.rpt.diff3 report1.rpt will give original report4.rpt
Alright. So now after day 1 (50MB) each night the differences between the
current
file and the day 1 file will be stored. This saves a lot of data but still
there
could (potentially) be even more saving by only saving the differences to the
respective day before.
day 1: report1.rpt (50MB)
day 2: report1.rpt.inc1 (2MB) based on report1.rpt.
day 3: report1.rpt.inc2 (0.2MB) based on report1.rpt.inc1 based on report1.rpt
day 4: report1.rpt.inc3 (0.4MB) based on report1.rpt.inc2 based on
report1.rpt.inc1
based on report1.rpt
So in conclusion I'm not 100% sure if I understand your forward vs. backward
approach. In my understanding its both the same, just that the "diff" is only 1
patch where "inc(remental)" is chaining multiple patches together.
How to produce the patches?
Right now (diff) I'm simply using the following "compression" syntax:
day 1: nothing
day 2: xdelta3 -e -9 -s report1.rpt report2.rpt report1.rpt.diff1; del
report2.rpt
day 3: xdelta3 -e -9 -s report1.rpt report3.rpt report1.rpt.diff2; del
report3.rpt
day 4: xdelta3 -e -9 -s report1.rpt report4.rpt report1.rpt.diff3; del
report4.rpt
For incremental I guess this could work:
day 1: nothing
day 2: xdelta3 -e -9 -s report1.rpt report2.rpt report1.rpt.inc1; del
report2.rpt
day 3: xdelta3 -d -s report1.rpt report1.rpt.inc1 | xdelta3 -e -9 report3.rpt
report1.rpt.inc2; del report3.rpt
day 4: xdelta3 -d -s report1.rpt report1.rpt.inc1 | xdelta3 -d report1.rpt.inc2
|
xdelta3 -e -9 report4.rpt report1.rpt.inc3; del report4.rpt
Would that work?
I was unable to test this so far since running xdelta3 with stdin as input
gives me
the following error on windows (3.0y): Not enough memory for this command (OS
error
8).
Original comment by ralfo...@googlemail.com
on 23 Feb 2010 at 8:41
The first scenario you explain is called "forward deltas", it has the advantage
that
every version can be restored by reading only two files.
The second scenario you explain seems more difficult than necessary. If
instead of
storing report1.rpt and attempted to store deltas as you explained, why not
store
report2.rpt on the second day along with a reverse delta (report2 -> report1)?
Then
on day three you save report3.rpt and store the reverse delta (report 3 ->
report2)?
Then you can use the "xdelta3 merge" command to merge report3->report2-
>report1 and restore report1.rpt.
Still, what you propose should work and does work on UNIX. I need to get my
act
(w.r.t streaming source and inputs) together on Windows, sorry I haven't made
it
work there yet.
Original comment by josh.mac...@gmail.com
on 28 Feb 2010 at 8:10
Hi Josh,
what you are proposing:
------
If instead of
storing report1.rpt and attempted to store deltas as you explained, why not
store
report2.rpt on the second day along with a reverse delta (report2 -> report1)?
Then
on day three you save report3.rpt and store the reverse delta (report 3 ->
report2)?
------
If I understand your approach that would mean I would delay deleting the
original
"full" file of each day by one day? (Just that I understand your take)
Original comment by ralfo...@googlemail.com
on 3 Mar 2010 at 3:47
Given the recommendation to do reverse deltas, it would be helpful to have some
way for xdelta to do the copy and diff generation simultaneously. I'm using
xdelta to back up multi-gigabyte Hyper-V virtual hard disk files, and so it
takes about 10 minutes to copy the file to the backup location, plus another 10
minutes to generate the diff. I'm guessing the total time could be cut nearly
in half if the copy and the diff could be done in one pass.
Original comment by mjhan...@gmail.com
on 8 Dec 2011 at 4:26
I am working on a Linux computer and am also interested in incremental patches.
I currently store "report1.rpt" on disk and apply incremental patches to get
back my current report. However, I'm having an issue with xdelta being unable
to calculate dynamically rebuilt reports over pipes. Example :
cat report3.rpt | xdelta3 -s <(cat report1.rpt | xdelta3 -Rd report1.rpt.inc1 |
xdelta3 -Rd report1.rpt.inc2) report1.rpt.inc3
Any idea why this isn't working?
Original comment by m...@markelee.com
on 3 Oct 2014 at 6:13
Original issue reported on code.google.com by
ralfo...@googlemail.com
on 22 Feb 2010 at 3:41