HotDenim opened this issue 9 years ago (status: Open)
When creating a delta from Ultimate Edition to Test Professional Edition, the delta file size is 160 MB
What parameters did you use?
I tried it with all of them, including the defaults. Here is one set I also tried:
-e -9 -S lzma
In that case, you have to wait for the 64-bit hash version; that version should also support -B values greater than 2GB.
Check this one https://github.com/jmacd/xdelta-devel/tree/64bithash from time to time.
Or maybe jmacd could implement an offset option. Because, in your case, Test_Professional is a subset of Ultimate, but not of the first 2GB of Ultimate.
Can you be more explicit/verbose/comprehensive in your reply? I am not very familiar with xdelta; I've only used it for one session.
> In that case, you have to wait for the 64-bit hash version; that version should also support -B values greater than 2GB

What is the effect of -B values greater than 2GB?

> Check this one https://github.com/jmacd/xdelta-devel/tree/64bithash

What is this you are directing me to, and why?

> Or maybe jmacd could implement an offset option.

What will that option do?

> For now, you can use jojodiff (download link)

How does jojodiff compare to xdelta, in all ways? I thought xdelta was the best.
> What is the effect of -B values greater than 2GB?

https://github.com/jmacd/xdelta/blob/wiki/TuningMemoryBudget.md

> What is this you are directing me to, and why?

Again, read TuningMemoryBudget. Bigger -B means smaller patches in general. Currently -B cannot be bigger than 2GB; you have to wait...

> What will that option do?

It could skip the beginning of the source file and load only the rest.

> How does jojodiff compare to xdelta, in all ways? I thought xdelta was the best.

xdelta is the best. JojoDiff is good too; it uses a heuristic algorithm, trading accuracy for speed. It doesn't always find the smallest set of differences, but it doesn't require big buffers.
One extra example of what a bigger -B switch could do, if jmacd implements it.
For now, we have to simulate it (it is not a perfect simulation). We can use 7-Zip for this. Add "Ultimate" to an archive with 7-Zip, method: store, volume size: 1GB.
You will get three files:
VS2013_RTM_ULT_ENU.7z.001 (1GB)
VS2013_RTM_ULT_ENU.7z.002 (1GB)
VS2013_RTM_ULT_ENU.7z.003 (836MB)
Then this
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.001 VS2013_RTM_TESTPRO_ENU.iso intermediate1.xd3
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.002 intermediate1.xd3 intermediate2.xd3
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.003 intermediate2.xd3 final.xd3
intermediate1.xd3 will be 67.5MB
intermediate2.xd3 will be 29.7MB
final.xd3 will be 12.3MB
Note: because this is not a perfect simulation, I think final.xd3 is much bigger than the delta a newer xdelta3 (with big -B support) will create.
Then, you can delete intermediate1.xd3 and intermediate2.xd3 files. You will only need final.xd3 and those three parts.
To decode, do this:
xdelta3 -fvds VS2013_RTM_ULT_ENU.7z.003 final.xd3 intermediate2.xd3
xdelta3 -fvds VS2013_RTM_ULT_ENU.7z.002 intermediate2.xd3 intermediate1.xd3
xdelta3 -fvds VS2013_RTM_ULT_ENU.7z.001 intermediate1.xd3 otherVS2013_RTM_TESTPRO_ENU.iso
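The chained encode above follows a fixed pattern, so the commands can be generated mechanically for any number of split parts. A minimal dry-run sketch in POSIX shell (`chain_encode` is a hypothetical helper introduced here for illustration; it only prints the commands instead of running xdelta3):

```shell
# Print the chained xdelta3 encode commands for an N-part split source.
# Dry run only: echoes each command rather than executing it.
chain_encode() {
  prefix=$1; target=$2; parts=$3
  prev=$target
  i=1
  while [ "$i" -le "$parts" ]; do
    num=$(printf '%03d' "$i")
    if [ "$i" -eq "$parts" ]; then out="final.xd3"; else out="intermediate$i.xd3"; fi
    echo "xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves $prefix.7z.$num $prev $out"
    prev=$out
    i=$((i + 1))
  done
}

# Prints the same three encode commands listed above.
chain_encode VS2013_RTM_ULT_ENU VS2013_RTM_TESTPRO_ENU.iso 3
```

To actually run the chain, pipe the output through `sh` (assuming xdelta3 is on PATH and the .7z parts exist). The decode chain is simply the same commands reversed with -d instead of -e.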
EDIT: I accidentally enabled 7-Zip compression. It is fixed now; please reread this post.
mgrinzPlayer:
A side question:
For creating the smallest Delta File, I have deduced that the following parameters are the 'best' ones:
-e -9 -S lzma -B 2147483648 -W 16777216 -I 0 -P 16777216
Am I correct? Or can you suggest other parameters/values that might give a smaller delta file?
It depends.
Personally, I'm using this
-9 -S none -B 2000000000 -I 0 -e -s oldfile newfile deltafile
then compress the delta file with a third-party tool like 7-Zip, FreeArc, or WinRAR 5.
About the previous simulation: I accidentally enabled compression in 7-Zip (the mouse wheel changed 'store' to 'fastest'). Here are the correct statistics:
You will get three files:
VS2013_RTM_ULT_ENU.7z.001 (1GB)
VS2013_RTM_ULT_ENU.7z.002 (1GB)
VS2013_RTM_ULT_ENU.7z.003 (836MB)
Then this
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.001 VS2013_RTM_TESTPRO_ENU.iso intermediate1.xd3
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.002 intermediate1.xd3 intermediate2.xd3
xdelta3 -0 -I 0 -B 1000000000 -W 16777216 -fves VS2013_RTM_ULT_ENU.7z.003 intermediate2.xd3 final.xd3
intermediate1.xd3 will be 67.5MB
intermediate2.xd3 will be 29.7MB
final.xd3 will be 12.3MB
As you see, final.xd3 is about the same as SmartVersion or JoJoDiff deltas.
Why does it 'depend'? Also, can you see anything wrong or inadequate with my options and their values?
-9 -S lzma -B 2000000000 -I 0 -e -s oldfile newfile deltafile
or
-9 -S lzma -B 2000000000 -W 16777216 -I 0 -P 16777216 -e -s oldfile newfile deltafile
Both are good for source files bigger than a few hundred megabytes. As I said earlier, I prefer to use a third-party compression tool, so I don't use secondary compression (the -S parameter).
-9 -S none -B 2000000000 -I 0 -e -s oldfile newfile deltafile
For smaller files just use
-9 -S lzma -I 0 -e -s oldfile newfile deltafile
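The two recommendations above boil down to one decision: pick the flag set by source size. A sketch of that rule in shell (`choose_flags` is a hypothetical helper, and the 500 MB cutoff is just a rough reading of "a few hundred megabytes"; adjust to taste):

```shell
# Pick xdelta3 encode flags by source size, following the advice above:
# big sources get a large -B with no secondary compression (compress the
# delta afterwards with an external tool); small sources use lzma directly.
choose_flags() {
  if [ "$1" -gt 500000000 ]; then
    echo "-9 -S none -B 2000000000 -I 0"
  else
    echo "-9 -S lzma -I 0"
  fi
}

# Usage sketch (assumes GNU stat and xdelta3 on PATH):
#   xdelta3 $(choose_flags "$(stat -c%s oldfile)") -e -s oldfile newfile deltafile
choose_flags 3024457728   # -> -9 -S none -B 2000000000 -I 0
```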
mgrinzPlayer, I appreciate your support!
I've been busy at work but find myself just now beginning a leave (parental--new child) and think I'll be able to find some time to get back to the 64bit hash changes.
But it'll be a couple of weeks at least before I find any time at all.
Hi. As noted above, the root cause of the poor performance on your test case is issue 127, the lack of support for a 64-bit source buffer. That's fixed now.
I was able to verify that xdelta3 on the 64bithash branch computes a 12MB delta for the test case here, when configured with lzma secondary compression. Thank you for the test case. I'm not quite ready to release the changes, but will do so soon.
Josh
Here, that is: https://github.com/jmacd/xdelta-devel/tree/64bithash
3.1.0 is released with this fix
Adds support for -B values greater than 2GB, enabled by -DXD3_USE_LARGESIZET=1 variable
How do I use -B values greater than 2GB? Can you provide an example?
Also, where can I find updated documentation for the command-line options?
jmacd (specifically):
Can you provide the values for the -B, -W, -I, and -P options (and any other options) that would create the smallest delta file (before compression)?
I verified that this works with
./xdelta3 -B 4294967296 -vf -e -s ~/VS2013_RTM_ULT_ENU.iso ~/VS2013_RTM_TESTPRO_ENU.iso VS_ULT_TESTPRO.xdelta
xdelta3: secondary compression: lzma
xdelta3: source /volume/home/jmacd/VS2013_RTM_ULT_ENU.iso source size 2.82 GiB [3024457728] blksize 4.00 GiB window 4.00 GiB (FIFO)
xdelta3: 0: in 8.00 MiB: out 6.52 MiB: total in 8.00 MiB: out 6.52 MiB: 30 sec
xdelta3: 1: in 8.00 MiB: out 4.61 MiB: total in 16.0 MiB: out 11.1 MiB: 2.8 sec
xdelta3: 2: in 8.00 MiB: out 29.0 B: total in 24.0 MiB: out 11.1 MiB: 11 ms
xdelta3: 3: in 8.00 MiB: out 117 KiB: total in 32.0 MiB: out 11.2 MiB: 199 ms
xdelta3: 4: in 8.00 MiB: out 29.0 B: total in 40.0 MiB: out 11.2 MiB: 77 ms
xdelta3: 5: in 8.00 MiB: out 29.0 B: total in 48.0 MiB: out 11.2 MiB: 74 ms
xdelta3: 6: in 8.00 MiB: out 29.0 B: total in 56.0 MiB: out 11.2 MiB: 74 ms
xdelta3: 7: in 8.00 MiB: out 29.0 B: total in 64.0 MiB: out 11.2 MiB: 66 ms
xdelta3: 8: in 8.00 MiB: out 29.0 B: total in 72.0 MiB: out 11.2 MiB: 68 ms
xdelta3: 9: in 8.00 MiB: out 29.0 B: total in 80.0 MiB: out 11.2 MiB: 58 ms
xdelta3: 10: in 8.00 MiB: out 29.0 B: total in 88.0 MiB: out 11.2 MiB: 87 ms
xdelta3: 11: in 8.00 MiB: out 29.0 B: total in 96.0 MiB: out 11.2 MiB: 82 ms
xdelta3: 12: in 8.00 MiB: out 38.0 B: total in 104 MiB: out 11.2 MiB: 74 ms
xdelta3: 13: in 8.00 MiB: out 45.0 B: total in 112 MiB: out 11.2 MiB: 103 ms
xdelta3: 14: in 8.00 MiB: out 29.0 B: total in 120 MiB: out 11.2 MiB: 77 ms
xdelta3: 15: in 8.00 MiB: out 40.0 B: total in 128 MiB: out 11.2 MiB: 99 ms
xdelta3: 16: in 8.00 MiB: out 535 KiB: total in 136 MiB: out 11.8 MiB: 383 ms
xdelta3: 17: in 8.00 MiB: out 29.0 B: total in 144 MiB: out 11.8 MiB: 76 ms
xdelta3: 18: in 8.00 MiB: out 40.0 B: total in 152 MiB: out 11.8 MiB: 92 ms
xdelta3: 19: in 8.00 MiB: out 29.0 B: total in 160 MiB: out 11.8 MiB: 61 ms
xdelta3: 20: in 8.00 MiB: out 29.0 B: total in 168 MiB: out 11.8 MiB: 72 ms
xdelta3: 21: in 3.40 MiB: out 29.0 B: total in 171 MiB: out 11.8 MiB: 52 ms
xdelta3: finished in 54 sec; input 179724288 output 12340595 bytes (6.87%)
Note that this includes LZMA "secondary" compression, so you don't want to use an additional compression step.
Note also that I have bug reports of this new 64-bit support not working on Windows, or possibly this case works but some other cases don't even on Linux, OSX, etc. I'm investigating.
Unfortunately, the regression test I wrote for this depends on Linux system calls, so I'll have to work through that before I can run it on my Windows box.
jmacd (specifically):
> Adds support for -B values greater than 2GB, enabled by -DXD3_USE_LARGESIZET=1 variable
So you just specify a larger -B value? No need to set the -DXD3_USE_LARGESIZET=1 variable?
When I try, I receive:
xdelta3: malloc: The access code is invalid.
xdelta3: out of memory: The access code is invalid.
Also: can you provide the values for the -B, -W, -I, and -P options (and any other options) that would create the smallest delta file (before compression), not just something that 'works'?
The -DXD3_USE_LARGESIZET variable is set by default in the 3.1.0 distribution. That's a compiler flag.
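Since the define is a compile-time setting rather than a runtime option, it only matters when building from source. A build-configuration sketch (assumes the autotools build in the xdelta3 source tree; the 3.1.0 release binaries already have this enabled, so this is only for custom builds):

```shell
# Compile-time, not runtime: the define goes into CFLAGS at configure time.
# Not needed for the stock 3.1.0 release, which sets it by default.
CFLAGS="-DXD3_USE_LARGESIZET=1" ./configure
make
```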
Are you on Windows? I have to investigate the Windows issue. Which version are you testing?
Otherwise, with version 3.1.0 use a larger value of -B and it should work. There is not a great difference in compression due to the other variables, but you need -B set to 4 gigabytes for the example we're discussing, since the source file is greater than 2 gigabytes.
Yes, Windows 7 with Service Pack 1, 64-bit.
Testing the 3.1.0 64-bit and 32-bit versions.
Setting the -B value one byte past 2GB (2147483648) causes the error:
xdelta3: malloc: The access code is invalid.
xdelta3: out of memory: The access code is invalid.
> There is not a great difference in compression due to the other variables, but you need -B set to 4 gigabytes for the example we're discussing, since the source file is greater than 2 gigabytes.
OK, but I am asking what are, theoretically, the values for the -B, -W, -I, and -P options (and any other options) that would create the smallest delta file (before compression). I mean, what are the maximum values each of these parameters can have? (I ask this assuming that the maximum gives the best chance of the smallest delta, theoretically.)
OK, so I have to diagnose several Windows issues it seems.
As for the parameters, -B is the only really important one, and we get the best compression when -B is set to the size of the source file. It will be rounded up to a power of two, which is why, when you set it one byte larger than 2GB, you get the problem we're seeing.
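The power-of-two rounding described here is easy to check. A short shell sketch (`next_pow2` is a hypothetical helper for illustration, not an xdelta3 command): both the 2.82 GiB source [3024457728 bytes] and any value just over 2GB round up to 4 GiB, which is the allocation the Windows builds were failing on.

```shell
# Round a byte count up to the next power of two, mirroring what -B does
# internally before allocating the source buffer.
next_pow2() {
  n=$1; p=1
  while [ "$p" -lt "$n" ]; do p=$((p * 2)); done
  echo "$p"
}

next_pow2 3024457728   # the Ultimate ISO size -> 4294967296 (4 GiB)
next_pow2 2147483649   # one byte past 2 GiB   -> 4294967296 (4 GiB)
```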
The -I, -P, and -W flags are not guaranteed to make better compression by arbitrarily raising their values. I recommend experimenting.
You probably shouldn't change -W, it has more to do with I/O performance than with compression.
The best compression for -I is -I 0. If you run with -vv you'll see warnings when the default setting isn't large enough.
I have some TODOs to look at the default settings for -W, -P, and -I. The defaults are roughly the same they were 10 years ago, but files and memories are larger these days.
1. So having -B even one byte larger than the size of the source file (or, if it is rounded to the next power of two, one power of two larger) has no benefit at all?

2. With the 64-bit .EXE: even if I set -B to a power of two, 2^32 (4294967296), I get the same error.
With the 32-bit .EXE: even if I set -B to a power of two, 2^32 (4294967296), I get the same error, BUT the error message does not display.
(Tested with Windows 7 SP1 64-bit and Windows 10 Enterprise 32-bit.)

3. Has the -W parameter's maximum value increased? I noted it as being 16777216 approx two years ago; now it seems to be 67108864.

4. Is the best value for -P the same as the value for -W?

5. Is -vv an undocumented option? The help output only shows -v. (If so, can you update the help output?)

6. Is xdelta's delta file a custom xdelta format, or a general non-xdelta file format? (If so, what is the format called, and what is its extension?)
On Sat, Jan 16, 2016 at 7:23 PM, HotDenim notifications@github.com wrote:
> So having -B even 1 byte larger (or, if it is rounded to the next power of two, one power of two larger) than the size of the source file has no benefit at all?
-B determines the size of a buffer that is used to read the source file, and it needs to be a power of two, so yes, there is no benefit in making it any larger than the file size.
> Even if I set -B to a power of 2, 2^32 (4294967296), I get the same error.
There is a memory allocation problem. I forget that Windows needs a special API call to allocate > 2GB. I'll work on it, but it will be at least a week.
> Has the -W parameter's maximum value increased? I noted it as being 16777216 approx two years ago; now it seems to be 67108864.
Hm, I thought it was always 8MB. There are diminishing returns for larger windows.
> Is the best value for -P the same as the value for -W?
Usually, but best to experiment.
jmacd commented on Jan 18
> There is a memory allocation problem. I forget that Windows needs a special API call to allocate > 2GB. I'll work on it, but it will be at least a week.
It has been well over a week since the problem was mentioned. What is the status?
I'm sorry I have no updates. In the intervening time, I worked on the (recently announced) license change and (just last evening, actually) installed Wine which I think will help me run my POSIX-only test harness against the Windows executable, and see if I can reproduce the problem.
...any more news on this ?... or will there be soon ?....
still/again:
...any more news on this ?... or will there be soon ?....
(Been waiting many years for this).
Grabbed two ISO images of Visual Studio again: http://download.microsoft.com/download/5/7/A/57A99666-126E-42FA-8E70-862EDBADD215/vs2015.1.com_enu.iso http://download.microsoft.com/download/F/9/7/F9775608-F90B-4586-9337-E62671AE186D/vs2015.1.com_deu.iso
Delta between them is just 140MB! That's great... except "restored" file is ALSO 140MB... which is NOT so great.
This is on Linux, BTW.
Now SmartVersion is open source at https://github.com/gvollant/smartversion , so you'll be able to compare the delta code.
Unusual Delta output file size. Using xdelta on the 'Visual Studio 2013' Retail editions .ISO images.
The .ISO files can be downloaded from here:
Ultimate Edition (2.9 GB): http://go.microsoft.com/fwlink/?LinkId=320679
Test Professional Edition (175 MB): http://download.microsoft.com/download/4/0/6/406E397F-EDBE-4437-B64F-40DF7A92A26E/VS2013_RTM_TESTPRO_ENU.iso
Premium Edition (2.9 GB): http://go.microsoft.com/fwlink/?LinkId=320676
Professional Edition (2.9 GB): http://go.microsoft.com/fwlink/?LinkId=320673
When creating a delta from Ultimate Edition to Test Professional Edition, the delta file size is 160 MB. I tried the same process with SmartVersion (www.smartversion.com) and it produces a delta file of size 12 MB (more realistic considering the differences within the .ISO file). The Test_Professional is a subset of Ultimate (Test_Professional .ISO is only 175 MB).
Can you explain? Surely xdelta should be producing something in the region of what SmartVersion correctly produces, yes?
(As expected: creating a delta from 'Ultimate' to 'Premium' or 'Professional' creates small deltas of approx 600 KB.)