Closed pete4abw closed 2 years ago
Unable to reproduce this here. Is there a possibility you had a lot of RAM in use at the time?
DEBUG output: stream.c.patch.gz
```
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1.3Gi        10Gi       430Mi       3.6Gi        13Gi
Swap:           15Gi          0B        15Gi
```
Here is some DEBUG output. As you can see, when testsize > usable ram, it forces limit negative for -L9. I completely rewrote this logic in lrzip-next, basically reducing the overhead on each iteration by shrinking the LZMA dictionary size. The same logic applies to zpaq, where the block size is reduced instead.
Source: enwik9 (1GB). lrzip 0.650, patch compiled with -DDEBUG.
```
$ ./lrzip -fL9 enwik9
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16558137344 bytes ram
Compression level 9
Nice Value: 19
Show Progress
Verbose
Overwrite Files
Output Directory Specified: /tmp/lrzip/
Temporary Directory set as: ./
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Heuristically Computed Compression Window: 105 = 10500MB
Output filename is: /tmp/lrzip/enwik9.lrz
File size: 1000000000
Will take 1 pass
Chunk Limit = 1,000,000,000
Usable Ram = 5,519,379,114
testsize = 9,002,537,984, limit = 1,000,000,000, overhead = 778,059,776, testbufs = 2, (limit*testbufs)+(overhead*threads) = testsize
testsize = 9,002,537,984 (>usable_ram), limit = -741,579,435
In while loop: limit = -352,549,547, Threads = 8, (Usable ram-(overhead*threads))/testbufs = -352,549,547
In while loop: limit = 36,480,341, Threads = 7, (Usable ram-(overhead*threads))/testbufs = 36,480,341
testsize = 5,482,898,773, limit+(overhead*threads) = 5,482,898,773
bufsize = 10,485,760, limit = 36,480,341, (limit+threads-1)/threads = 5,211,478, threads = 7
```
For level 8:
```
$ ./lrzip -fL8 enwik9
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 8
Detected 16558137344 bytes ram
Compression level 8
Nice Value: 19
Show Progress
Verbose
Overwrite Files
Output Directory Specified: /tmp/lrzip/
Temporary Directory set as: ./
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Heuristically Computed Compression Window: 105 = 10500MB
Output filename is: /tmp/lrzip/enwik9.lrz
File size: 1000000000
Will take 1 pass
Chunk Limit = 1,000,000,000
Usable Ram = 5,519,379,114
testsize = 5,529,654,272, limit = 1,000,000,000, overhead = 392,183,808, testbufs = 2, (limit*testbufs)+(overhead*threads) = testsize
testsize = 5,529,654,272 (>usable_ram), limit = 994,862,421
testsize = 4,524,516,693, limit+(overhead*threads) = 4,524,516,693
bufsize = 110,540,269, limit = 994,862,421, (limit+threads-1)/threads = 110,540,269, threads = 9
```
Doesn't matter if it is not fixed.
In the `open_stream_out()` function, the logic for computing block sizes within chunks performs erroneously for `-L9`. For example, with a tar'red set of the directories /bin, /sbin, /usr/bin, and /usr/sbin totaling 1.4GB, `lrzip -L8` creates 10 blocks in stream 1 with a block size of 110MB, plus 1 block in stream 0. For level 9, however, it creates 102 blocks for stream 1 at the minimum STREAM_BUFSIZE of 10MB. The impact of this is less compression for `lrzip -L9` than for `lrzip -L8`, even though level 9 has a larger dictionary size (64MB vs 32MB). The logic, while very compact, is buggy. Here's some INFO output from `lrzip-next` for level 9: