fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

Accelerating backups #81

Closed: Erol-2022 closed this issue 11 months ago

Erol-2022 commented 1 year ago

Concerning full and incremental backups, zpaq seems to operate faster than zpaqfranz. Is it possible to accelerate the data processing of zpaqfranz?

Operating system: Oracle Linux Server 8.8
Hardware: HP DL380 Gen10 server

Here are my observations :

First full backup with zpaq compiled from the source code :

cd /data/personel/Backup3
/root/zpaq a "/data/vault/Backup3/SURNAME?????.zpaq" NAMESURNAME/ -index /data/vault/Backup3/SURNAME-index.zpaq -not "NAMESURNAME/Archive/" -not "NAMESURNAME/*.zpaq"

17407 +added, 0 -removed.

0.000000 + (50372.873543 -> 44129.901677 -> 40693.868012) = 40693.868012 MB
668.592 seconds (all OK)

First full backup with zpaqfranz compiled from the source code :

cd /data/personel/Backup3
/root/zpaqfranz a "/data/vault/Backup3/NAMESURNAME?????.zpaq" NAMESURNAME/ -index /data/vault/Backup3/NAMESURNAME-index.zpaq -not "NAMESURNAME/Archive/" -not "NAMESURNAME/*.zpaq"

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-index                /data/vault/Backup3/NAMESURNAME-index.zpaq
franz:-not                        NAMESURNAME/Archive/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
franz:-not                        NAMESURNAME/Archive/
franz:-not                        NAMESURNAME/*.zpaq
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Creating /data/vault/Backup3/NAMESURNAME00001.zpaq at offset 0 + 0
Add 2023-09-25 11:48:09    16.732     50.372.873.543 (  46.91 GB) 16T (675 dirs)
17.407 +added, 0 -removed.

0 + (50.372.873.543 -> 44.129.901.677 -> 40.694.378.209) = 40.694.378.209 @ 57.80 MB/s

831.166 seconds (000:13:51) (all OK)

zpaqfranz takes 831 seconds, while zpaq needs 668 seconds to process the same data set.

First incremental backup with zpaqfranz :

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-index                /data/vault/Backup3/NAMESURNAME-index.zpaq
franz:-not                        NAMESURNAME/Archive/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
franz:-not                        NAMESURNAME/Archive/
franz:-not                        NAMESURNAME/*.zpaq
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
/data/vault/Backup3/NAMESURNAME-index.zpaq:
1 versions, 17.407 files, 19.655.875 bytes (18.75 MB)
Creating /data/vault/Backup3/NAMESURNAME00002.zpaq at offset 0 + 40.694.378.209
Add 2023-09-25 12:11:01     4.610      7.013.734.443 (   6.53 GB) 16T (675 dirs)
6 +added, 0 -removed.

0 + (7.013.734.443 -> 1.169.329 -> 626.257) = 626.257 @ 94.10 MB/s

71.082 seconds (000:01:11) (all OK)

First incremental backup with zpaq :

zpaq v7.15 journaling archiver, compiled Jun 19 2023
/data/vault/Backup3/SURNAME-index.zpaq: 1 versions, 17407 files, 645636 fragments, 19.145678 MB
Creating /data/vault/Backup3/SURNAME00002.zpaq at offset 0 + 40693868012
Adding 4.261564 MB in 3 files -method 14 -threads 16 at 2023-09-25 12:13:21.
.
.
.
6 +added, 0 -removed.

0.000000 + (4.261564 -> 1.169329 -> 0.626137) = 0.626137 MB
0.739 seconds (all OK)

As you can see, zpaq is significantly faster than zpaqfranz : 0.7 vs 71 seconds.

Note that the purpose of the index file is to try to make the backup operations faster. Naturally, the index files are not strictly needed in these examples.

fcorbelli commented 1 year ago

The very first "thing" you can check is the -715 switch of zpaqfranz

EDIT: the index file, BTW, does NOT make the backup faster (slower, in fact, because the index data is written out twice)
Its purpose is NOT to hold the data locally, but only the index

Translation: when you perform a backup with zpaq, you keep a complete copy of all your data in a certain folder. Then (perhaps with rsync) you send this folder to a remote server, obtaining a "cloud" backup. But, in this case, you still have all the data in the local folder too.

If you instead use -index, the data (i.e. the .zpaq files) is no longer needed locally. You can delete it (if you are very brave) from the local machine, keeping only the "cloud" copy
This not only frees up space, but is also more secure, because the remote files cannot be encrypted by ransomware (obviously you will need an appropriate rsync command)
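
Something like this, just to illustrate the idea (all paths, the remote host and the rsync flags here are hypothetical, not a tested recipe):

# every run appends a new piece locally and updates the small local index
zpaqfranz a "/backup/local/part_????.zpaq" /data -index /backup/local/index.zpaq
# push the data pieces to the remote copy
rsync -av /backup/local/part_*.zpaq user@remote:/backup/cloud/
# the transferred pieces can then be deleted locally; the index MUST stay local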

Erol-2022 commented 1 year ago

Hi Mr. Corbelli,

You are extremely fast. Thanks for your help. The switch -715 did the job. Reading this one :

<>: zpaqfranz store CRC-32/XXH of each file, detecting SHA-1 collisions, while zpaq cannot by design. Can be disabled by -crc32 or -715,

fcorbelli commented 1 year ago

We need to understand exactly what causes the anomalous slowdown. Essentially zpaqfranz does more work than zpaq while running, so it is slower. On the development forum you will find a thread of mine on this very topic

fcorbelli commented 1 year ago

Let me explain better why zpaqfranz (with the default options) is slower than zpaq.

In addition to the "normal" work done by zpaq, the CRC-32 of the individual blocks is calculated and deduplicated (I am cutting it short, the issue is complex). This allows the t (test) command not only to operate like zpaq's, i.e. checking the SHA-1 codes of the individual blocks, but also to check the CRC-32 of the individual files

Additionally, zpaqfranz calculates a hash of the files that are read. By default this is 64-bit XXHASH, but you can choose another one (SHA-256, SHA-3, BLAKE3, XXH3, and so on)

zpaqfranz a pippo_sha3.zpaq *.cpp -sha3
zpaqfranz a pippo_sha2.zpaq *.cpp -sha256
(...)

These computations require a fast processor. The faster it is, the less impact they have on the overall time; but, in any case, the cost is there

You can see the codes with the -checksum switch of the l command

zpaqfranz l 1.zpaq -checksum

By deactivating the additional calculations, or even by using the zpaq standard (i.e. the -715 switch), the times align again (in reality zpaqfranz still does a little more, but not much)

fcorbelli commented 1 year ago

The ultimate goal of zpaqfranz is not to be faster than zpaq, but to be more secure, to make it easier to verify the work done, and to do a lot of things that are impossible with zpaq (e.g. mysqldump backups, zfs and so on)

However, when there is a macroscopic difference, e.g. 10x, it means that something "wrong" actually exists, and it can be investigated

Here's an example on a zpaqfranz archive

C:\zpaqfranz>zpaqfranz t z:\drop.zpaq
zpaqfranz v58.10m-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-09-23)
franz:-hw
z:/drop.zpaq:
1 versions, 23.452 files, 7.714.069.923 bytes (7.18 GB)
To be checked 9.391.819.945 in 20.220 files (32 threads)
7.15 stage time       4.03 no error detected (RAM ~514.07 MB), try CRC-32 (if any)
Checking            25.440 blocks with CRC-32 (9.378.052.726 not-0 bytes)
Block 00025K          8.72 GB
CRC-32 time           0.39s
Blocks       9.378.052.726 (      25.440)
Zeros           13.767.219 (           9) 0.000000 s
Total        9.391.819.945 speed 23.958.724.349/sec (22.31 GB/s)
GOOD            : 00020220 of 00020220 (stored=decompressed)
VERDICT         : OK                   (CRC-32 stored vs decompressed)

4.422 seconds (000:00:04) (all OK)

And on a 715 archive

C:\zpaqfranz>zpaqfranz t z:\715.zpaq
zpaqfranz v58.10m-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-09-23)
franz:-hw
z:/715.zpaq:
1 versions, 44.675 files, 7.714.524.012 bytes (7.18 GB)
To be checked 9.391.680.305 in 20.208 files (32 threads)
7.15 stage time       4.05 no error detected (RAM ~514.07 MB), try CRC-32 (if any)
Checking            25.428 blocks with CRC-32 (9.377.913.086 not-0 bytes)
Block 00025K          8.72 GB
CRC-32 time           0.41s
Blocks       9.377.913.086 (      25.428)
Zeros           13.767.219 (           9) 0.000000 s
Total        9.391.680.305 speed 23.075.381.584/sec (21.49 GB/s)
UNcheck         : 00020208 of 00020208 (zpaq 7.15?)
WARNING         : 00020208 (Cannot say anything)
VERDICT         : UNKNOWN  (Cannot say anything)

Even though the created .zpaq archives look the same, they are not. I had to hack the storage format Mahoney wrote for zpaq to maintain backwards compatibility, so it's not as perfect as I'd like, but it's good enough
Short version: zpaqfranz can compute the CRC-32 of every single file very quickly
It can even (very slowly) compute the hash of every single file (the p command)
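
As a rough example (the archive name is just a placeholder):

zpaqfranz t backup.zpaq    # fast check: SHA-1 of the blocks plus the stored CRC-32 of every file
zpaqfranz p backup.zpaq    # paranoid and much slower: recomputes the hash of every single file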

Erol-2022 commented 1 year ago

Hi Mr.Corbelli,

Thanks for the explanation. Much appreciated. Calculating the hashes of a large file set takes considerably more time. This explains why zpaqfranz takes longer to deduplicate and compress the same file group.

[root@neptun Backup3]# /root/zpaqfranz x "NAMESURNAME?????.zpaq" -test
zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-test
NAMESURNAME?????.zpaq:
3 versions, 17.414 files, 40.695.005.009 bytes (37.90 GB)
Extract 50.372.873.722 bytes (46.91 GB) in 16.732 files (675 folders) / 16 T
        98.72% 00:00:00  (  46.31 GB)=>(  46.91 GB)  624.00 MB/sec

78.173 seconds (000:01:18) (all OK)

fcorbelli commented 1 year ago

In fact, it depends on the ratio between I/O speed and CPU speed, which determines whether the process is I/O bound or CPU bound

With the b (benchmark) command you can get a very crude estimate of the impact of various hash functions
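
For example (the -n value here is arbitrary):

zpaqfranz b          # run the built-in hash benchmarks, default 5 s time limit per hasher
zpaqfranz b -n 10    # same, with a 10 s time limit per hasher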

I normally have three scenarios. The first is a desktop Windows PC, with very fast IO (NVMe) and very fast CPUs (high clock). Here the impact is practically zero (or at most in the order of a few percentage points), due to the disproportion between the processing capabilities of the CPU and the data reading speed

My second main scenario is servers, often virtualized (VPS, Proxmox etc), where the IO is modest and therefore the (not very fast) CPUs can keep up

The third scenario is very fast IO with relatively slow CPUs (physical Xeon servers). There is overhead here, but these are usually nighttime crontab jobs that last hours, and I don't care at all whether they run in 120 minutes or 150.

Then I also have other scenarios, namely machines with large amounts of memory (even 768GB), where it is convenient to use it as a sort of "ramdisk" (so to speak) to significantly speed up data testing (that is the w command)

In short, zpaqfranz is the union of the solutions to the individual problems that come up in my own work. Obviously these are MY scenarios, not all the possible ones. For example, I have never once used the SFX module, despite having been specifically asked for it, and even though I am particularly proud of it
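
If I remember the syntax correctly (the archive name is a placeholder), it is as simple as:

zpaqfranz w backup.zpaq    # RAM-hungry chunked test of the archive data, much faster when plenty of free memory is available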

fcorbelli commented 1 year ago

One last thing: sadly it is not possible to compute a "strong" hash in parallel, therefore single-thread speed is the key difference
It is actually possible to do this with CRC-32 (using multiple threads), and on the development forum, again, you'll find my related thread. But a good CRC-32 is already so fast, even on anemic CPUs, that the game is not worth the candle. I don't have systems capable of reading 10GB/s from disk (1.5GB/s sustained "real world" is already an excellent result); maybe in the future I'll reconsider

Erol-2022 commented 1 year ago

Hi Mr. Corbelli,

The HP DL380 Gen10 server is a physical system running Oracle Linux 8.8. It is powerful hardware:

[root@neptun ~]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel(R) Corporation
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping:            1
CPU MHz:             3000.000
CPU max MHz:         3000.0000
CPU min MHz:         1200.0000

The CRC-32 calculation over nearly 47 GB of files takes considerable time, but this is now understandable, as this feature exists for data safety.

fcorbelli commented 1 year ago

Mmmhh... there is something I am missing
Can you please time TWO consecutive runs of zpaqfranz vs zpaq vs zpaqfranz -715?
I suspect "hot caching" rather than the CPU. Example:

zpaqfranz a test1.zpaq (...whatever)
zpaqfranz a test2.zpaq (...whatever)
zpaq a test3.zpaq (...whatever)
zpaq a test4.zpaq (...whatever)
zpaqfranz a test5.zpaq (whatever) -715
zpaqfranz a test6.zpaq (whatever) -715

Please also post the output of zpaqfranz b

This will allow you to estimate (very roughly) the time taken for CRC-32 and XXHASH64

fcorbelli commented 1 year ago

Sidenote: if you really want to make a split backup (a multipart archive), I suggest the backup command rather than "????". There is no integrity check for zpaq's multipart archives
You can delete a "piece", or even substitute it with another one (!), and zpaq doesn't report anything
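
Roughly like this (from memory; the path is just an example, check zpaqfranz h backup for the exact syntax):

# creates numbered parts plus an index file and a text file with the hashes of the parts
zpaqfranz backup "/data/vault/Backup3/NAMESURNAME" NAMESURNAME/
# checks that every part is present and matches the stored hashes
zpaqfranz testbackup "/data/vault/Backup3/NAMESURNAME"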

Erol-2022 commented 1 year ago

Hi Mr. Corbelli,

Thanks for your support. Tomorrow, I will conduct the experiments you described and study the backup option.

Erol-2022 commented 1 year ago

Hello,

Here are the results

/root/zpaqfranz a "/data/vault/Backup3/archive1-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq

Creating /data/vault/Backup3/archive1-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 08:08:53     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.456.467) = 4.378.456.467 @ 42.28 MB/s

125.488 seconds (000:02:05) (all OK)

/root/zpaqfranz a "/data/vault/Backup3/archive2-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq

Creating /data/vault/Backup3/archive2-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 08:10:59     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
84.52% 00:00:09 (   4.30 GB)->(   3.73 GB)=>(   4.49 GB)  83.17 MB/sec
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.456.467) = 4.378.456.467 @ 79.52 MB/s

66.749 seconds (000:01:06) (all OK)

--

/root/zpaq a "/data/vault/Backup3/archive3-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"

5597 +added, 0 -removed.

0.000000 + (5562.445942 -> 5229.335519 -> 4378.304530) = 4378.304530 MB
82.775 seconds (all OK)

/root/zpaq a "/data/vault/Backup3/archive4-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"
.
.
5597 +added, 0 -removed.

0.000000 + (5562.445942 -> 5229.335519 -> 4378.304530) = 4378.304530 MB
69.953 seconds (all OK)

--

/root/zpaqfranz a "/data/vault/Backup3/archive5-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq" -715

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq
franz:-715
Activated V7.15 mode T forcezfs,donotforcexls,forcewindows; F crc32,checksum,filelist,xxhash,xxh3,fixeml,fix255,fixreserved,longpath,utf,flat

Creating /data/vault/Backup3/archive5-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 08:19:37     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.304.530) = 4.378.304.530 @ 49.03 MB/s

108.222 seconds (000:01:48) (all OK)

/root/zpaqfranz a "/data/vault/Backup3/archive6-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq" -715

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq
franz:-715
Activated V7.15 mode T forcezfs,donotforcexls,forcewindows; F crc32,checksum,filelist,xxhash,xxh3,fixeml,fix255,fixreserved,longpath,utf,flat

Creating /data/vault/Backup3/archive6-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 08:21:25     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
63.57% 00:00:33 (   3.24 GB)->(   2.98 GB)=>(   4.77 GB)  59.20 MB/sec
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.304.530) = 4.378.304.530 @ 57.85 MB/s

91.730 seconds (000:01:31) (all OK)

Erol-2022 commented 1 year ago

Benchmark tests :

[root@neptun ~]# ./zpaqfranz b
zpaqfranz v58.8k-JIT-L(2023-08-05)
uname x86_64
full exename seems <</root/zpaqfranz>>
Free RAM seems 15.429.787.648
Benchmarks: XXHASH64 XXH3 SHA-1 SHA-256 BLAKE3 CRC-32 CRC-32C WYHASH WHIRLPOOL MD5 SHA-3 NILSIMSA HIGHWAY64
Time limit 5 s (-n X)
Chunks of 390.62 KB (-minsize Y)

00000005 s XXHASH64:  speed (  2.90 GB/s)
00000005 s XXH3:      speed (  3.24 GB/s)
00000005 s SHA-1:     speed (415.42 MB/s)
00000005 s SHA-256:   speed (126.47 MB/s)
00000005 s BLAKE3:    speed (319.14 MB/s)
00000005 s CRC-32:    speed (  4.25 GB/s)
00000005 s CRC-32C:   speed (  1.14 GB/s)
00000005 s WYHASH:    speed (  4.24 GB/s)
00000005 s WHIRLPOOL: speed ( 86.62 MB/s)
00000005 s MD5:       speed (474.47 MB/s)
00000005 s SHA-3:     speed (190.28 MB/s)
00000005 s NILSIMSA:  speed (  4.25 GB/s)
00000005 s HIGHWAY64: speed (708.69 MB/s)

Results:

WHIRLPOOL:  86.62 MB/s (done 433.35 MB)
SHA-256:   126.47 MB/s (done 632.48 MB)
SHA-3:     190.28 MB/s (done 951.39 MB)
BLAKE3:    319.14 MB/s (done   1.56 GB)
SHA-1:     415.42 MB/s (done   2.03 GB)
MD5:       474.47 MB/s (done   2.32 GB)
HIGHWAY64: 708.69 MB/s (done   3.46 GB)
CRC-32C:     1.14 GB/s (done   5.69 GB)
XXHASH64:    2.90 GB/s (done  14.49 GB)
XXH3:        3.24 GB/s (done  16.18 GB)
WYHASH:      4.24 GB/s (done  21.22 GB)
NILSIMSA:    4.25 GB/s (done  21.24 GB)
CRC-32:      4.25 GB/s (done  21.26 GB)

franzomips single thread index 2.077 (quick CPU check, raw 2.077)
Atom N2800         (phy)  4 510.38 %
Xeon E3 1245 V2    (vir)  4  86.01 %
Celeron N5105      (phy)  4 114.26 %
i5-6200U           (phy)  2 109.16 %
Xeon E5 2620 V4    (phy)  8 112.41 %
Xeon E5 2630 V4    (phy) 10 133.84 %
Xeon D-1541        (vir)  8 102.28 %
i5-3570            (phy)  4  70.27 %
i7-4790K           (phy)  4  63.62 %
i7-8700K           (phy)  6  61.86 %
AMD-Ryzen 7 3700X  (phy)  8  63.80 %
i9-9900K           (phy)  8  54.22 %
i9-10900           (phy) 10  56.07 %
AMD-5950X          (phy) 16  43.29 %
i9-12900KS 56400   (phy) 16  38.45 %

65.017 seconds (000:01:05) (all OK)

fcorbelli commented 1 year ago

Benchmark tests:
XXHASH64: 2.90 GB/s (done 14.49 GB)
CRC-32: 4.25 GB/s (done 21.26 GB)
Those are, roughly, the "costs" of zpaqfranz (without -nochecksum)

fcorbelli commented 1 year ago

zpaqfranz (with checksums)

/root/zpaqfranz a "/data/vault/Backup3/archive1-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"
(...)
125.488 seconds (000:02:05) (all OK)

(...)
66.749 seconds (000:01:06) (all OK)

The filesystem cache does a lot on your system: twice as fast
Therefore this seems IO bound more than CPU bound

zpaq

/root/zpaq a "/data/vault/Backup3/archive3-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"
(....)
82.775 seconds (all OK)

/root/zpaq a "/data/vault/Backup3/archive4-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq"
.
.
5597 +added, 0 -removed.
69.953 seconds (all OK)

High filesystem impact, not much faster than zpaqfranz (second run, "hot" cache)

zpaqfranz "a-la-715"

/root/zpaqfranz a "/data/vault/Backup3/archive5-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq" -715
(...)
108.222 seconds (000:01:48) (all OK)

/root/zpaqfranz a "/data/vault/Backup3/archive6-zpaqfranz.zpaq" NAMESURNAME/ -not "NAMESURNAME/Arsiv/" -not "NAMESURNAME/*.zpaq" -715
(...)
91.730 seconds (000:01:31) (all OK)

As you can see it is not so "easy" to conclude something 😄

One last test: can you re-compress? Just to remove filesystem latency

zpaqfranz a testfile.zpaq (whatever)

zpaqfranz a test1.zpaq testfile.zpaq
zpaqfranz a test2.zpaq testfile.zpaq
zpaq a test3.zpaq testfile.zpaq
zpaq a test4.zpaq testfile.zpaq

Erol-2022 commented 1 year ago

Hello Mr. Corbelli,

I agree with your comments. The option -715 saves a lot of time, especially during incremental backups.

Repeating the first four tests to recreate the archives and repacking them :

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq

Creating /data/vault/Backup3/archive1-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 10:49:45     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.456.467) = 4.378.456.467 @ 41.96 MB/s

126.423 seconds (000:02:06) (all OK)

zpaqfranz v58.8k-JIT-L(2023-08-05)
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/Arsiv/
franz:-not                        NAMESURNAME/*.zpaq

Creating /data/vault/Backup3/archive2-zpaqfranz.zpaq at offset 0 + 0
Add 2023-09-26 10:51:51     4.993      5.562.445.942 (   5.18 GB) 16T (604 dirs)
5.597 +added, 0 -removed.

0 + (5.562.445.942 -> 5.229.335.519 -> 4.378.456.467) = 4.378.456.467 @ 88.78 MB/s

59.759 seconds (000:00:59) (all OK)

--

zpaq test -> /data/vault/Backup3/archive3-zpaq.zpaq

5597 +added, 0 -removed.

0.000000 + (5562.445942 -> 5229.335519 -> 4378.304530) = 4378.304530 MB
101.066 seconds (all OK)

zpaq test -> /data/vault/Backup3/archive4-zpaq.zpaq

5597 +added, 0 -removed.

0.000000 + (5562.445942 -> 5229.335519 -> 4378.304530) = 4378.304530 MB
49.326 seconds (all OK)

[root@neptun ~]# ./zpaqfranz a test1.zpaq /data/vault/Backup3/archive1-zpaqfranz.zpaq
zpaqfranz v58.8k-JIT-L(2023-08-05)
Creating test1.zpaq at offset 0 + 0
Add 2023-09-26 10:59:15         1      4.378.456.467 (   4.08 GB) 16T (0 dirs)
1 +added, 0 -removed.

0 + (4.378.456.467 -> 4.378.411.447 -> 4.359.222.887) = 4.359.222.887 @ 91.39 MB/s

45.727 seconds (000:00:45) (all OK)

[root@neptun ~]# ./zpaqfranz a test2.zpaq /data/vault/Backup3/archive2-zpaqfranz.zpaq
zpaqfranz v58.8k-JIT-L(2023-08-05)
Creating test2.zpaq at offset 0 + 0
Add 2023-09-26 11:00:08         1      4.378.456.467 (   4.08 GB) 16T (0 dirs)
1 +added, 0 -removed.

0 + (4.378.456.467 -> 4.378.411.447 -> 4.359.222.893) = 4.359.222.893 @ 88.67 MB/s

47.153 seconds (000:00:47) (all OK)

Recompressing the zpaq archives:

[root@neptun ~]# ./zpaq a test3.zpaq /data/vault/Backup3/archive3-zpaq.zpaq

1 +added, 0 -removed.

0.000000 + (4378.304530 -> 4378.259510 -> 4359.089409) = 4359.089409 MB
36.850 seconds (all OK)

[root@neptun ~]# ./zpaq a test4.zpaq /data/vault/Backup3/archive4-zpaq.zpaq

1 +added, 0 -removed.

0.000000 + (4378.304530 -> 4378.259510 -> 4359.089438) = 4359.089438 MB
39.703 seconds (all OK)

Erol-2022 commented 1 year ago

Hello Mr. Corbelli,

I tried the backup option, thanks. I can see the index file and the text file storing the hash values. The -715 option once again does the job of speeding up the backup process.