fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

Broken multi file archive #96

Closed LukaszBrzyszkiewicz closed 2 months ago

LukaszBrzyszkiewicz commented 4 months ago

I create a zpaq archive daily. Three times the compression process was killed because it ran for too long.

That was fine with me, because I had accidentally put a big file into the backed-up folder. The problem is that the archives created after that point are somehow invalid, and I currently cannot extract from archives that were - in theory - created without any errors.

The whole archive consists of files from 0001.zpaq to 0044.zpaq, one for each day. When I execute zpaqfranz i "brainapp????.zpaq" the result shows only versions 1 to 26 (versions 27, 28 and 29 were interrupted by the kill command). When I try to extract a particular file from the 0040.zpaq part I get the error "2 bad frag IDs, skipping..." and, after a few minutes, zpaq exits and nothing is extracted.

I tried to trim those three files - as a result the "info" command shows the list up to 44, but it is still not possible to extract any file.

Any idea what to do next? And maybe zpaq should be improved so it does not fail in such a situation?

fcorbelli commented 4 months ago

There is not much that can be done: the zpaq format, for backward-compatibility reasons, does not support the possibility of having "holes" (i.e., corrupted archive parts). You could even replace piece 0002.zpaq with piece 0004.zpaq from another archive (!). This is NOT true for zpaqfranz's chunked archives (aka: with a limit on chunk size).

In zpaqfranz, to mitigate (not solve, mitigate) the problem, I added the backup command, which works in the same way as part-based archiving BUT maintains an index file that allows you to verify (with the testbackup command), quickly or thoroughly, that all the "pieces" are right.

Three times the compression process was killed because it ran for too long.

If you want to kill the process you should try Control-C. This will be intercepted and (hopefully!) some housekeeping will be done.

Of course, it is not possible to prevent a "brutal" termination from resulting in data corruption. In the case of a single (i.e., non-multipart) archive, resilience is assured: at the next execution the hung transaction will be discarded, and the archive updated. It is not possible to give 100% certainty, but in general it works well. There is also the trim command (specific to zpaqfranz) to discard any portions left "hanging" in an archive.
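For example (the archive name is just a placeholder, and I am assuming here that trim simply takes the archive to be cleaned as its argument):

zpaqfranz trim thearchive.zpaq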

BEWARE: use the full path with the backup command!

TRANSLATION: z:\ugo\apezzi is good, apezzi is NOT good (it is a feature 😄 )

The default hash is MD5; I suggest using -backupxxh3 if you do not need a "manual" MD5 check (aka: Hetzner storage boxes)

zpaqfranz backup z:\ugo\apezzi c:\zpaqfranz -backupxxh3
zpaqfranz backup z:\ugo\apezzi c:\nz -backupxxh3
zpaqfranz backup z:\ugo\apezzi c:\1200 -backupxxh3

In this example you'll get

Z:\ugo>dir .
 Volume in drive Z is RamDisk
 Volume Serial Number is 8ABB-DDB8

 Directory of Z:\ugo

30/04/2024  16:18    <DIR>          .
30/04/2024  16:18    <DIR>          ..
30/04/2024  16:18         2.745.144 apezzi_00000000_backup.index
30/04/2024  16:18               398 apezzi_00000000_backup.txt
30/04/2024  16:17       743.804.106 apezzi_00000001.zpaq
30/04/2024  16:18       832.404.177 apezzi_00000002.zpaq
30/04/2024  16:18     1.429.415.084 apezzi_00000003.zpaq
               5 File  3.008.368.909 byte

Now a quick test (not very reliable):

zpaqfranz testbackup z:\ugo\apezzi

Corruption test (-ssd is for solid-state media; do NOT use it on HDDs!):

zpaqfranz testbackup z:\ugo\apezzi -verify -ssd

Double check

zpaqfranz testbackup z:\ugo\apezzi -verify -ssd -paranoid
fcorbelli commented 4 months ago

OK, now we corrupt the archive

Z:\ugo>copy z:\ugo\apezzi_00000000_backup.txt z:\ugo\apezzi_00000002.zpaq
Overwrite z:\ugo\apezzi_00000002.zpaq? (Yes/No/All): y
        1 file(s) copied.

Piece 2 is now KO (corrupted).

Z:\ugo>zpaqfranz testbackup z:\ugo\apezzi
zpaqfranz v59.4c-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-04-29)
franz:testbackup                                _ - command
franz:-hw
====================================================================================================
part0 z:/ugo/apezzi_00000000.zpaq i_filename z:/ugo/apezzi_????????.zpaq
Multipart backup looks good
Loading backupfile... z:/ugo/apezzi_00000000_backup.txt
Rows in backup 00000003 from 00000001 to 00000003
Enabling XXH3 (in reading) hasher
Initial check part <<z:/ugo/apezzi_00000002.zpaq>>
Filesize does not match real 398 vs expected 832.404.177
0.047 seconds (000:00:00) (with errors)
LukaszBrzyszkiewicz commented 4 months ago

Thank you for your answer.

Can you help me create a proper set of zpaqfranz arguments?

I'm currently using the following approach, but it looks like it is error-prone and not a good idea for regular backups (real path names are different): zpaqfranz a "/backup/name_????.zpaq" "/source/" -m5 -copy "/secondbackup/" -xxh3 -verbose -not "*.log" -find "__vacuum__" -replace "" -filelist -test

I'm also using a second approach for metadata backups, which contain many MB of poorly compressible data: zpaqfranz a "/backup/meta_????.zpaq" "/metasrc/" -m0 -index "/backup/meta_0000.zpaq" -copy "/secondbackup/" -xxh3 -verbose -not "*.log" -find "__vacuum__" -replace "" -filelist -test (then I remove the local file, but not meta_0000.zpaq)

In general the most important thing is: I need to be sure that a previously created archive isn't corrupted - maybe I should ALWAYS execute trim for the last created archive? Or should I switch to the backup command?

To minimize problems I also plan to (can you help with building the commands?):

test to extract the last snapshot once a week
once a month merge all daily snapshots (is this achievable??) and maybe start from the beginning or use this as a start point for data deduplication
add some additional data corruption repair layer (maybe parchive?)

fcorbelli commented 4 months ago

zpaqfranz a "/backup/name_????.zpaq" "/source/" -m5 -copy "/secondbackup/" -xxh3 -verbose -not "*.log" -find "vacuum" -replace "" -filelist -test

-m5 is placebo-level compression, and will try to compress even uncompressible data (up to -m4, uncompressible data is simply stored). -filelist is not useful in your case, because it is a non-ADS (non-NTFS) filesystem. -copy is usually for USB drives. -xxh3 is to quickly make a verify (the default XXHASH is more than enough).

So my suggestion is simply:

zpaqfranz a "/backup/name_????.zpaq" "/source/" -not "*.log" -find "__vacuum__" -replace ""

(more on testing in next posts)

fcorbelli commented 4 months ago

I'm also using a second approach for metadata backups, which contain many MB of poorly compressible data: zpaqfranz a "/backup/meta_????.zpaq" "/metasrc/" -m0 -index "/backup/meta_0000.zpaq" -copy "/secondbackup/" -xxh3 -verbose -not "*.log" -find "__vacuum__" -replace "" -filelist -test (then I remove the local file, but not meta_0000.zpaq)

You really do not need to use -m0, unless you REALLY have encrypted or highly compressed files (.MP4 etc). -m1 will compress (if possible) and will not compress at all (if the file does not seem... compressible).

For example, when making a backup of a TrueCrypt volume, -m0 is appropriate.
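A minimal sketch (the paths are placeholders, not from this thread): store an already-encrypted container without wasting time trying to compress it, and test the result right away:

zpaqfranz a "/backup/crypt_????.zpaq" "/data/volume.tc" -m0 -test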

fcorbelli commented 4 months ago

In general the most important thing is: I need to be sure that a previously created archive isn't corrupted - maybe I should ALWAYS execute trim for the last created archive? Or should I switch to the backup command?

It depends on whether you want to use a multivolume or a monolithic archive.

For multivolume

I suggest backup. It works just like regular multivolume archiving BUT keeps a text file with hashes. This makes it much faster to check against corruption (AND MISSING PIECES),

AND the testbackup command
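Adapted to your Linux paths, a possible sketch (the -not/-find/-replace switches are simply copied from your own command line) might be:

zpaqfranz backup "/backup/name.zpaq" "/source/" -backupxxh3 -not "*.log" -find "__vacuum__" -replace ""
zpaqfranz testbackup "/backup/name.zpaq" -verify

(add -ssd to the testbackup only if the archive sits on solid-state media)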

For monolithic

I suggest the t (test) command after every add, plus (if you can) the -paranoid switch, or the w command (if you have enough RAM).
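For example (thearchive.zpaq is just a placeholder, and I am assuming w takes only the archive name):

zpaqfranz t thearchive.zpaq -paranoid
zpaqfranz w thearchive.zpaq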

test to extract the last snapshot once a week

zpaqfranz t thearchive.zpaq

once a month merge all daily snapshots (is this achievable??) and maybe start from the beginning or use this as a start point for data deduplication?

It is doable with the m (merge) command. But it is just pointless.

This can be a good example, with an rsync-based remote-cloud backup (aka Hetzner storagebox).

Just a snippet, adjust as you like:

if [ -d "/monta/nexes_sei_aserver6/rar" ]
then
    /bin/date +"%R----------NAS: directory rar apezzi exists "
    # create/update the multipart backup, then verify it locally (report goes to /tmp/remoto.txt)
    /usr/local/bin/zpaqfranz backup /monta/nexes_sei_aserver6/apezzi/rambo.zpaq /tank -zfs -key pippo -space
    /usr/local/bin/zpaqfranz testbackup /monta/nexes_sei_aserver6/apezzi/rambo.zpaq -paranoid -ssd -key pippo -big >/tmp/remoto.txt

    # push the pieces to the remote storagebox
    /usr/local/bin/rsync -I --exclude "/*.zfs" --append --omit-dir-times --no-owner --no-perms --partial --progress -e "/usr/bin/ssh -p 23 -i /root/script/storagebox_openssh " -rlt "/monta/nexes_sei_aserver6/apezzi/" "storageuser@somewhere.your-storagebox.de:/home/rambo/apezzi/"

    # append remote free space and the remote file list to the report
    ssh -p23 -i /root/script/storagebox_openssh storageuser@somewhere.your-storagebox.de df -h >>/tmp/remoto.txt
    ssh -p23 -i /root/script/storagebox_openssh storageuser@somewhere.your-storagebox.de ls -l /home/rambo/apezzi/ >>/tmp/remoto.txt

    # name of the most recent part, as computed by zpaqfranz
    PARTNAME=`/usr/local/bin/zpaqfranz last "/monta/nexes_sei_aserver6/apezzi/rambo_????????"`
    echo $PARTNAME

    # MD5 of the last part, computed remotely and locally; last2 checks that the last two hashes in the report match
    ssh -p23 -i /root/script/storagebox_openssh storageuser@somewhere.your-storagebox.de "md5sum /home/rambo/apezzi/$PARTNAME" >>/tmp/remoto.txt
    /usr/local/bin/zpaqfranz sum /monta/nexes_sei_aserver6/apezzi/$PARTNAME -md5 -pakka -noeta -stdout >>/tmp/remoto.txt
    /usr/local/bin/zpaqfranz last2 /tmp/remoto.txt -big >>/tmp/remoto.txt

    # ...somehow SMTP the /tmp/remoto.txt file to yourself...
else
    /bin/date +"%R----------NAS: directory rar apezzi does NOT exist "
fi

The idea is: make the backup, verify it locally, rsync the pieces to the storagebox, compare the MD5 of the last part computed remotely against the one computed locally, and mail the report (/tmp/remoto.txt) to yourself.

I also want to add some additional data corruption repair layer (maybe parchive?)

There is none in zpaq (more on that later).

fcorbelli commented 4 months ago

The "right" way to do the tests depends on whether they are LOCAL or REMOTE. Local are files that you keep on a NAS, secondary hard drive etc. REMOTE are those that you transfer, for example with rsync, to a distant machine

For LOCAL archives:
1) Check that the archive is not corrupted (e.g., because the process was killed in the middle of its work). You get this with the t (test) command.
2) If you have enough free space, add the -paranoid switch (which, however, assumes write-expendable disks, e.g. a RAMDISK or a cheap SSD).
3) If you have a lot of RAM, use the w command.
4) In the case of multipart, don't use plain multipart... but the backup command (which is a multipart with an index) and testbackup.

fcorbelli commented 4 months ago

For REMOTE I put an example above. For WINDOWS machines (or rather NTFS filesystems, or NTFS-like ones) there is the -ads switch to store the CRC-32 of the archives (not of the files, just of the archives).
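A hedged sketch (the paths are placeholders; I am assuming -ads is simply appended to a normal add on an NTFS volume):

zpaqfranz a z:\backup\archive.zpaq c:\data -ads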

LukaszBrzyszkiewicz commented 4 months ago

Thank you very much for your comprehensive answer :)

I will adapt and use your suggestions.

Btw - is it possible to tweak the progress display? I'm thinking of two things: first, the progress almost always gets stuck at certain percentages; second, I'm parsing the stdout output and converting it to the cronicle-edge JSON format - maybe there is a way to add such an output format (like -pakka)?

fcorbelli commented 4 months ago

Btw - is it possible to tweak the progress display? I'm thinking of two things: first, the progress almost always gets stuck at certain percentages,

In fact, not easily: it is already carefully "tweaked".
It does not change until a different ETA is computed; this makes updates much faster during low-compression runs with a lot of files to be archived. AKA: it works well with -m1, -m2, -m3. Not very well with -m4. Not good with -m5. A tradeoff is needed: minimize the output (writing it takes a long time and slows things down a lot) while still leaving it responsive, whether for small files or giant files.

second, I'm parsing the stdout output and converting it to the cronicle-edge JSON format - maybe there is a way to add such an output format (like -pakka)?

There is the fzf command, but I am not really sure it is enough.
If you want some kind of JSON output, please give me an example.
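In the meantime, a possible workaround sketch (hypothetical, not part of zpaqfranz; it assumes cronicle-edge accepts the upstream Cronicle convention of JSON lines written to stdout, and the paths are placeholders):

# run the add, keep the raw output aside, then emit a JSON completion line based on the exit code
/usr/local/bin/zpaqfranz a "/backup/name_????.zpaq" "/source/" >/tmp/zpaq_out.txt 2>&1
RC=$?
if [ $RC -eq 0 ]; then
    echo "{ \"complete\": 1, \"code\": 0 }"
else
    echo "{ \"complete\": 1, \"code\": $RC, \"description\": \"zpaqfranz exited with code $RC\" }"
fi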