fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

Backup command index path #109

Closed sheckandar closed 2 months ago

sheckandar commented 2 months ago

First of all, I would like to thank you for this project. I've been using original zpaq for many years now and was glad to find your upgraded version with many useful features.

As I started testing zpaqfranz, I noticed that the backup command creates an index file and a hash file in the same directory as the archive file, whereas the add command allows me to specify a path to an index file. Does that limitation have a technical explanation? Or did I overlook a portion of the wiki on how to do that?

Our backup archives are stored in a B2 bucket and are locked for a period of time for protection and compliance purposes. This makes it impossible to append any data to any file in the bucket, only create new files.

So as you can see the current backup command cannot be used with such a setup.

I was wondering if you could add the ability to set a path for the index and checksum files.

fcorbelli commented 2 months ago

Sure, it is a suggestion I can implement. I'll change the .pid file too.

fcorbelli commented 2 months ago

You can try the attached pre-release, using -index to specify "where" to write the data

BEWARE: putting index files in other folder will weaken the test!

zpaqfranz backup z:\ugo\prova *.cpp
zpaqfranz backup z:\ugo\prova *.txt -index c:\temp

this seems good, but it is broken

zpaqfranz testbackup z:\ugo\prova

you need something like this

zpaqfranz testbackup z:\ugo\prova -index c:\temp -paranoid

You should do

zpaqfranz backup z:\ugo\prova *.cpp -index c:\temp
zpaqfranz backup z:\ugo\prova *.txt -index c:\temp
zpaqfranz testbackup z:\ugo\prova -index c:\temp

59_9a.zip

=>Take care to pair .zpaq files with the correct indexes

Maybe I will add more heuristic checks in the future

fcorbelli commented 2 months ago

OK, this is really interesting

The same source code, compiled with two different versions of gcc, behaves differently. Digging underway...

sheckandar commented 2 months ago

Just tested the new feature and it works as expected. Thank you.


I'm not sure about the issue with gcc compiler that you mentioned above.

I'm compiling on RedHat with gcc v8.5 and everything seems to work for me.

Let me know if you would like me to test something for you.

graphixillusion commented 2 months ago

For me this command doesn't work.

zpaqfranz backup backup\ *.mkv -index c:\Temp\
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                             c:/Temp/
*** WARNING: It's YOUR job to preserve _backup.index and _backup.txt! ***
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
part0 backup/_00000000.zpaq i_filename backup/_????????.zpaq
Multipart backup seems OK
part0 backup/_00000000.zpaq i_filename backup/_????????.zpaq

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?
0.110 seconds (00:00:00) (with warnings)

sheckandar commented 2 months ago

I think you have a syntax error. Assuming you want to back up the "backup" folder in the current directory and name the zpaq archive backup_00000001.zpaq, the following syntax would be appropriate:

zpaqfranz backup "backup" "backup\" *.mkv -index "c:\Temp\"

Double quotes are required for all paths as far as I know.

Edit:

After I took a look at the log file you posted again, I think this is what would work for you:

zpaqfranz backup "backup" *.mkv -index "c:\Temp\"

graphixillusion commented 2 months ago

Nope. Doesn't work.

zpaqfranz backup "backup" *.mkv -index "c:\Temp"
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                              c:/Temp
*** WARNING: It's YOUR job to preserve _backup.index and _backup.txt! ***
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
part0 ./backup_00000000.zpaq i_filename ./backup_????????.zpaq
Multipart backup seems OK
part0 ./backup_00000000.zpaq i_filename ./backup_????????.zpaq

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?
0.109 seconds (00:00:00) (with warnings)
zpaqfranz backup "backup" *.mkv -index "c:\Temp\"
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                             c:/Temp"
the folder (of -index) need to (already) exists c:/Temp"/
0.031 seconds (00:00:00) (with errors)
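
A possible explanation for the `c:/Temp"` in the last run (an assumption, it is not confirmed anywhere in this thread): programs that split their command line with the Microsoft C runtime rules treat a backslash directly before a double quote as an escape, so `"c:\Temp\"` can reach the program as `c:\Temp"`. Python's `subprocess.list2cmdline`, an undocumented helper in the standard library, applies the same rules in the other direction and shows what a safely quoted path with a trailing backslash looks like. The path `c:\My Temp\` is only a made-up example.

```python
import subprocess

# list2cmdline joins arguments into one command line using the
# Microsoft C runtime quoting rules.

# No spaces, no trailing backslash: the path needs no quoting at all.
plain = subprocess.list2cmdline([r"c:\Temp"])

# A path with a space must be quoted. Its trailing backslash is doubled,
# otherwise it would escape the closing quote.
quoted = subprocess.list2cmdline(["c:\\My Temp\\"])

print(plain)   # c:\Temp
print(quoted)  # "c:\My Temp\\"
```

In practice, dropping the trailing backslash (as in the `-index "c:\Temp"` run above, which was parsed as `c:/Temp`) or doubling it before the closing quote should avoid the stray quote.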

sheckandar commented 2 months ago

Do you have any mkv files in the directory from which zpaqfranz is being executed?

If you execute zpaqfranz find *.mkv, does it list any mkv files?

graphixillusion commented 2 months ago

Ah ok, now I understand the logic. I thought that the command was structured like this:

zpaqfranz backup (command) backup\ (source folder to backup) *.mkv (filter the type of files to backup in the source folder) -index c:\Temp

Now I get that backup\ is the target folder where the backup will be stored, and zpaqfranz looks for the files to back up in the current folder. Now it works as expected.

sheckandar commented 2 months ago

Almost. zpaqfranz syntax is quite flexible. You can simply specify full paths and back up any folder.

For example, let's assume I have my mkv files in C:\Videos, then the following syntax can be used:

zpaqfranz backup (the command) "C:\Backups\matroska_files" (path to directory where to save the archive and its name) "C:\Videos" (path to the folder to backup) *.mkv (filter) -index "C:\Temp" (path to directory where to save the index file)

The result will be an archive with path C:\Backups\matroska_files_00000001.zpaq, which will only contain matroska files from C:\Videos, an index file with path C:\Temp\matroska_files_00000000_backup.index, and a hash file with path C:\Temp\matroska_files_00000000_backup.txt

Hope that makes sense.

fcorbelli commented 2 months ago

In fact, it might be better to specify

zpaqfranz <the command> <the archive> <sequence of files/folders/wildcards to be added> <the switches>
zpaqfranz a z:\1.zpaq c:\myfirstfolder d:\thesecond *.cpp e:\thethird

This will add (a) to z:\1.zpaq 3 folders AND every .cpp file in the current directory

zpaqfranz a z:\1.zpaq c:\zpaqfranz -only *.exe

This will add only .exe files (from the c:\zpaqfranz folder) inside the z:\1.zpaq archive

zpaqfranz a z:\1.zpaq c:\zpaqfranz -only *.exe -only *.cpp

This will add only .exe and .cpp files (from the c:\zpaqfranz folder) inside the z:\1.zpaq archive

zpaqfranz a z:\1.zpaq c:\zpaqfranz -not *.exe -not *.zip

This will add everything EXCEPT .exe and .zip files

fcorbelli commented 2 months ago

Therefore

zpaqfranz backup z:\thebackup.zpaq *.cpp

will back up all *.cpp files (in the current folder)

zpaqfranz backup z:\thebackup.zpaq c:\nz

will take everything inside c:\nz

zpaqfranz backup z:\thebackup.zpaq d:\pluto e:\paperino -longpath

will store the d:\pluto and e:\paperino folders, with support for paths longer than 255 characters

fcorbelli commented 2 months ago

BTW the backup command, with -index, only makes sense in limited cases (if you know what you are doing). The best choice is backup and that's it. The backup command creates a multipart archive 'reinforced' with additional controls. Multipart means that each execution creates an additional file, numbered progressively.

I leave a few examples

zpaqfranz a z:\1.zpaq *.cpp
zpaqfranz a z:\1.zpaq *.bat
zpaqfranz a z:\1.zpaq *.txt

This will make ONE archive (1.zpaq) with 3 versions inside

zpaqfranz a z:\2_????.zpaq *.cpp
zpaqfranz a z:\2_????.zpaq *.bat
zpaqfranz a z:\2_????.zpaq *.txt

This will make THREE files (2_0001.zpaq, 2_0002.zpaq, 2_0003.zpaq), each with one version. BTW: using 4 ? will create _0001, _0002... You can use more (for example 8 ? => _00000001, _00000002...)
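
The numbering can be sketched in a few lines of Python. This is only an illustration of how a run of ? maps to a zero-padded part number; `part_name` is a hypothetical helper, not zpaqfranz's own code.

```python
def part_name(pattern: str, number: int) -> str:
    """Replace the run of '?' in pattern with number, zero-padded to the run's width."""
    first = pattern.find("?")
    if first == -1:
        raise ValueError("pattern has no '?' placeholder")
    last = pattern.rfind("?")
    run = pattern[first:last + 1]
    if set(run) != {"?"}:
        raise ValueError("expected a single contiguous run of '?'")
    # The number of '?' characters fixes the width of the part number.
    return pattern[:first] + str(number).zfill(len(run)) + pattern[last + 1:]


for n in (1, 2, 3):
    print(part_name(r"z:\2_????.zpaq", n))
```

With four ? this prints z:\2_0001.zpaq, z:\2_0002.zpaq and z:\2_0003.zpaq; with eight ? the same helper would give eight-digit numbers.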

zpaqfranz backup z:\3.zpaq *.cpp
zpaqfranz backup z:\3.zpaq *.bat
zpaqfranz backup z:\3.zpaq *.txt

This will make FIVE files:

3_00000000_backup.index (zpaq's index file)
3_00000000_backup.txt (zpaqfranz's index file)
3_00000001.zpaq
3_00000002.zpaq
3_00000003.zpaq

fcorbelli commented 2 months ago

With archives (single or multipart) AND backups you can use the t (test) command. Beware of the wildcard length (4 ? or 8 ? in this example)

zpaqfranz t z:\1.zpaq
zpaqfranz t z:\2_????.zpaq
zpaqfranz t z:\3_????????.zpaq

With backups you can use the command testbackup

zpaqfranz testbackup z:\3.zpaq
zpaqfranz testbackup z:\3.zpaq -verify
zpaqfranz testbackup z:\3.zpaq -verify -ssd
zpaqfranz testbackup z:\3.zpaq -paranoid
zpaqfranz testbackup z:\3.zpaq -paranoid -verify -ssd

fcorbelli commented 2 months ago

Why the backup command? Because multipart archives are more fragile than single-file archives.

Suppose you have 5 different parts, the sequence foo_0001.zpaq, foo_0002.zpaq, foo_0003.zpaq, foo_0004.zpaq, foo_0005.zpaq then you delete/change/corrupt (for example) foo_0004.zpaq

Using the command t (test) will check parts 1, 2 and 3, and stop (as 4 is missing), saying that everything is OK.

If you created the archive with backup, you can check it with testbackup: you won't restore lost data, but you will know there is a problem. Otherwise you will think your backup is perfect, but it is not, and you cannot know.
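
The difference can be sketched in Python. This is a simplified illustration, not zpaqfranz's implementation: `expected_last` stands in for whatever the backup's index records about the number of parts (the real format of the _backup.index and _backup.txt files is not described here), and both function names are hypothetical.

```python
import re


def part_numbers(filenames):
    """Extract the numeric suffix from names like foo_0004.zpaq."""
    found = set()
    for name in filenames:
        match = re.search(r"_(\d+)\.zpaq$", name)
        if match:
            found.add(int(match.group(1)))
    return found


def parts_seen_by_sequential_scan(filenames):
    """Count parts 1, 2, 3... until the first missing one, as a plain scan would."""
    found = part_numbers(filenames)
    count = 0
    while count + 1 in found:
        count += 1
    return count


def missing_parts(filenames, expected_last):
    """List the parts in 1..expected_last that are absent (needs outside knowledge)."""
    found = part_numbers(filenames)
    return [n for n in range(1, expected_last + 1) if n not in found]


# foo_0004.zpaq has been deleted.
on_disk = ["foo_0001.zpaq", "foo_0002.zpaq", "foo_0003.zpaq", "foo_0005.zpaq"]

print(parts_seen_by_sequential_scan(on_disk))    # 3: looks like a complete 3-part archive
print(missing_parts(on_disk, expected_last=5))   # [4]: the gap is visible
```

The sequential scan has no way to tell a 3-part archive from a 5-part archive with a hole in it; only a separately stored record of what should exist exposes the missing part.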

fcorbelli commented 2 months ago

Finally, the best way to familiarise yourself is... read the manual (!) or even better watch the examples

Use

zpaqfranz h h

to get a list of commands (yes, it is h (help) on h (help)).

If you want to see the help for the backup command, run

zpaqfranz h backup

If you want to see testbackup

zpaqfranz h testbackup

The examples cover all common cases and are added to in new releases. For example, if you want to see what has changed for the command a (add) you will do

zpaqfranz h a

If you see examples that you are not familiar with, it means that some functions have been added

Sometimes switches are cryptic, I realise, such as -backupxxh3 in the backup command. If in doubt... you can always ask 😄

PS -backupxxh3 means: use a faster hash algorithm, XXH3, instead of the default one (MD5). The default is MD5 because it is compatible with Unix systems such as Hetzner's storagebox. If you use Windows machines (with SSD or NVMe) -backupxxh3 is much faster.

sheckandar commented 2 months ago

The main issue is resolved. Closing.