fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

Rollback - remove last version #122

Open kskarlatos opened 3 weeks ago

kskarlatos commented 3 weeks ago

Hello, I recently learned about this amazing program and have been testing it successfully for storing my mysqldumps. I would like to ask whether it is possible to roll back an archive, just like zpaq seems to do. I accidentally added some files in the last version and would like to remove them and continue as if that had not happened.

fcorbelli commented 3 weeks ago

AFAIK zpaq does NOT allow removing versions. With zpaqfranz you can use the crop command:

C:\zpaqfranz>zpaqfranz h crop
zpaqfranz v60.6k-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2024-08-11)
CMD   crop        Discard latest version(s)
<>:               Delete queued versions from the archive
                  up to the specified position or -until
                  By default DRY RUN (only test)
<>: -kill         Do a 'wet' (effective) run
<>: -to tiny.zpaq Reduce to tiny.zpaq (safer)
<>: -until X      Discard every versions >X
+ : -maxsize  X   Manually cut at X (RISKY)
+ : -force        Crop in-place (no backup: VERY RISKY!)
    Examples:
Reduce file (dry run, just infos):   crop z:\1.zpaq
Reduce up to version 100:            crop z:\1.zpaq -to d:\2.zpaq -until 100 -kill
Reduce to first 100.000:             crop z:\1.zpaq -to d:\2.zpaq -maxsize 100k -kill
Crop in place (NO BACKUP! RISKY!):   crop z:\1.zpaq -until 2 -kill -force

BTW, zpaqfranz can take a piped mysqldump directly with -stdin (zpaq cannot). You can even pre-order the blocks for restoring straight into mysql (-stdout), but this makes for much bigger archives.

fcorbelli commented 3 weeks ago

This is an example

C:\zpaqfranz>zpaqfranz crop versioni.zpaq
zpaqfranz v60.6k-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2024-08-11)
franz:-hw

versioni.zpaq:
4 versions, 20 files, 23.669 bytes (23.11 KB)
---------------------------------------------------------------------------
<  Ver  > <  date  > < time > <    version size    > < Offset (w/out encr)>
V00000001 2022-10-03 15:05:37 [                 948] @                  948
V00000002 2022-10-03 15:06:00 [               1.192] @                2.140
V00000003 2022-10-03 15:06:19 [                 897] @                3.037
V00000004 2022-10-09 15:12:25 [              20.632] @               23.669
no -kill, this is just a dry run

We want to discard version 4. The safest way is:

C:\zpaqfranz>zpaqfranz crop versioni.zpaq -to z:\cropped.zpaq -until 3 -kill
zpaqfranz v60.6k-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2024-08-11)
franz:-to                   <<z:/cropped.zpaq>>
franz:-hw -kill

versioni.zpaq -until 3:
3 versions, 8 files, 3.037 bytes (2.97 KB)
---------------------------------------------------------------------------
<  Ver  > <  date  > < time > <    version size    > < Offset (w/out encr)>
V00000001 2022-10-03 15:05:37 [                 948] @                  948
V00000002 2022-10-03 15:06:00 [               1.192] @                2.140
V00000003 2022-10-03 15:06:19 [                 897] @                3.037
Writing data...
DONE on z:/cropped.zpaq (3.037)
0.079 seconds (00:00:00) (all OK)
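The offsets in the dry-run table explain why the cropped archive is exactly 3.037 bytes: each version ends at a known offset, and cropping with -until 3 keeps everything up to the end of version 3. Below is a minimal Python sketch of that arithmetic. It only parses the table text shown above; the function names are made up for illustration and are not part of zpaqfranz.

```python
import re

# Dry-run table as printed by "zpaqfranz crop versioni.zpaq" above.
TABLE = """\
V00000001 2022-10-03 15:05:37 [                 948] @                  948
V00000002 2022-10-03 15:06:00 [               1.192] @                2.140
V00000003 2022-10-03 15:06:19 [                 897] @                3.037
V00000004 2022-10-09 15:12:25 [              20.632] @               23.669
"""

# One row: version number, date, time, [version size] @ end offset.
ROW = re.compile(r"^V(\d+) \S+ \S+ \[\s*([\d.]+)\] @\s*([\d.]+)$")


def to_int(dotted):
    """Convert '23.669' (dot used as thousands separator) to 23669."""
    return int(dotted.replace(".", ""))


def parse_versions(text):
    """Return {version: (size, end_offset)} for every table row."""
    versions = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            number, size, offset = match.groups()
            versions[int(number)] = (to_int(size), to_int(offset))
    return versions


def cropped_size(versions, until):
    """Archive size in bytes after keeping versions 1..until."""
    return versions[until][1]


versions = parse_versions(TABLE)
print(cropped_size(versions, 3))  # 3037
```

The printed value matches the "DONE on z:/cropped.zpaq (3.037)" line, and the version sizes add up to the last offset (23.669).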

And then a good test, just to be sure

C:\zpaqfranz>zpaqfranz t z:\cropped.zpaq -all
zpaqfranz v60.6k-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2024-08-11)
franz:-all                                      4
franz:-hw

z:/cropped.zpaq:
3 versions, 8 files, 3.037 bytes (2.97 KB)
To be checked 27 in 4 files (4 threads)
7.15 stage time       0.02 no error detected (RAM ~4.26 MB), try CRC-32 (if any)
Checking                 4 blocks with CRC-32 (27 not-0 bytes)

CRC-32 time           0.00s
Blocks                  27 (           4)
Zeros                    0 (           0) 0.000000 s
Total                   27 speed 1.687/s (1.65 KB/s)
GOOD            : 00000004 of 00000004 (stored=decompressed)
VERDICT         : OK                   (CRC-32 stored vs decompressed)
0.031 seconds (00:00:00) (all OK)

kskarlatos commented 3 weeks ago

Thanks for your answer and the great examples! I am now cropping my backup and testing it. About mysqldumps: I would like to save all dumps in a folder named by date, with one file for each database. Can this be done with pipes?

fcorbelli commented 3 weeks ago

Finally, for mysqldump, you can try something like this:

mysqldump -uroot -pthepassword thedatabase |c:\zpaqfranz\zpaqfranz a z:\thedump.zpaq thebackup.sql -stdin

Inside the archive "z:\thedump.zpaq" the dumps will be stored as the file thebackup.sql. You can then extract a given version using -until.

fcorbelli commented 3 weeks ago

Therefore you can use a single .zpaq (to store multiple dumps) with pipe

mysqldump -uroot -pthepassword firstdb |c:\zpaqfranz\zpaqfranz a z:\thedump.zpaq first.sql -stdin
mysqldump -uroot -pthepassword seconddb |c:\zpaqfranz\zpaqfranz a z:\thedump.zpaq second.sql -stdin
mysqldump -uroot -pthepassword thirddb |c:\zpaqfranz\zpaqfranz a z:\thedump.zpaq third.sql -stdin
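For anyone who wants to script the three commands above, here is a minimal Python sketch of the same pipeline. It assumes mysqldump and zpaqfranz are both on the PATH, reuses the placeholder credentials from the example, and names each file inside the archive after its database (an illustrative choice; the commands above use first.sql, second.sql, third.sql). The helper names are hypothetical. Running the block only prints the planned commands, so nothing is executed by accident.

```python
import subprocess


def dump_commands(database, archive, password="thepassword"):
    """Build the two argument lists: mysqldump (producer), zpaqfranz (consumer)."""
    producer = ["mysqldump", "-uroot", f"-p{password}", database]
    consumer = ["zpaqfranz", "a", archive, f"{database}.sql", "-stdin"]
    return producer, consumer


def backup(database, archive):
    """Pipe one dump straight into the archive, without a temporary file."""
    producer, consumer = dump_commands(database, archive)
    dump = subprocess.Popen(producer, stdout=subprocess.PIPE)
    result = subprocess.run(consumer, stdin=dump.stdout)
    dump.stdout.close()  # our copy of the pipe is no longer needed
    if dump.wait() != 0 or result.returncode != 0:
        raise RuntimeError(f"backup of {database} failed")


# Dry run: show what would be executed for each database.
for name in ["firstdb", "seconddb", "thirddb"]:
    producer, consumer = dump_commands(name, "z:\\thedump.zpaq")
    print(" ".join(producer), "|", " ".join(consumer))
```

To actually run it, call backup("firstdb", "z:\\thedump.zpaq") for each database; every call adds one new version to the same archive.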
fcorbelli commented 3 weeks ago

Or use a folder tree. Suppose you make a folder named ugo, with 3 folders inside:

Z:\>tree ugo
Folder PATH listing for volume RamDisk
Volume serial number is EEEB-F5AE
Z:\UGO
├───folderdb1
├───folderdb2
└───folderdb3

Then you write something like that

mysqldump -uroot -pthepassword firstdb >z:\ugo\folderdb1\first.sql
mysqldump -uroot -pthepassword seconddb >z:\ugo\folderdb2\second.sql
mysqldump -uroot -pthepassword thirddb >z:\ugo\folderdb3\third.sql

Now you can make 3 different zpaqs with

c:\zpaqfranz\zpaqfranz a z:\thebackups z:\ugo -home
getting THREE different zpaqs

00001: 0      OK 00:00:00                 1.366 z:/thebackups_folderdb1.zpaq
00002: 0      OK 00:00:00                 1.366 z:/thebackups_folderdb2.zpaq
00003: 0      OK 00:00:00                 1.371 z:/thebackups_folderdb3.zpaq

Keeping different zpaqs (-home option) typically makes sense when working with data from different clients, where when the “foo” client is no longer a client, you just delete the foo.zpaq file and that's it.

fcorbelli commented 3 weeks ago

On Windows I usually take separate dumps of each db, and an --all-databases dump too, just in case of a bare-metal restore. Of course this really depends on how many dbs you have.

Restoring 3 or 5 dbs one by one is one thing; 200 different dbs is quite another job. In that case I usually restore the --all-databases dump (in an empty virtual machine) and then pick whatever I really need. I am really lazy, and sometimes I do not like doing a lot of awk-kung-fu to extract lines from huge mysql dumps.

Dealing with a 1 GB dump is easy; with a 1 TB one, not so much.

Short version: depends on your requirements

kskarlatos commented 3 weeks ago

Thanks again for your detailed answers! I will see what works better with my workflow. I guess it isn't possible to make one version with one folder containing many dumps using pipes?

fcorbelli commented 3 weeks ago

I forgot to specify that naming dump files differently (i.e., with the date) is useless if you store them inside a zpaq file. The version already has the date stored in it, so it is easy to go back to it, for example with the i (info) command to quickly identify the version number, or using PAKKA (which was originally born just for mysql backups).

And the answer is yes you can (https://github.com/fcorbelli/zpaqfranz/issues/122#issuecomment-2298613241)

kskarlatos commented 3 weeks ago

Oh great, nice! Thanks again

fcorbelli commented 3 weeks ago

Please, if you want, do not forget to "star" and/or leave a review on https://sourceforge.net/projects/zpaqfranz/

kskarlatos commented 3 weeks ago

Hi, 2 questions about -stdin:

mysqldump dbname | zpaqfranz a thedump.zpaq folder/dbname.sql -stdin

1) The db is saved as dbname.sql inside the archive, instead of folder/dbname.sql (I get folder/dbname.sql : No such file or directory)

2) Dedupe does not seem to work

Please tell me if I should open a new issue.

fcorbelli commented 3 weeks ago

1) You cannot use a folder name, just a file name. I'll add a warning in a new release. Or maybe not. The folder is discarded. You can see this with -verbose: REBUILDING STDIN filename to dump.sql from folder\dump.sql

C:\zpaqfranz>mysqldump -uroot -p1 zarc |c:\zpaqfranz\zpaqfranz a z:\2 folder\dump.sql -stdin -verbose
zpaqfranz v60.6q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-08-20)
DETECTED SHA1/2 HW INSTRUCTIONS
franz:-hw -stdin -verbose
Integrity check type: XXHASH64B+CRC-32
REBUILDING STDIN filename to dump.sql
Creating z:/2.zpaq at offset 0 + 0
Stdin Add 2024-08-21 19:51:22         1                  4 (   4.00  B) 32T (0 dirs): -m14
MAX_FRAGMENT 520.192 (507.94 KB)
1 +added, 0 -removed.
                    0 starting size
          860.875.094 data to be added
          860.875.090 after deduplication
          155.885.146 after compression
          155.885.146 total size
Total speed 202.86 MB/s
IO buffer 4.096
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
exp               4 get     860.875.090 dump.sql
expected total_size           860.875.094
hashed   total_size           860.875.090
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
no file errors tracked
Files  added +1
4.078 seconds (00:00:04) (all OK)

2) The deduplicator is active

C:\zpaqfranz>mysqldump -uroot -p1 zarc |c:\zpaqfranz\zpaqfranz a z:\archived dump.sql -stdin
zpaqfranz v60.6q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-08-20)
franz:-hw -stdin
Creating z:/archived.zpaq at offset 0 + 0
Stdin Add 2024-08-21 19:47:07         1                  4 (   4.00  B) 32T (0 dirs)
1 +added, 0 -removed.

0 + (860.875.603 -> 860.875.599 ->  155.885.098) = 155.885.098  @ 138.63 MB/s
Files  added +1
5.938 seconds (00:00:05) (all OK)

C:\zpaqfranz>mysqldump -uroot -p1 zarc |c:\zpaqfranz\zpaqfranz a z:\archived dump.sql -stdin
zpaqfranz v60.6q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-08-20)
franz:-hw -stdin

z:/archived.zpaq:
1 versions, 1 files, 155.885.098 bytes (148.66 MB)
Updating z:/archived.zpaq at offset 155.885.098 + 0
Stdin Add 2024-08-21 19:47:16         1                  4 (   4.00  B) 32T (0 dirs)
1 +added, 0 -removed.

155.885.098 + (860.875.603 -> 289.615 ->  155.959.855) = 155.959.855  @ 232.51 MB/s
Files  updated #1
3.531 seconds (00:00:03) (all OK)
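The one-line summary packs the whole story into five numbers. Going by the labels in the -verbose output earlier in this comment, they read as: starting size + (data to be added -> after deduplication -> after compression) = total size. Here is a minimal Python sketch that pulls them apart for the two runs above. The regular expression and field names are illustrative assumptions based on these two lines only, not a documented output format.

```python
import re

FIRST_RUN = "0 + (860.875.603 -> 860.875.599 ->  155.885.098) = 155.885.098  @ 138.63 MB/s"
SECOND_RUN = "155.885.098 + (860.875.603 -> 289.615 ->  155.959.855) = 155.959.855  @ 232.51 MB/s"

SUMMARY = re.compile(
    r"([\d.]+) \+ \(([\d.]+) -> ([\d.]+) ->\s+([\d.]+)\) = ([\d.]+)"
)


def to_int(dotted):
    """Convert '155.885.098' (dot used as thousands separator) to an int."""
    return int(dotted.replace(".", ""))


def parse_summary(line):
    """Split a summary line into its numbers and compute the archive growth."""
    start, added, deduped, _compressed, total = map(
        to_int, SUMMARY.search(line).groups()
    )
    return {
        "start": start,          # archive size before this run
        "input": added,          # bytes read from stdin
        "after_dedup": deduped,  # bytes left after deduplication
        "total": total,          # archive size after this run
        "growth": total - start, # bytes this version really cost
    }


for line in (FIRST_RUN, SECOND_RUN):
    info = parse_summary(line)
    print(info["input"], "->", info["after_dedup"], "growth:", info["growth"])
# 860875603 -> 860875599 growth: 155885098
# 860875603 -> 289615 growth: 74757
```

So the second, identical dump leaves only 289.615 bytes after deduplication and grows the archive by 74.757 bytes.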

Then do something on the db

C:\zpaqfranz>mysql -uroot -p1 zarc
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 7
Server version: 11.1.2-MariaDB mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [zarc]> delete from zarcper limit 1;
Query OK, 1 row affected (0.002 sec)

C:\zpaqfranz>mysqldump -uroot -p1 zarc |c:\zpaqfranz\zpaqfranz a z:\archived dump.sql -stdin
zpaqfranz v60.6q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-08-20)
franz:-hw -stdin

z:/archived.zpaq:
2 versions, 2 files, 155.959.855 bytes (148.73 MB)
Updating z:/archived.zpaq at offset 155.959.855 + 0
Stdin Add 2024-08-21 19:49:17         1                  4 (   4.00  B) 32T (0 dirs)
1 +added, 0 -removed.

155.959.855 + (860.875.094 -> 5.491.026 ->  156.999.099) = 156.999.099  @ 229.46 MB/s
Files  updated #1
3.593 seconds (00:00:03) (all OK)

kskarlatos commented 3 weeks ago

Thanks for your answer. I think it would be useful to have a folder name included, as then I could group my files using that.

About dedup, my zpaq file has about 342 versions of my mysqldumps (daily over about one year).

V00000338 2024-08-18 07:03:00 +00000039 -00000001 -> 13.739.782
V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-21 20:29:11 +00000039 -00000001 -> 10.977.342

When I add another day normally (adding a folder of mysqldumps), it uses 10-20 MB. But when I add, for example, the large database using -stdin, I get:

V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-21 20:29:11 +00000039 -00000001 -> 10.977.342
V00000343 2024-08-21 20:29:12 +00000001 -00000000 -> 5.341.016.752

fcorbelli commented 3 weeks ago

Windows or Linux for a pre-release test build?

kskarlatos commented 3 weeks ago

Linux

fcorbelli commented 3 weeks ago

60_6s.zip

The attached pre-release should provide (not tested at all):

  1. No restriction on file name (aka: folders) in -stdin
  2. Better deduplication of -stdin (just as good as zpaq's, only faster)

I have run no tests; I have more urgent jobs to complete.

Please let me know

kskarlatos commented 3 weeks ago

Thanks Franco! Just compiled and I am testing right now!

kskarlatos commented 3 weeks ago

I think it works!!

Old version:

$ mariadb-dump --quick --single-transaction --triggers --routines --events homeassistant > homeassistant.sql

$ ls
-rw-r--r-- 1 root root 22G Aug 22 16:34 homeassistant.sql
-rw-r--r-- 1 root root 11G Aug 22 07:20 mysqlbackups.zpaq

root at fileserver on linux-6.10.3-arch1-2 in /storage/SSD/mysql_backups_gunzipped
$ zpaqfranz a mysqlbackups.zpaq homeassistant.sql
zpaqfranz v60.5e-JIT-L(2024-07-20)

mysqlbackups.zpaq:
342 versions, 13.075 files, 10.851.416.557 bytes (10.11 GB)
Updating mysqlbackups.zpaq at offset 10.851.416.557 + 0
Add 2024-08-22 13:35:14 1 22.579.990.740 ( 21.03 GB) 8T (0 dirs)
1 +added, 0 -removed.

10.851.416.557 + (22.579.990.740 -> 29.542.763 -> 10.859.405.417) = 10.859.405.417 @ 93.26 MB/s
Files added +1
230.889 seconds (00:03:50) (all OK)

root at fileserver on linux-6.10.3-arch1-2 in /storage/SSD/mysql_backups_gunzipped
$ zpaqfranz i mysqlbackups.zpaq| tail
V00000335 2024-08-15 07:03:00 +00000039 -00000001 -> 15.201.874
V00000336 2024-08-16 07:03:00 +00000039 -00000001 -> 15.467.466
V00000337 2024-08-17 07:03:00 +00000039 -00000001 -> 15.919.177
V00000338 2024-08-18 07:03:00 +00000039 -00000001 -> 13.739.782
V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-22 07:03:00 +00000039 -00000001 -> 17.267.099
V00000343 2024-08-22 13:35:14 +00000001 -00000000 -> 7.988.860
6.454 seconds (00:00:06) (all OK)

$ zpaqfranz crop mysqlbackups.zpaq -force -until 342 -kill -space
....
Captcha OK in-place crop done
165.814 seconds (00:02:45) (all OK)

$ cat homeassistant.sql| zpaqfranz a mysqlbackups.zpaq homeassistant.sql -stdin
zpaqfranz v60.5e-JIT-L(2024-07-20)
franz:-stdin

mysqlbackups.zpaq:
342 versions, 13.075 files, 10.851.416.557 bytes (10.11 GB)
Updating mysqlbackups.zpaq at offset 10.851.416.557 + 0
Stdin
1 +added, 0 -removed.

10.851.416.557 + (22.579.990.744 -> 22.579.990.740 -> 16.202.857.790) = 16.202.857.790 @ 77.74 MB/s
Files added +1
277.018 seconds (00:04:37) (all OK)

$ zpaqfranz i mysqlbackups.zpaq|tail
V00000335 2024-08-15 07:03:00 +00000039 -00000001 -> 15.201.874
V00000336 2024-08-16 07:03:00 +00000039 -00000001 -> 15.467.466
V00000337 2024-08-17 07:03:00 +00000039 -00000001 -> 15.919.177
V00000338 2024-08-18 07:03:00 +00000039 -00000001 -> 13.739.782
V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-22 07:03:00 +00000039 -00000001 -> 17.267.099
V00000343 2024-08-22 13:56:43 +00000001 -00000000 -> 5.351.441.233
6.467 seconds (00:00:06) (all OK)

(trimmed again)

Test version:

$ cat homeassistant.sql| /root/zpaqfranz/zpaqfranz a mysqlbackups.zpaq 2024-08-22T16_21_43/homeassistant.sql -stdin
zpaqfranz v60.6s-JIT-L(2024-08-22)
franz:-stdin

mysqlbackups.zpaq:
342 versions, 13.075 files, 10.851.416.557 bytes (10.11 GB)
2024-08-22T16_21_43/homeassistant.sql: No such file or directory
Updating mysqlbackups.zpaq at offset 10.851.416.557 + 0
Stdin

10.851.416.557 + (22.579.990.744 -> 29.542.763 -> 10.859.405.436) = 10.859.405.436 @ 112.44 MB/s
Files added +1
191.512 seconds (00:03:11) (all OK)

root at fileserver on linux-6.10.3-arch1-2 in /storage/SSD/mysql_backups_gunzipped
$ zpaqfranz i mysqlbackups.zpaq | tail
V00000335 2024-08-15 07:03:00 +00000039 -00000001 -> 15.201.874
V00000336 2024-08-16 07:03:00 +00000039 -00000001 -> 15.467.466
V00000337 2024-08-17 07:03:00 +00000039 -00000001 -> 15.919.177
V00000338 2024-08-18 07:03:00 +00000039 -00000001 -> 13.739.782
V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-22 07:03:00 +00000039 -00000001 -> 17.267.099
V00000343 2024-08-22 14:04:01 +00000001 -00000000 -> 7.988.879
6.521 seconds (00:00:06) (all OK)

root at fileserver on linux-6.10.3-arch1-2 in /storage/SSD/mysql_backups_gunzipped
$ zpaqfranz l mysqlbackups.zpaq | tail
2024-08-21 04:22:27 6.151.117 12% + 2024-08-21T07_03_00/information_schema.sql
2024-08-21 04:22:28 3.623.817 14% + 2024-08-21T07_03_00/mysql.sql
2024-08-21 04:22:29 1.200.801 14% + 2024-08-21T07_03_00/mythconverg.sql
2024-08-21 04:22:29 207.394 10% + 2024-08-21T07_03_00/performance_schema.sql
2024-08-21 04:22:30 558.597 12% + 2024-08-21T07_03_00/sys.sql
2024-08-22 17:04:07 22.579.990.740 24% + 2024-08-22T16_21_43/homeassistant.sql

4.691.578.953.082 (4.27 TB) of 4.691.578.957.372 (4.27 TB) in 12.736 files shown
10.859.405.436 compressed Ratio 0.002 <>
7.089 seconds (00:00:07) (all OK)

kskarlatos commented 3 weeks ago

Thanks again Franco! One small fix: the message
2024-08-22T16_21_43/homeassistant.sql: No such file or directory
should not be printed for files coming from stdin.

fcorbelli commented 3 weeks ago

60-6t.zip

Be careful to check the extracted file.

kskarlatos commented 3 weeks ago

Thanks, I think it works perfectly.

full logs:

[130] $ cat 2024-08-22T16_21_43/homeassistant.sql| /root/zpaqfranz/zpaqfranz a mysqlbackups.zpaq 2024-08-22T16_21_43/homeassistant.sql -comment 2024-08-22T16_21_43 -timestamp 2024-08-22T16_21_43 -stdin
zpaqfranz v60.6t-JIT-L(2024-08-22)
franz:-comment <<2024-08-22T16_21_43>>
franz: -timestamp change from 2024-08-22 16:21:43 => 2024-08-22 16:21:43
franz:-timestamp <<2024-08-22 16:21:43>>
Unknown option ignored: 2024-08-22T16_21_43
franz:-comment -stdin

mysqlbackups.zpaq:
342 versions, 13.075 files, 10.851.416.557 bytes (10.11 GB)
Updating mysqlbackups.zpaq at offset 10.851.416.557 + 0
Stdin <<2024-08-22T16_21_43>>

10.851.416.557 + (22.579.990.744 -> 29.542.763 -> 10.859.405.898) = 10.859.405.898 @ 112.87 MB/s
Files added +1
190.793 seconds (00:03:10) (all OK)

$ zpaqfranz l mysqlbackups.zpaq|tail
2024-08-21 04:22:27 6.151.117 12% + 2024-08-21T07_03_00/information_schema.sql
2024-08-21 04:22:28 3.623.817 14% + 2024-08-21T07_03_00/mysql.sql
2024-08-21 04:22:29 1.200.801 14% + 2024-08-21T07_03_00/mythconverg.sql
2024-08-21 04:22:29 207.394 10% + 2024-08-21T07_03_00/performance_schema.sql
2024-08-21 04:22:30 558.597 12% + 2024-08-21T07_03_00/sys.sql
2024-08-22 21:42:42 22.579.990.740 24% + 2024-08-22T16_21_43/homeassistant.sql

$ zpaqfranz i mysqlbackups.zpaq|tail
V00000335 2024-08-15 07:03:00 +00000039 -00000001 -> 15.201.874
V00000336 2024-08-16 07:03:00 +00000039 -00000001 -> 15.467.466
V00000337 2024-08-17 07:03:00 +00000039 -00000001 -> 15.919.177
V00000338 2024-08-18 07:03:00 +00000039 -00000001 -> 13.739.782
V00000339 2024-08-19 07:03:00 +00000039 -00000001 -> 19.408.517
V00000340 2024-08-20 07:03:00 +00000039 -00000001 -> 15.503.969
V00000341 2024-08-21 07:03:00 +00000040 -00000001 -> 17.508.378
V00000342 2024-08-22 07:03:00 +00000039 -00000001 -> 17.267.099
V00000343 2024-08-22 16:21:43 +00000001 -00000001 -> 7.989.341
5.914 seconds (00:00:05) (all OK)

$ /root/zpaqfranz/zpaqfranz t mysqlbackups.zpaq
zpaqfranz v60.6t-JIT-L(2024-08-22)

mysqlbackups.zpaq:
343 versions, 13.076 files, 10.859.405.898 bytes (10.11 GB)
To be checked 4.691.578.957.372 in 12.735 files (8 threads)
7.15 stage time 1798.47 no error detected (RAM ~128.52 MB), try CRC-32 (if any)
Checking 6.367.284 blocks with CRC-32 (4.691.578.957.372 not-0 bytes)
Block 05965K 3.84 TB
CRC-32 time 309.87s
Blocks 4.691.578.957.372 ( 6.367.284)
Zeros 0 ( 0) 0.000000 s
Total 4.691.578.957.372 speed 15.140.474.900/s (14.10 GB/s)
GOOD : 00012735 of 00012735 (stored=decompressed)
VERDICT : OK (CRC-32 stored vs decompressed)
2108.343 seconds (00:35:08) (all OK)

Only one question: I am using the timestamp like this: -timestamp 2024-08-22T16_21_43. The timestamps are set, but I get: Unknown option ignored: 2024-08-22T16_21_43

fcorbelli commented 3 weeks ago

60_6u.zip

You can try this untested pre-release.

kskarlatos commented 3 weeks ago

I think timestamp now works perfectly!

zpaqfranz a /_databases/mysqlbackups.zpaq 2024-08-23T19_46_29/* -comment 2024-08-23T19_46_29 -timestamp 2024-08-23T19_46_29
zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-comment <<2024-08-23T19_46_29>>
franz:-timestamp <<2024-08-23 19:46:29>>
franz:-comment

/_databases/mysqlbackups.zpaq:
281 versions, 465 files, 6.402.773.446 bytes (5.96 GB)
Updating /_databases/mysqlbackups.zpaq at offset 6.402.773.446 + 0
Add 2024-08-23 19:46:29 93 79.025.430.401 ( 73.60 GB) 32T (0 dirs)<<2024-08-23T19_46_29>>
93 +added, 0 -removed.

6.402.773.446 + (79.025.430.401 -> 1.121.444.426 -> 6.459.617.189) = 6.459.617.189 @ 198.85 MB/s
Files added +93
379.005 seconds (00:06:19) (all OK)

Question: is there a way to see how much space a file takes inside an archive, like zpaqfranz i does for versions? zpaqfranz l does not have such an option; it only shows the pre-compression size.

fcorbelli commented 3 weeks ago

Newer zpaqfranz versions already show an estimated compression ratio for every file with the l (list) command, even color-coded (red, green, yellow).

BTW, the -stat switch for the i command shows a good estimate, albeit slowly (for huge archives with many versions).

kskarlatos commented 3 weeks ago

(sorry for the bad formatting)

I think it would be quite helpful if the -stat output for l switch would work the same for the i switch, and showed the uncompressed size of each file in bytes.

Also, the i switch only shows the first 10.000 versions, is there a way to list them all?

╰─ zpaqfranz i fileserver_recompress.zpaq
zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-nocolor

fileserver_recompress.zpaq:
12275 versions, 12.275 files, 10.669.410.908 bytes (9.94 GB)

< Ver > < date > < time > < added > < bytes added >

V00000001 2023-07-10 07:03:01 +00000001 -00000001 -> 9.616
V00000002 2023-07-10 07:03:02 +00000001 -00000001 -> 9.633
. . .
V00009997 2024-05-20 07:03:37 +00000001 -00000001 -> 13.890
V00009998 2024-05-21 07:03:00 +00000001 -00000001 -> 9.696
V00009999 2024-05-21 07:03:01 +00000001 -00000001 -> 9.593
4.696 seconds (00:00:04) (all OK)

╰─ zpaqfranz i fileserver_recompress.zpaq -range ::50
zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-range ::50

fileserver_recompress.zpaq:
Incomplete transaction ignored4 (block/s)
12252 versions, 12.252 files, 10.653.011.660 bytes (9.92 GB)

< Ver > < date > < time > < added > < bytes added >

4.206 seconds (00:00:04) (all OK)

╰─ zpaqfranz i fileserver_recompress.zpaq -range 10050
zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-range 10050
franz:rangefrom (version) 10.050
franz:rangeto (version) 10.050

fileserver_recompress.zpaq:
12255 versions, 12.255 files, 10.669.153.957 bytes (9.94 GB)

< Ver > < date > < time > < added > < bytes added >

4.479 seconds (00:00:04) (all OK)

also, about dates: when i add a file with stdin,i get this warning and the version timestamp is not set to what i want (where does it find the other time?)

zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-comment <<2023-07-11T07_03_01/sys.sql>>
franz:-timestamp <<2023-07-11 07:03:01>>
franz:-summary 1
franz:-comment -nocolor -noeta -stdin

/backups/fileserver_recompress.zpaq:
73 versions, 73 files, 1.588.239.586 bytes (1.48 GB)
Updating /backups/fileserver_recompress.zpaq at offset 1.588.239.586 + 0
Stdin <<2023-07-11T07_03_01/sys.sql>>
Warning: adjusting date from 2023-07-11 07:03:01 to 2023-07-11 07:03:37

1.588.239.586 + (4 -> 63.673 -> 1.588.247.490) = 1.588.247.490 @ 102.00 B/s
Files added +1
0.039 seconds (00:00:00) (all OK)

╰─ zpaqfranz i fileserver_recompress.zpaq -stat |grep 2023-07-11T07_03_01/sys.sql
V00000074 2023-07-11 07:03:37 +00000001 -00000001 281.562 [ 7.904] <<2023-07-11T07_03_01/sys.sql>>

also, the file does not get that timestamp assigned to it, is there a way to set a custom timestamp for a file (not a version)?

╰─ zpaqfranz l fileserver_recompress.zpaq -stat |grep 2023-07-11T07_03_01/sys.sql
2024-08-24 02:10:04 281.562 8% + 2023-07-11T07_03_01/sys.sql

fcorbelli commented 3 weeks ago

I think it would be quite helpful if the -stat output for i switch would work the same for the l switch

Mmmhhh... This would take a lot of space in the output without providing really useful information (it is an ESTIMATE, due to the very complex zpaq archiving method).

Also, the i switch only shows the first 10.000 versions, is there a way to list them all?

It is a hardcoded all=4. I can upscale to 5 (99.999).

also, about dates: when i add a file with stdin,i get this warning and the version timestamp is not set to what i want (where does it find the other time?)

You cannot make a transaction IN THE PAST, but only IN THE FUTURE

also, the file does not get that timestamp assigned to it, is there a way to set a custom timestamp for a file (not a version)?

Yes, but... why?

kskarlatos commented 3 weeks ago

I think it would be quite helpful if the -stat output for i switch would work the same for the l switch

Mmmhhh... This would take a lot of space in the output without providing really useful information (it is an ESTIMATE, due to the very complex zpaq archiving method).

I just want to know how much space my backups take! I am trying to decide which is the most efficient and closer to the way i work way for me to store my mysqldumps. (will write more about that in a bit)

Also, the i switch only shows the first 10.000 versions, is there a way to list them all?

It is a hardcoded all=4. I can upscale to 5 (99.999).

OK. Things like this should be documented in the h section, and in my opinion the user should be able to change them (also a nitpick, i would also like to have an option to trim an archive without writing the captcha)

also, about dates: when i add a file with stdin,i get this warning and the version timestamp is not set to what i want (where does it find the other time?)

You cannot make a transaction IN THE PAST, but only IN THE FUTURE

then what is the use of -timestamp? and in any case it changed the 2023-07-11 07:03:01 to 2023-07-11 07:03:37, where both are in the past.

also, the file does not get that timestamp assigned to it, is there a way to set a custom timestamp for a file (not a version)?

Yes, but... why?

I am storing the whole history of my mysqldumps in one zpaq file and would like to have the original file date stored.

fcorbelli commented 3 weeks ago

I just want to know how much space my backups take! I am trying to decide which is the most efficient and closer to the way i work way for me to store my mysqldumps. (will write more about that in a bit)

It is just the filesize of the archive

Also, the i switch only shows the first 10.000 versions, is there a way to list them all?

Setting it to 8 is enough:

int Jidac::info()
{
    ///flagcomment=true;
    versioncomment="";
    all=8;
    return enumeratecomments();
}
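Reading the numbers quoted in this thread (all=4 goes with a limit of 9.999, and 5 with 99.999), the value of all looks like a digit count for the version number. Under that assumption, which is a reading of the thread and not something checked against the zpaqfranz source, the highest version that can be listed is 10^all - 1. A tiny Python sketch of the arithmetic, with a made-up function name:

```python
def max_version(digits):
    """Largest version number that fits in the given number of digits."""
    return 10 ** digits - 1


for digits in (4, 5, 8):
    print(digits, max_version(digits))
# 4 9999
# 5 99999
# 8 99999999
```

With 12.581 versions in the archive discussed here, 4 digits are not enough, while 5 or 8 are.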

(also a nitpick, i would also like to have an option to trim an archive without writing the captcha)

Surely NOT. Altering an archive is quite a risky thing to do. zpaq works only by ADDING data to the archive. Trimming is way too dangerous: you can lose all of your data very quickly. If you KNOW what you are doing, you can do anything.

also, the file does not get that timestamp assigned to it, is there a way to set a custom timestamp for a file (not a version)?

Past and future is... the timestamp of the very last version is "current". Everything BEFORE is past (not good). Everything after is FUTURE (that's good). The version timestamps MUST BE monotonically increasing. You cannot have version X with timestamp 27 AND version X+1 with timestamp 27 (or 26, or 3). It must be (at least) 27+1.
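The rule can be written down in a few lines. The Python sketch below is only an illustration of the "at least last + 1" rule as described here, not zpaqfranz's actual code. The last-version time of 07:03:36 is a hypothetical value, chosen so that the result reproduces the warning seen earlier in the thread (adjusting 07:03:01 to 07:03:37); the function names are made up.

```python
from datetime import datetime, timedelta

# Folder-style stamp used with -timestamp in this thread.
STAMP_FORMAT = "%Y-%m-%dT%H_%M_%S"


def parse_stamp(text):
    """Parse a stamp such as '2023-07-11T07_03_01'."""
    return datetime.strptime(text, STAMP_FORMAT)


def next_version_time(last_version, requested):
    """Return the requested time, pushed forward if it is not after the last version."""
    earliest_allowed = last_version + timedelta(seconds=1)
    return max(requested, earliest_allowed)


# Hypothetical: timestamp of the newest version already in the archive.
last = datetime(2023, 7, 11, 7, 3, 36)
requested = parse_stamp("2023-07-11T07_03_01")
print(next_version_time(last, requested))  # 2023-07-11 07:03:37
```

A requested time that is already later than the last version passes through unchanged.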

I am storing the whole history of my mysqldumps in one zpaq file and would like to have the original file date stored.

Put the file date in the filename. Or extract them all, touch them all, insert them all.

Altering the timestamp of an already archived file is impossible.

In fact it is not really impossible in every case, but way too complex, too risky, too everything.

If you have the file "pippo.txt" archived with datetime 27, you cannot change it to datetime 28. A brand new transaction is needed to write down newer i-blocks. There is NOT an "index" of archived files in zpaq that you can somehow alter.

kskarlatos commented 3 weeks ago

I just want to know how much space my backups take! I am trying to decide which is the most efficient and closer to the way i work way for me to store my mysqldumps. (will write more about that in a bit)

It is just the filesize of the archive

i mean how much space each file takes :)

Also, the i switch only shows the first 10.000 versions, is there a way to list them all?

Setting it to 8 is enough:

int Jidac::info()
{
  ///flagcomment=true;
  versioncomment="";
  all=8;
  return enumeratecomments();
}

-all gets accepted, but does not change anything:

╰─ zpaqfranz i fileserver_recompress.zpaq -stat -all 8 | head
zpaqfranz v60.6u-JIT-L(2024-08-23)
franz:-all 8
franz:-nocolor -stat

fileserver_recompress.zpaq:
12581 versions, 12.581 files, 10.877.950.693 bytes (10.13 GB)

< Ver > < date > < time > < added > < uncompressed > < compressed >

V00000001 2023-07-10 07:03:01 +00000001 -00000001 72.156 [ 9.616] <<2023-07-10T07_03_01/AnnasVideos116.sql>>
...
V00009999 2024-05-21 07:03:01 +00000001 -00000001 72.377 [ 9.593] <<2024-05-21T07_03_00/AnnasVideos119.sql>>
4.778 seconds (00:00:04) (all OK)

(also a nitpick, i would also like to have an option to trim an archive without writing the captcha)

Surely NOT. Altering an archive is quite a risky thing to do. zpaq works only by ADDING data to the archive. Trimming is way too dangerous: you can lose all of your data very quickly. If you KNOW what you are doing, you can do anything.

Yes, I know. I just wanted to use it for scripting and do not want to pipe the captcha. Maybe a --force-i-am-sure-i-will-lose-data switch :)

also, the file does not get that timestamp assigned to it, is there a way to set a custom timestamp for a file (not a version)?

Past and future is... the timestamp of the very last version is "current". Everything BEFORE is past (not good). Everything after is FUTURE (that's good). The version timestamps MUST BE monotonically increasing. You cannot have version X with timestamp 27 AND version X+1 with timestamp 27 (or 26, or 3). It must be (at least) 27+1.

aah ok i understand now.

I am storing the whole history of my mysqldumps in one zpaq file and would like to have the original file date stored.

Put the file date in the filename. Or extract them all, touch them all, insert them all. Altering the timestamp of an already archived file is impossible. If you have the file "pippo.txt" archived with datetime 27, you cannot change it to datetime 28. A brand new transaction is needed to write down newer i-blocks. There is NOT an "index" of archived files in zpaq.

no, i just want to set the creation date for a piped file before it is saved in the archive.

fcorbelli commented 3 weeks ago

i mean how much space each file takes :)

You can't, because this information does not exist in zpaq. You have to run something very slow, very lengthy, to get this information. In a previous post I already explained "why". But the short version is: there is NOT. I could store an exact number, but this would require a LOT of complex work => not gonna happen.

-all gets accepted, but does not change anything:

It is hardcoded to 4. Changing the source code is necessary

no, i just want to set the creation date for a piped file before it is saved in the archive.

It is doable, but not really a thing

kskarlatos commented 3 weeks ago

i mean how much space each file takes :)

You can't
Because this information does not exist in zpaq
You have to run something very slow, very lengthy, to get this information
In previous post I already explained "why". But the short version is: there is NOT
I can store an exact number, but this will require a LOT of complex work => not gonna happen

ok i understand

-all gets accepted, but does not change anything:

It is hardcoded to 4. Changing the source code is necessary

Ok will do.

no, i just want to set the creation date for a piped file before it is saved in the archive.

It is doable, but not really a thing

Ok. its just a nice to have thing, not really needed or anything, so no worries. In any case zpaqfranz is amazing, thank you again for your dedication and excellent work here!

fcorbelli commented 3 weeks ago

60_6w.zip
In this pre-release you can use -touch

type 1.txt|zpaqfranz a z:\2.zpaq 1.sql -stdin -touch 2024_02_03
kskarlatos commented 3 weeks ago

It works perfectly! thanks for changing to all=8 :)

btw crop doesnt work for over 9.999

zpaqfranz crop fileserver_recompress.zpaq -until 12581 -kill -force
..
-until 12.581 too big (> maxversion 9.999) => exit

fcorbelli commented 3 weeks ago

Another hardcoded 4

    vector<uint64_t> version_position;
    version_position.push_back(0);
    all                 =4;
    int64_t wheretotrim =0;
    int errors          =0;
    int64_t csize=read_archive(NULL,archive.c_str(),&errors,1);
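
A rough sketch of why 12581 is rejected, under the assumption that `all` is the number of decimal digits used for version numbers (the 9.999 limit with `all=4` suggests this, but the thread does not spell it out). `max_version` is a hypothetical helper, not a zpaqfranz function.

```shell
#!/bin/sh
# Assumption: a version counter with N decimal digits tops out at 10^N - 1.
# max_version is a hypothetical helper, not taken from the zpaqfranz source.
max_version() {
    digits=$1
    limit=1
    i=0
    while [ "$i" -lt "$digits" ]; do
        limit=$((limit * 10))
        i=$((i + 1))
    done
    echo $((limit - 1))
}

echo "4 digits: up to $(max_version 4)"
echo "8 digits: up to $(max_version 8)"
```

With 4 digits the ceiling is below 12581, with 8 digits it is far above it.
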
kskarlatos commented 3 weeks ago

thanks, just recompiled and it works perfectly.

kskarlatos commented 3 weeks ago

About my mysqldumps. I backup to one file per database and one file containing all databases via mariabackup --backup --stream=xbstream

I have discovered 2 different ways to use zpaqfranz:

1) mysqldump all db files to a folder, (named by date, ie 2024-08-24T19_15_51) and then add this folder to zpaqfranz. this minimizes the number of versions, adding to the archive is pretty quick and i can store everything neatly in one place. The problem with this method is that the dumped databases cannot use compression, so they take tons of space and make a huge amount of writes to the SSD (50-100GB each time). This can be minimized by using a compressed ramdisk, but it still needs quite a bit of RAM.

2) use pipes
Using -stdin is very fast, does not need large amounts of RAM and in general is pretty neat (thanks again for fixing the minor issues i found!) The only problem for my use is that every db must be stored as a different version. This quickly adds to > 10.000 versions, so adding a new file takes a long time (that keeps increasing) and i think this will be more and more problematic as i add databases.

Is there a way to pipe multiple files to a single version? I have no idea about how zpaq works, but maybe start zpaqfranz, create a new version, store the first file via pipe but leave the version "open", then add as many files as needed to the same version, and when finished "close/lock" that version. Or maybe some other way.

If that is possible then it will be a dream come true! Thanks again for your time and for creating this amazing program!

kskarlatos commented 3 weeks ago

another possible way is to allow multiple files to be added as arguments via -stdin, and then use named pipes, or process substitution.

kskarlatos commented 3 weeks ago

can something like this be made to work? (i havent managed to make anything yet)

zpaqfranz a mysqlbackups.zpaq <(mysqldump db1) <(mysqldump db2)

or
mkfifo db1.sql
mkfifo db2.sql
mysqldump db1 > db1.sql
mysqldump db2 > db2.sql
zpaqfranz a mysqlbackups.zpaq db1.sql db2.sql
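
One general point about named pipes, independent of zpaqfranz: opening a FIFO for writing blocks until a reader opens the other end, so the writers have to run in the background before the reader starts. The sketch below uses stand-ins only (`printf` in place of mysqldump, `cat` in place of the archiver). Whether zpaqfranz accepts a FIFO as a file argument is not established in this thread.

```shell
#!/bin/sh
# Stand-ins: printf plays the role of mysqldump, cat plays the role of
# a program that reads its file arguments one after another.
dir=$(mktemp -d)
mkfifo "$dir/db1.sql" "$dir/db2.sql"

# Writers go to the background: each one blocks on opening its FIFO
# until a reader opens the other end.
printf 'dump of db1\n' > "$dir/db1.sql" &
printf 'dump of db2\n' > "$dir/db2.sql" &

# The reader opens the pipes in order, which unblocks the writers in turn.
result=$(cat "$dir/db1.sql" "$dir/db2.sql")
wait
rm -r "$dir"

printf '%s\n' "$result"
```

Without the trailing `&`, the first writer would wait forever for a reader that is never started.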

fcorbelli commented 3 weeks ago

Using a giant .zpaq to store everything is not a good idea, much better to store every database in a different .zpaq with 2 methods for each db, plus a "grand total" (aka: the doomsday zpaq)

1) mydumper
This will make different files for different tables.
Much easier to restore a single table
2) the mysqldump |

3) mariabackup --backup
4) mysqldump --all-databases

Because remember: you cannot DELETE something. Then your db will live forever. With different files you can quickly delete the db "myvideoarchive" by removing "myvideoarchive.zpaq"

You can even use a zfs-deduplicated storage for temporary backups, then put them inside zpaqfranz. This will greatly reduce the write wear
I do this way for full .vmdk backups

fcorbelli commented 3 weeks ago

And no, you cannot discriminate stdin into files
Because stdin is not a file, it is just a stream of bytes, one after another
Yes it is called stdin, seems a file, has a file descriptor, a file handle
But this is not a "file" (for example, does not have a length)
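
A small way to see the difference from the shell, as a sketch only: `stdin_kind` is a hypothetical helper, and it assumes a system where `/dev/stdin` is available (Linux, macOS, the BSDs). A file on disk has a size that can be looked up; a pipe does not.

```shell
#!/bin/sh
# Hypothetical helper: report what kind of object stdin is connected to.
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    else
        echo "other"
    fi
}

tmp=$(mktemp)
printf 'hello' > "$tmp"

from_file=$(stdin_kind < "$tmp")            # stdin redirected from a file on disk
from_pipe=$(printf 'hello' | stdin_kind)    # stdin fed by another program
size_on_disk=$(wc -c < "$tmp" | tr -d ' ')  # byte count of the file on disk
rm -f "$tmp"

echo "redirect: $from_file, $size_on_disk bytes on disk"
echo "pipe:     $from_pipe, length unknown until the writer closes it"
```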

fcorbelli commented 3 weeks ago

BTW you can -freeze a zpaq and start again, automagically, if you really want a single big archive

kskarlatos commented 3 weeks ago

Using a giant .zpaq to store everything is not a good idea, much better to store every database in a different .zpaq with 2 methods for each db, plus a "grand total" (aka: the doomsday zpaq)

1. mydumper
   This will make different files for different tables.
   Much easier to restore a single table

easy to restore tables is interesting, thanks for bringing it to my attention!

2. the mysqldump |

3. mariabackup --backup

4. mysqldump --all-databases

so you make 1 zpaq with 1 and 2 for each db, and one zpaq with 3 and 4 for everything? so for n databases you have n+1 zpaqs?

Because remember: you cannot DELETE something. Then your db will live forever. With different files you can quickly delete the db "myvideoarchive" by removing "myvideoarchive.zpaq"

In my use case i dont care about deleting a single db. making everything take less space (and have fast backups) is a bigger priority. with one file for every database i also get the dedup gain when many dbs are the same or similar.

You can even use a zfs-deduplicated storage for temporary backups, then put them inside zpaqfranz. This will greatly reduce the write wear
I do this way for full .vmdk backups

I havent yet explored the zfs stuff of zpaqfranz, i will take a look when i can. for my mysqldumps, a btrfs fs with zstd:10 mounted on a zram seems to work quite well.

kskarlatos commented 3 weeks ago

And no, you cannot discriminate stdin into files
Because stdin is not a file, it is just a stream of bytes, one after another
Yes it is called stdin, seems a file, has a file descriptor, a file handle
But this is not a "file" (for example, does not have a length)

maybe multiple stdin then? something like:
zpaqfranz a myfile.zpaq -filename db1.sql <(mysqldump db1) -filename db2.sql <(mysqldump db2) -stdinmultiple