fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

What do those numbers mean? #59

Closed ambanmba closed 1 year ago

ambanmba commented 1 year ago

When running a job, the macOS version shows this, for example:

Add 2023-07-08 23:59:33 875.279 612.179.971.835 ( 570.14 GB) 12T (103.156 dirs)
Long filenames (>255) 67.786
 (031%)  84.64% 04:34:11  ( 482.57 GB)->(  93.80 GB)=>( 110.82 GB)    5.45 MB/sec

Would be nice to know what all those numbers mean. Some you can figure out (12 Threads, 5.45 MB/sec) but what about the 482.57GB->93.8GB=>110.82GB part?

fcorbelli commented 1 year ago

You are right, an explanation would be helpful.
By default there are 2 or 3 lines.

The first:
Add 2023-07-08 23:59:33 875.279 612.179.971.835 ( 570.14 GB) 12T (103.156 dirs)

The second (optional):
Long filenames (>255) 67.786

The third:
 (031%)  84.64% 04:34:11  ( 482.57 GB)->(  93.80 GB)=>( 110.82 GB)    5.45 MB/sec

You can change the output with -verbose, -pakka, -noeta, -silent, or -debug.

BTW your speed seems slow. Maybe you are reading from a Samba-powered NAS, or using a placebo-level compression such as -m5?
Don't.
When working on big archives (say >>10 GB) I suggest not changing the compression level: leave the default, -m1. Sometimes you can use -m2, but don't go higher.
If you read (on a Mac) from a Linux-based NAS (for example a QNAP or whatever), do NOT use Windows sharing (the SMB service); prefer NFS (much faster). Of course you need to enable the NFS share, mount it on the Mac, etc.

I would very much appreciate any suggestions

ambanmba commented 1 year ago

Thanks for the detailed description. I was just having a play with different settings and yes, -m5 is crazy slow.

This is all being done on a locally attached SSD so no NFS, but good advice on NFS vs. SMB.

How do you calculate the projected size? Is it based on the average compression ratio achieved so far? I'm not quite sure how it could be displayed better, maybe: Total: XXXX -> Progress: XXXX => Compressed: XXX => Projected: XXX

fcorbelli commented 1 year ago

I strongly suggest avoiding -m5; just use the default (-m1).

The projected size is a linear projection, nothing really smart.
I use it essentially to get a reasonable estimate of how much the destination path will fill up.
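As a rough illustration of what a linear projection means here, the sketch below divides the compressed size so far by the fraction of input already processed. This is an assumption inferred from the numbers in the sample line, not taken from the zpaqfranz source; the function name `projected_size` is hypothetical.

```python
def projected_size(compressed_so_far: float, fraction_done: float) -> float:
    """Linearly extrapolate the final archive size.

    Assumes the compression ratio seen so far holds for the remaining data.
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return compressed_so_far / fraction_done


# Figures taken from the sample progress line in this thread (all in GB).
total_gb = 570.14       # total input size
processed_gb = 482.57   # input processed so far
compressed_gb = 93.80   # archive size written so far

fraction = processed_gb / total_gb
projection = projected_size(compressed_gb, fraction)
print(f"{fraction:.2%} done, projected {projection:.2f} GB")
```

With the sample figures this lands within 0.01 GB of the 110.82 GB shown after the `=>`, which is consistent with a plain linear extrapolation.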

It is not easy to pack as much information as possible into ~70 chars (a single line). I can do multi-line output, BUT this would "mess up" scripted output.

My idea is to give as much feedback as possible that is useful during execution.

For example, during -image or -stdin the information shown is different.
This is the best I have been able to think of so far, but there is room for improvement.

For example, the update is done every time there is a change in the ETA. In other cases it is done every second. If archiving takes a short time, every second is better. If it lasts a long time, every ETA change is better.
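The two refresh policies described above could be sketched roughly as follows. This is only an illustration: the `ProgressThrottle` class, the `format_eta` helper, and the 600-second cutoff between "short" and "long" jobs are all hypothetical, and the thread does not say how zpaqfranz actually chooses between the two modes.

```python
from dataclasses import dataclass


def format_eta(seconds: float) -> str:
    """Render a remaining-time estimate as HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


@dataclass
class ProgressThrottle:
    long_job_seconds: float = 600.0   # hypothetical cutoff, not from the tool
    last_time: float = float("-inf")  # time of the last printed update
    last_eta: str = ""                # ETA text of the last printed update

    def should_print(self, now: float, eta_seconds: float) -> bool:
        eta_text = format_eta(eta_seconds)
        if eta_seconds >= self.long_job_seconds:
            # Long job: refresh only when the displayed ETA changes.
            due = eta_text != self.last_eta
        else:
            # Short job: refresh at most once per second.
            due = now - self.last_time >= 1.0
        if due:
            self.last_time = now
            self.last_eta = eta_text
        return due


throttle = ProgressThrottle()
for now, eta in [(0.0, 16451), (0.2, 16451), (0.4, 16450)]:
    if throttle.should_print(now, eta):
        print(format_eta(eta))
```

Keeping the decision in one small object means the caller can invoke it on every processed block without worrying about flooding the terminal.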

I personally think that "my" progress display is definitely better than that of RAR, 7z, or zpaq, but that's just an opinion.

fcorbelli commented 1 year ago

I am closing this, but I am ready for any suggestions (remember: one line only).