fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
MIT License

Windows x/extract Creates Subfolders When Using -longpath #86

Open EpicGazel opened 9 months ago

EpicGazel commented 9 months ago

Unlike how the e command extracts (or at least claims to) properly with longpath on Windows, the x command does not when using -longpath. Instead, subfolders are created starting after the path specified in -to. Thus, extracting G:\.minecraft\screenshots -to "B:\screenshots" goes to "B:\screenshots\G_\.minecraft\screenshots" instead of the expected "B:\screenshots".

Normal extract without -longpath:

> zpaqfranz.exe x .\g_drive.zpaq "G:\.minecraft\screenshots" -to "B:\normal\screenshots"
zpaqfranz v58.11z-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-11-10)
franz:-hw
./g_drive.zpaq:
4 versions, 99.499 files, 21.171.493.883 bytes (19.72 GB)
Long filenames (>255)         12 *** WARNING *** (suggest -longpath or -fix255 or -flat)
Extract 28.880.896 bytes (27.54 MB) in 39 files (1 folders) / 16 T
        28.69% 00:00:00  (   7.90 MB)=>(  27.54 MB)    7.90 MB/sec

0.500 seconds (000:00:00) (all OK)
> tree /F B:\normal
Folder PATH listing for volume Backup
Volume serial number is C264-09CB
B:\NORMAL
└───screenshots
        2019-05-09_21.57.51.png
        2019-05-09_21.58.07.png
        2019-05-16_20.24.57.png
        2019-05-16_22.16.22.png
        2019-05-16_22.36.01.png
        2019-05-16_22.36.02.png
        2019-10-06_22.15.05.png
        2019-10-12_17.00.21.png
        2019-10-12_19.26.00.png
        2019-10-12_19.26.07.png
        2019-10-12_19.26.11.png
        2019-10-12_19.26.29.png
        2019-10-12_19.26.33.png
        2019-10-13_01.48.38.png
        2019-10-13_01.56.28.png
        2020-06-07_06.10.02.png
        2020-06-07_06.16.05.png
        2020-06-07_06.16.08.png
        2020-06-07_06.17.13.png
        2020-06-07_06.29.43.png
        2020-06-07_09.46.08.png
        2020-06-07_09.46.11.png
        2020-06-19_13.31.33.png
        2020-06-23_12.53.30.png
        2020-06-23_13.03.37.png
        2020-07-01_02.41.13.png
        2020-07-01_02.43.33.png
        2020-07-01_03.02.32.png
        2020-07-01_03.03.14.png
        2020-07-01_03.05.56.png
        2020-07-01_03.30.33.png
        2020-07-01_05.04.23.png
        2020-07-01_06.33.33.png
        2020-07-01_06.34.51.png
        2020-07-01_06.34.58.png
        2020-07-01_06.35.00.png
        2020-11-12_05.26.56.png
        2020-11-13_08.26.09.png
        2020-11-13_08.26.26.png

-longpath extract:

> zpaqfranz.exe x .\g_drive.zpaq "G:\.minecraft\screenshots" -to "B:\longpath\screenshots" -longpath
zpaqfranz v58.11z-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-11-10)
franz:-hw -longpath
31876: INFO: setting Windows' long filenames
./g_drive.zpaq:
4 versions, 99.499 files, 21.171.493.883 bytes (19.72 GB)
Extract 28.880.896 bytes (27.54 MB) in 39 files (1 folders) / 16 T
        28.69% 00:00:00  (   7.90 MB)=>(  27.54 MB)    7.90 MB/sec

0.516 seconds (000:00:00) (all OK)
> tree /F B:\longpath
Folder PATH listing for volume Backup
Volume serial number is C264-09CB
B:\LONGPATH
└───screenshots
    └───G_
        └───.minecraft
            └───screenshots
                    2019-05-09_21.57.51.png
                    2019-05-09_21.58.07.png
                    2019-05-16_20.24.57.png
                    2019-05-16_22.16.22.png
                    2019-05-16_22.36.01.png
                    2019-05-16_22.36.02.png
                    2019-10-06_22.15.05.png
                    2019-10-12_17.00.21.png
                    2019-10-12_19.26.00.png
                    2019-10-12_19.26.07.png
                    2019-10-12_19.26.11.png
                    2019-10-12_19.26.29.png
                    2019-10-12_19.26.33.png
                    2019-10-13_01.48.38.png
                    2019-10-13_01.56.28.png
                    2020-06-07_06.10.02.png
                    2020-06-07_06.16.05.png
                    2020-06-07_06.16.08.png
                    2020-06-07_06.17.13.png
                    2020-06-07_06.29.43.png
                    2020-06-07_09.46.08.png
                    2020-06-07_09.46.11.png
                    2020-06-19_13.31.33.png
                    2020-06-23_12.53.30.png
                    2020-06-23_13.03.37.png
                    2020-07-01_02.41.13.png
                    2020-07-01_02.43.33.png
                    2020-07-01_03.02.32.png
                    2020-07-01_03.03.14.png
                    2020-07-01_03.05.56.png
                    2020-07-01_03.30.33.png
                    2020-07-01_05.04.23.png
                    2020-07-01_06.33.33.png
                    2020-07-01_06.34.51.png
                    2020-07-01_06.34.58.png
                    2020-07-01_06.35.00.png
                    2020-11-12_05.26.56.png
                    2020-11-13_08.26.09.png
                    2020-11-13_08.26.26.png
fcorbelli commented 9 months ago

This is expected behavior. It is needed because of the -all option, LAN shares (e.g. \\pippo\something), *nix paths starting with /, and multiple "." components.

You can see it with the -debug switch; look for the 25787 tag.

When running with -longpath, the filenames are "sanitized" this way:

if (name.size()>2) // fix for *nix relative paths
    if (name[0]=='.')
        if (name[1]=='/')
            name[0]='_';
myreplaceall(name,"/./","/_/");
myreplaceall(name,"/../","/__/");
if (iswindowsunc(name))
{
    replace(name,"//","__");
    replace(name,"/","_");
}
if (name.size()>2)
    if (isalpha(name[0]))
        if (name[1]==':')
            if (name[2]=='/')
                name[1]='_'; /// THIS IS YOUR CASE

string finale=includetrailingbackslash(tofiles[0])+name;
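To make the effect of that snippet concrete, here is a minimal, self-contained sketch of the dot-segment and drive-letter steps. It is not the zpaqfranz source: `replace_all`, `sanitize_stored_name` and `destination` are hypothetical stand-ins for `myreplaceall`, the inline logic and `includetrailingbackslash(tofiles[0])+name`, and the UNC branch is left out because the behaviour of `iswindowsunc` is not shown in the thread.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for myreplaceall(): replace every non-overlapping
// occurrence of `from` with `to`, scanning left to right.
static void replace_all(std::string& s, const std::string& from, const std::string& to) {
    if (from.empty()) return;
    for (std::size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos; pos += to.size())
        s.replace(pos, from.size(), to);
}

// Sketch of the -longpath sanitizing steps quoted above (UNC handling omitted).
std::string sanitize_stored_name(std::string name) {
    if (name.size() > 2 && name[0] == '.' && name[1] == '/')
        name[0] = '_';                          // "./x"  -> "_/x"
    replace_all(name, "/./", "/_/");            // "a/./b"  -> "a/_/b"
    replace_all(name, "/../", "/__/");          // "a/../b" -> "a/__/b"
    if (name.size() > 2 && std::isalpha(static_cast<unsigned char>(name[0]))
        && name[1] == ':' && name[2] == '/')
        name[1] = '_';                          // "G:/x" -> "G_/x"
    return name;
}

// Assumed equivalent of includetrailingbackslash(to) + name: make sure the
// -to folder ends with a separator, then append the sanitized stored name.
std::string destination(std::string to, const std::string& name) {
    if (!to.empty() && to.back() != '/' && to.back() != '\\') to += '/';
    return to + sanitize_stored_name(name);
}
```

Under these assumptions, `destination("B:/screenshots", "G:/.minecraft/screenshots/a.png")` yields `B:/screenshots/G_/.minecraft/screenshots/a.png`, which matches the `G_` subfolder seen in the report.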
fcorbelli commented 9 months ago

Finally, you can use the -find switch to extract wherever you want.

Suppose you have z:\1.zpaq with something like this inside:

- 2023-12-12 13:40:23                   0 DA    c:/zpaqfranz/
- 2023-10-26 20:00:35                   0 D     c:/zpaqfranz/.github/
- 2023-10-26 20:00:35                   0 D     c:/zpaqfranz/.github/workflows/
- 2023-03-04 23:34:50                 844 A     c:/zpaqfranz/.github/workflows/github_actions_build.yml
- 2023-03-04 23:34:50                 427 A     c:/zpaqfranz/.gitignore
- 2023-03-04 23:34:50              12.618 A     c:/zpaqfranz/.travis.yml
- 2021-04-15 11:31:59                  52 A     c:/zpaqfranz/0.bat
- 2023-10-26 20:00:35                   0 D     c:/zpaqfranz/00000001/
- 2022-09-08 11:25:30                 798 A     c:/zpaqfranz/00000001/1.txt
- 2022-08-25 16:50:14              29.581 A     c:/zpaqfranz/00000001/3.txt
- 2022-08-25 16:50:23              22.307 A     c:/zpaqfranz/00000001/4.txt
- 2022-08-12 16:33:28              13.050 A     c:/zpaqfranz/00000001/cpuz.txt
- 2022-08-14 14:42:05                 153 A     c:/zpaqfranz/00000001/lavoretti.txt

Note the stored path (c:/zpaqfranz/ in this example).
Extract with -longpath into "z:\default":

zpaqfranz x z:\1.zpaq -to z:\default -longpath

And you'll get this:

C:\zpaqfranz>tree z:\default
Folder PATH listing for volume RamDisk
Volume serial number is 00000073-2423:F92F
Z:\DEFAULT
└───c_
    └───zpaqfranz
        ├───.github
        │   └───workflows
        ├───00000001
        ├───00000002
        ├───00000003

Now "cut" the stored path (c:/zpaqfranz/ in this example) with -find. Because no -replace "something" is given, the matched text is simply removed:

zpaqfranz x z:\1.zpaq -to z:\your -longpath -find "c:/zpaqfranz/"

And the result

Z:\YOUR
├───.github
│   └───workflows
├───00000001
├───00000002
├───00000003
(...)

Remember that the paths stored in a zpaq archive can originate from different systems. That is why there are two switches, -find and -replace, to manipulate paths.

You can understand the problem even better if you use the -all switch:

zpaqfranz x z:\1.zpaq -to z:\all_default -all -longpath

becomes
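The "cut" performed by -find can be pictured as a plain string substitution on each stored path. The sketch below is only an illustration under that assumption: `apply_find_replace` is a hypothetical helper, not a zpaqfranz function, and whether the real tool rewrites the first match or every match is not stated in this thread.

```cpp
#include <string>

// Hypothetical helper: substitute the first occurrence of `find` inside a
// stored path with `replace`. With an empty `replace` (i.e. -find given
// without -replace) the matched text is simply cut out.
std::string apply_find_replace(std::string stored,
                               const std::string& find,
                               const std::string& replace = "") {
    if (find.empty()) return stored;            // nothing to look for
    const std::size_t pos = stored.find(find);
    if (pos != std::string::npos)
        stored.replace(pos, find.size(), replace);
    return stored;                              // unchanged when there is no match
}
```

For example, `apply_find_replace("c:/zpaqfranz/0.bat", "c:/zpaqfranz/")` gives `0.bat`, so the file lands directly under the -to folder. Note that the stored paths use forward slashes, so the pattern is compared in that form.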

Z:\ALL_DEFAULT
└───0001
    └───c
        └───zpaqfranz
            ├───.github
            │   └───workflows
            ├───00000001
            ├───00000002
            ├───00000003

Hope this solves your problem

EpicGazel commented 9 months ago

Thank you for your help! I found it necessary to still include the original path to be extracted, but this solved my issue.

> zpaqfranz x .\g_drive.zpaq "G:\.minecraft\screenshots\" -to "B:\screenshots\" -longpath -find "G:\.minecraft\screenshots\"
zpaqfranz v58.11z-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-11-10)
franz:-find                 <<G:/.minecraft/screenshots/>>
franz:-hw -longpath
31876: INFO: setting Windows' long filenames
./g_drive.zpaq:
4 versions, 99.499 files, 21.171.493.883 bytes (19.72 GB)
00000001 ?existing files skipped (-force overwrites).
Extract 28.880.896 bytes (27.54 MB) in 39 files (1 folders) / 16 T
        26.00% 00:00:00  (   7.16 MB)=>(  27.54 MB)    7.16 MB/sec

0.500 seconds (000:00:00) (all OK)
> tree /F B:\screenshots\
Folder PATH listing for volume Backup
Volume serial number is C264-09CB
B:\SCREENSHOTS
    2019-05-09_21.57.51.png
    2019-05-09_21.58.07.png
    2019-05-16_20.24.57.png
    2019-05-16_22.16.22.png
    2019-05-16_22.36.01.png
    2019-05-16_22.36.02.png
    2019-10-06_22.15.05.png
    2019-10-12_17.00.21.png
    2019-10-12_19.26.00.png
    2019-10-12_19.26.07.png
    2019-10-12_19.26.11.png
    2019-10-12_19.26.29.png
    2019-10-12_19.26.33.png
    2019-10-13_01.48.38.png
    2019-10-13_01.56.28.png
    2020-06-07_06.10.02.png
    2020-06-07_06.16.05.png
    2020-06-07_06.16.08.png
    2020-06-07_06.17.13.png
    2020-06-07_06.29.43.png
    2020-06-07_09.46.08.png
    2020-06-07_09.46.11.png
    2020-06-19_13.31.33.png
    2020-06-23_12.53.30.png
    2020-06-23_13.03.37.png
    2020-07-01_02.41.13.png
    2020-07-01_02.43.33.png
    2020-07-01_03.02.32.png
    2020-07-01_03.03.14.png
    2020-07-01_03.05.56.png
    2020-07-01_03.30.33.png
    2020-07-01_05.04.23.png
    2020-07-01_06.33.33.png
    2020-07-01_06.34.51.png
    2020-07-01_06.34.58.png
    2020-07-01_06.35.00.png
    2020-11-12_05.26.56.png
    2020-11-13_08.26.09.png
    2020-11-13_08.26.26.png

No subfolders exist.

And for anyone who has come here looking for single-file extraction, try: zpaqfranz x .\g_drive.zpaq "G:\.minecraft\screenshots\2019-05-09_21.57.51.png" -to "B:\screenshots" -longpath -find "G:\.minecraft\screenshots\"

Pay particular attention to dropping the trailing "\" on the -to path, and to using -find on the parent directory only; otherwise it won't work.

fcorbelli commented 9 months ago

Two notes.

The first is the use of the " (double quotes). In general it is good and right, and basically necessary on Linux.

The second, which I forgot to mention, is WHY.
The problem lies in the possibility of having different drive paths. Suppose you get an archive with something like this inside:

c:/zpaqfranz/pippo.txt (1KB)
e:/zpaqfranz/pippo.txt (2KB)

This is a perfectly "legit" zpaq archive. How do you extract those two "pippo.txt" files into a single -to "z:\somewhere\"?

z:\somewhere\pippo.txt (1KB)
z:\somewhere\pippo.txt (2KB)

Not good. The same (even worse) problem arises with different versions.
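The collision can be shown with a few lines of code. This is a sketch, not zpaqfranz logic: `flatten_targets` models the behaviour the report expected in a simplified way (keep only the file name under -to), while `longpath_targets` keeps the whole stored path with "X:" rewritten to "X_". Both function names are made up for the illustration.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Count how many stored paths end up on each destination when only the
// file name is kept under the -to folder (simplified "flatten" model).
std::map<std::string, int> flatten_targets(const std::string& to,
                                           const std::vector<std::string>& stored) {
    std::map<std::string, int> hits;
    for (const std::string& s : stored) {
        const std::size_t slash = s.find_last_of('/');
        const std::string leaf = (slash == std::string::npos) ? s : s.substr(slash + 1);
        ++hits[to + leaf];
    }
    return hits;
}

// Same count, but keeping the full stored path and turning "c:/" into "c_/".
std::map<std::string, int> longpath_targets(const std::string& to,
                                            const std::vector<std::string>& stored) {
    std::map<std::string, int> hits;
    for (std::string s : stored) {
        if (s.size() > 2 && std::isalpha(static_cast<unsigned char>(s[0]))
            && s[1] == ':' && s[2] == '/')
            s[1] = '_';
        ++hits[to + s];
    }
    return hits;
}
```

With the two stored files above and -to "z:/somewhere/", the flattened model maps both to `z:/somewhere/pippo.txt` (count 2, so one overwrites the other), while the second model gives two distinct targets under `c_/` and `e_/`.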

It is possible (and I have thought about it) to make an "intelligent" examiner that would behave differently depending on the circumstances (collisions or not). There is still some vestige of it in the code. There are so many cases, though (including \\theserver\theshare\thefolder), that it becomes really, really complicated to make it reliable, that is, to make it operate in the same way, predictably (for better or worse). So I made this decision.

Finally, I note that this behaviour is DIFFERENT from zpaq's.

Why does -to alone not work this way? For reasons of (almost) drop-in substitutability: using -to, zpaqfranz basically works like zpaq. Using -longpath (a switch specific to my software) instead changes the logic, for the reasons specified above. On the other hand, if you write -longpath you are using zpaqfranz, not zpaq :)

Thank you for the report, which allows me to give some useful information. I invite you, if you have not already done so, to leave a GitHub star and maybe a review on SourceForge. I would appreciate it.