Yutaka-Sawada / MultiPar

Parchive tool

Using "Split Files" option in GUI causes error to be reported #117

Closed AreteOne closed 9 months ago

AreteOne commented 9 months ago

I create monthly full backups of various sets of my data. Each backup is then compressed into a .7z archive split into 250MB pieces. After verifying the integrity of the archive, I create PAR2 files at 10% redundancy for these pieces.

Since some of these data sets are > 100GB, a PAR2 file can be several GBs in size itself. Because losing one of these would greatly reduce the ability to rebuild an archive if needed, I selected the "Split Files" option and set the limit to 100MB. This creates a comparatively large number of PAR2 files, so losing one or two of them wouldn't impact being able to repair the archive.

However, every time I've generated PAR2 files since selecting this option, when the process finishes, it reports an error as shown on the attached screen cap. Closing MultiPar and then testing the PAR2s reports everything's fine. I also deleted one of the pieces of an archive, and MultiPar successfully rebuilt it, as desired. So, it appears that there's not really an error, but rather a false report of one.

The other thing that happens now is that instead of opening to the last folder used to select files for creating PAR2s, it now opens to the install directory. My guess would be that because the creation ended with an error, the program decides it can't save the last directory as where to open the next time a user wants to add files.

I've attached screen caps of the last PARs set I created, but this has happened with very large data sets, too, and has happened every time since selecting the "Split Files" option so I can keep the size of each PAR2 decently small to spread out the risk of losing one.

Screenshot 2024-02-08 081203 Screenshot 2024-02-08 081323 Screenshot 2024-02-08 081538

Yutaka-Sawada commented 9 months ago

Thank you for the bug report. Your case is a double split. I made par2j refuse to split files that were already split. Because you had already split the files with 7-Zip, par2j failed when it tried to split them again.

For example, suppose there is an original archive file "something.7z". You split the archive with 7-Zip into "something.7z.001", "something.7z.002", and "something.7z.003". When MultiPar found such filenames with a numerical extension, it returned an error. This error was added to avoid overwriting other source files with split files.

Case without over-write problem:
"something.7z.001" -> "something.7z.001.001", "something.7z.001.002"
"something.7z.002" -> "something.7z.002.001", "something.7z.002.002"
"something.7z.003" -> "something.7z.003.001", "something.7z.003.002"

Case with over-write problem:
"something.7z" -> "something.7z.001", "something.7z.002" <- same filenames !
"something.7z.001" -> "something.7z.001.001", "something.7z.001.002"
"something.7z.002" -> "something.7z.002.001", "something.7z.002.002"
"something.7z.003" -> "something.7z.003.001", "something.7z.003.002"
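The over-write check described in the two cases above can be sketched in a few lines of Python. This is only an illustration of the idea, not par2j's actual code: the function names `split_names` and `find_collisions` are hypothetical, and the three-digit ".001" suffix scheme is assumed from the filenames in the examples.

```python
def split_names(source: str, parts: int) -> list[str]:
    """Names a splitter would produce for `source`: name.001, name.002, ..."""
    return [f"{source}.{i:03d}" for i in range(1, parts + 1)]


def find_collisions(sources: list[str], parts: int) -> set[str]:
    """Return split output names that would overwrite an existing source file."""
    existing = set(sources)
    collisions = set()
    for src in sources:
        collisions.update(n for n in split_names(src, parts) if n in existing)
    return collisions


# Case without over-write problem: only the 7-Zip pieces are source files.
safe = ["something.7z.001", "something.7z.002", "something.7z.003"]
# Case with over-write problem: the unsplit archive is a source file as well.
unsafe = ["something.7z"] + safe

print(find_collisions(safe, 2))            # set()
print(sorted(find_collisions(unsafe, 2)))  # ['something.7z.001', 'something.7z.002']
```

In the second case, splitting "something.7z" would produce names that already belong to other source files, which is the situation the error is meant to prevent.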

I improved the filename checker. It will now detect bad filenames correctly. I put the new sample par2j in the "alpha" directory on GitHub. Please test with it. However, some file-splitting applications may not recognize a double split and will fail to join the files. In my test, 7-Zip seems to be able to join them correctly.

AreteOne commented 9 months ago

I downloaded three files from the alpha directory: MultiPar.exe, par2j.exe, and par2j64.exe, and overwrote the files of the same name in the install directory with these three. Now when I attempt to run the program, I get a message from Windows 11 saying "This app can't run on your PC".

Looking at the shortcut, it's attempting to run MultiPar.exe. Based on time stamps on the files in the alpha directory, these look like the only three that have changed recently. What am I missing?

Yutaka-Sawada commented 9 months ago

> What am I missing?

I don't know about your case. The error "This app can't run on your PC" comes from Windows 11. On Windows 10, too, Microsoft shows an alert when downloading a newly created EXE file. (It shows the error even though I made the file myself.) Although you only need to update "par2j64.exe" to fix this failure, you may download all the files as a ZIP archive: on the main page, push the "Code" button and select "Download ZIP".

Slava46 commented 9 months ago

> I downloaded three files from the alpha directory: MultiPar.exe, par2j.exe, and par2j64.exe, and overwrote the files of the same name in the install directory with these three. Now when I attempt to run the program, I get a message from Windows 11 saying "This app can't run on your PC".

Just allow those .exe files in Windows Defender. Windows doesn't like unknown .exe files and always blocks them.

AreteOne commented 9 months ago

It's not a Windows Defender issue; it's not blocking running the program.

I restored the three files from the previous install, and the program ran as it did previously. Then, I only copied par2j64.exe over the current version and tried again. MultiPar opened and allowed me to select the files and adjust the parameters, but when I clicked on the "Create" button, I received the Windows error message "Invalid checksum - parj64 in incomplete".

Screenshot 2024-02-10 040540

AreteOne commented 9 months ago

So - I'm currently testing a new creation of PAR2s on one of my data sets, so I got it to run.

My original download was corrupted somehow. I had downloaded the files from the Alpha directory by right-clicking on the file name and choosing the "Save link as" option, which opened the file dialog box and offered the "par2j64.exe" name, so I assumed it was a direct link to the file.

I went back and clicked on the specific file, which went to a web page just for this file, and then used the "download raw file" button on the right. While the same name was offered for the download, what was downloaded now was a bit smaller in size, and the creation process started right up as expected.

So, apparently, how files are downloaded makes a difference! I'll report back once testing has finished. Thanks!

AreteOne commented 9 months ago

Testing has finished, and while it no longer returns an error, it's also now doing something that's not wanted.

The desired result is to limit the size of any given PAR2 file so that losing the ability to read a PAR2 file minimizes how many recovery blocks are lost. There is no desire to change the size of any of the source files.

What's happening now is (a) the maximum size of any given PAR2 is limited to the limit set. That's desired. And (b), each source file is now being duplicated by breaking it up into multiple parts, each of which is also limited to the limit set. That's doubling the size of my source data set. This is definitely not desired.

What I was going to suggest, had this testing worked as I anticipated, is for the wording to be changed to something along the lines of "Limit maximum PAR2 file size to " and then provide the ability to set the maximum size.

The problem is that without setting a maximum PAR2 file size, I'm getting individual PAR2 files that are as large as 9GB. If I lose a 9GB PAR2, that's a lot of blocks that wouldn't be available to rebuild a damaged source file.
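To put rough numbers on this concern, here is a small back-of-the-envelope sketch. It assumes an illustrative 100GB data set at 10% redundancy (so about 10GB of recovery data), decimal units, and that recovery data is lost in proportion to the size of the lost file; it ignores PAR2 packet overhead and MultiPar's actual sizing scheme. The function names are hypothetical.

```python
import math

MB = 10**6
GB = 10**9


def files_needed(recovery_bytes: int, limit_bytes: int) -> int:
    """Minimum number of PAR2 files if none may exceed the size limit."""
    return math.ceil(recovery_bytes / limit_bytes)


def fraction_lost(lost_file_bytes: int, recovery_bytes: int) -> float:
    """Share of all recovery data that disappears with one lost file."""
    return lost_file_bytes / recovery_bytes


recovery = 10 * GB  # assumed: 10% of a 100GB data set

print(files_needed(recovery, 100 * MB))   # 100
print(fraction_lost(100 * MB, recovery))  # 0.01
print(fraction_lost(9 * GB, recovery))    # 0.9
```

Under these assumptions, losing one 100MB PAR2 file costs about 1% of the recovery data, while losing a single 9GB file costs about 90% of it.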

And, I really don't understand the function of having the program that's creating parity files of a source also split up the source files. After all, wouldn't it be the case that the size of source files, as presented to the PAR program, be assumed to be what's desired?

Perhaps the solution here, if you desire to retain the feature that splits up the source files during the creation of the PAR2 files, is to add the ability to limit the maximum size of any given PAR2 file under "Options", perhaps in the "Client behavior" tab under "Creation options".

Ironically, I got the desired results before the change to par2j64, because it didn't split the source files, but it did limit the size of any given PAR2 file, and even though it reported an error, testing showed the PAR2s were fine and could rebuild a missing file. I got a lot of PAR2 files, all of which were no larger than the maximum size set, and this is what I'm looking for because if I lose one or two of them, there are still plenty of blocks available to rebuild damaged source files.

So - is this possible? To have an option that doesn't touch the source files, but does limit the maximum size of any given PAR2 file?

Thanks.

AreteOne commented 9 months ago

I did some additional testing, and thought I'd share my experiences.

I noticed that when I launched a verification of the PAR2 files against the source data set by double-clicking the ".PAR2" file, MultiPar opens and only shows the full-sized source files; none of the split-up segments are shown or are part of the verification process.

I deleted all of the source segments, leaving only the original source files. Again, the verification process completed successfully.

Then, I deleted the last source file. The verification process showed that it was missing. I have automatic repair enabled, so that started by running the entire verification process again (that seems redundant - it just finished, why run it again?) and upon completion of the second verification, again found that one of the source files was missing and then rebuilt it, completing with the progress bar showing "All Files Complete" in green, as expected.

So - if the split-up source files aren't used by the verification process or needed for rebuilding, what is the purpose of creating them and writing them to the hard drive?

Yutaka-Sawada commented 9 months ago

> what is the purpose of creating them and writing them to the hard drive?

In the era of QuickPar, people needed to split large files to store them on small floppy disks. Because MultiPar inherits its usage from QuickPar, the "Split Files" check-box and the "Limit Size to" item split source files. However, the limit value applies to recovery files, too.

> So - is this possible? To have an option that doesn't touch the source files, but does limit the maximum size of any given PAR2 file?

Now, you don't want to split source files, as they were already split by 7-Zip. There is a trick you can use. The "Limit Size to" value is applied only when MultiPar calculates the sizes of possible PAR2 files. The calculated sizes won't change when you un-check "Split Files" afterwards. Thus, you can un-check "Split Files" just before pushing the "Create" button. You can see the difference by pushing the "Preview" button.

1) Check "Split Files" first.
2) Change the redundancy to re-calculate the sizes of the PAR2 files.
3) Un-check "Split Files". The PAR2 file sizes are still limited.
4) Create the PAR2 files.

I also describe another method, which you can look up in "Command_GUI.txt". You may enable "Variable (limited to size of largest data file)" as the sizing scheme. If you want to limit the size of recovery files automatically, write the line "RecoveryFileLimit=1" under the "[Option]" section. With this option, the "size of largest source file" limit adapts to the "Limit Size to" value automatically. This may be useful if you don't want to change the "Split Files" item every time.

Yutaka-Sawada commented 9 months ago

I modified the RecoveryFileLimit item in the "MultiPar.ini" file. Now, "RecoveryFileLimit=2" always applies the "Limit Size to" value to limit the size of recovery files. With this setting, you won't need to check and un-check the "Split Files" check-box every time. The sample is available in the alpha directory of the MultiPar GitHub page.
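Going by the two comments above, the relevant part of "MultiPar.ini" would look roughly like the fragment below. Only the section name and the key come from this thread; any other lines already present in the file are assumed to stay unchanged.

```ini
[Option]
RecoveryFileLimit=2
```

As described above, the value 2 requires the newer sample from the alpha directory, while the value 1 is the earlier behavior tied to the "Variable (limited to size of largest data file)" sizing scheme.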

AreteOne commented 9 months ago

I've modified the .ini file, and this accomplishes the desired result. Thanks!

Two other items remain.

(1) When I run a PAR2 verification on a data set where I've intentionally deleted one of the files, it correctly finds there's a missing file, and because I've got "Repair automatically" selected, it then proceeds to do the needed repair. But before it begins the repair, it runs a complete second verification process, duplicating the work it just did, and then does the file repair. Is this the designed behavior?

(2) When I first installed MultiPar on this PC and used it to create parity files, the next time I ran the program, clicking on Add Files would open the File Manager window to the folder last used. This is very helpful, as my data sets are organized together. Once I generated the error that started this thread, the program no longer remembered the last folder used, but instead now always opens to the MultiPar installation folder. Is there a way to get the program to remember the last folder used, and to open the File Manager dialog in that folder?

Thanks.

Yutaka-Sawada commented 9 months ago

> Is this the designed behavior?

Yes, it needs to verify files before repair. To reduce verification time, do not select "Not used" for the "Re-use verification result" option.

> Is there a way to get the program to remember the last folder used, and to open the File Manager dialog in that folder?

No, there isn't. Pushing the "Add Files" button opens the File Manager dialog. The initial directory is the "Base directory" or the current directory. If you always open the same folder, you can change the "working directory" in the MultiPar shortcut. Or you can use the "SendTo" right-click menu to select a source file. After a file is selected, that file's directory is set as the "Base directory" automatically.

AreteOne commented 9 months ago

Got it. Thanks for your help.