Closed: Werve closed this issue 3 months ago
Current PAKKA needs a lot of work 😄
Also, I noticed that it uses the built-in version of zpaqfranz. Wouldn't it be better to be able to select a path to a chosen version (or one in the same folder), to avoid cases where the development of the two projects is not in sync?
In fact, no, because the newer zpaqfranz.exe (64 bit) has an autoupdate. Sooner or later PAKKA will also autoupdate zpaqfranz, if needed.
I had virtually zero feedback on PAKKA for 10 years; now it seems some users are there. I will see if I can improve it a little. Thanks for the reports.
I was trying PAKKA for convenience, to store backups of all the files on a disk, but due to the problems described I ended up preferring to proceed via the console.
I know that backups are not the primary intended use, but I have not found any other program that can store deduplicated, versioned backups as if they were differential, and that lets you quickly remove a previous version to regain space (from what I understand you can use the -repack command, which copies already-compressed blocks, so for example you can effectively remove a single version quickly).
For now, however, I have noticed one problem for this use and I have one concern. The problem is the lack of symbolic link archiving; the concern is whether archiving via VSS picks up all files (I know that by default VSS excludes Outlook files: https://learn.microsoft.com/en-us/windows/win32/vss/excluding-files-from-shadow-copies?redirectedfrom=MSDN)
Edit: Also, I noticed that the -windate option doesn't store creation dates, which could be another issue in that use case.
I was trying PAKKA for convenience, to store backups of all the files on a disk, but due to the problems described I ended up preferring to proceed via the console.
PAKKA is, in fact, an EXTRACTOR. It was born to make restoring mysql dumps easier; I do not work very much on archive creation. The idea is to have a GUI for versioned backups, something that other programs normally don't offer. With PAKKA you can add data to an archive as you go with a minimal number of clicks (at least, if and when I finish it).
I know that backups are not the primary intended use, but I have not found any other program that can store deduplicated, versioned backups as if they were differential, and that lets you quickly remove a previous version to regain space (from what I understand you can use the -repack command, which copies already-compressed blocks, so for example you can effectively remove a single version quickly).
In the latest versions it is possible to delete added versions; clearly this causes all data added after the "cut" point to be lost. It is not possible to quickly reclaim space in a deduplicated archive (without heavy processing). You can remove a single version, or many, but only the LAST ones.
For now, however, I have noticed one problem for this use and I have one concern. The problem is the lack of symbolic link archiving; the concern is whether archiving via VSS picks up all files (I know that by default VSS excludes Outlook files: https://learn.microsoft.com/en-us/windows/win32/vss/excluding-files-from-shadow-copies?redirectedfrom=MSDN)
Symlinks are not supported at all. I started to implement a tar-like mechanism, but Windows is so messy and undocumented that I let it go.
Edit: Also, I noticed that the -windate option doesn't store creation dates, which could be another issue in that use case.
Actually, it should. However, it is an option I personally never use, so I do not check its functionality. If you have any repeatable examples, please post them and I will correct any errors.
I tried the last 2 versions of the PAKKA GUI from the site (by clicking "Browse PAKKA builds" in settings) and tried to create an archive with, for example, a single file and the following options: force longpath, store windate, store file hash, default method 1, VSS, Use ADS, force zfs, force Windows, and with the checkmark to make a backup removed. An error is shown indicating that only one file hash method should be chosen. Removing some of the options listed above, the file is created, but in backup mode (a choice I had disabled in the interface).
Can you please explain better what you want to do? Thanks
In the latest versions it is possible to delete added versions; clearly this causes all data added after the "cut" point to be lost. It is not possible to quickly reclaim space in a deduplicated archive (without heavy processing). You can remove a single version, or many, but only the LAST ones.
So, assuming an input.zpaq archive with 3 versions from which I want to remove version 2, can I not proceed by doing:
zpaqfranz x input.zpaq -repack output.zpaq -until 1
zpaqfranz x input.zpaq -repack output.zpaq -until 3
thereby copying only the blocks referenced by versions 1 and 3 into "output.zpaq"?
Actually, it should. However, it is an option I personally never use, so I do not check its functionality. If you have any repeatable examples, please post them and I will correct any errors.
You can try to archive the following folder (after extracting the zip): Creation Date 1-1-2016.zip
with the following command:
zpaqfranz -method 1 -verbose -utf -windate -force -forcezfs -forcewindows -filelist -longpath a test.zpaq ".\Creation Date 1-1-2016"
Both the file and the folder should have: creation date 1/1/2016 1:1:1, modified date 1/1/2020 1:2:3
But when extracting the archive (e.g. with PAKKA), only the modified dates are preserved.
In the latest versions it is possible to delete added versions; clearly this causes all data added after the "cut" point to be lost. It is not possible to quickly reclaim space in a deduplicated archive (without heavy processing). You can remove a single version, or many, but only the LAST ones.
So, assuming an input.zpaq archive with 3 versions from which I want to remove version 2, can I not (...)
If you want to QUICKLY (aka almost in no time) drop versions, you can (from 59.4h) use the new crop command:
C:\zpaqfranz>zpaqfranz crop vers.zpaq
zpaqfranz v59.4h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-08)
franz:-hw
vers.zpaq:
4 versions, 72 files, 2.343.688 bytes (2.23 MB)
---------------------------------------------------------------------------
< Ver > < date > < time > < version size > < Offset (w/out encr)>
V00000001 2024-05-05 19:21:52 [ 2.551] @ 2.551
V00000002 2024-05-05 19:22:05 [ 2.604] @ 5.155
V00000003 2024-05-05 19:22:17 [ 2.329.993] @ 2.335.148
V00000004 2024-05-05 19:22:30 [ 8.540] @ 2.343.688
no -kill, this is just a dry run
You can crop the archive, for example, at version 2. This will discard versions 3 and 4, and the file will become (in the example) 5.155 bytes long.
You CANNOT delete version 3 while keeping 1, 2 and 4.
The crop command is used to delete versions added by mistake (sometimes it happens) or to delete old copies that are useless. Translation, a typical scenario: the first version of a fileserver is 100 GB. Later versions are each 1 GB, and let's say there are 50 of them. The archive is now 100 + 50 × 1 = 150 GB. For some reason I have no interest in keeping the last 20 versions. I crop/drop them, and the archive becomes 100 + 30 × 1 = 130 GB. Then I launch an update, and it goes back to being aligned with the current data.
Maybe this will become "drop" instead of "crop"; more accurate.
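For example, to actually cut the archive above at version 2, run the dry run first and then, if the output looks right, add -kill (I am writing the switches from memory, so double-check against the dry-run output and the built-in help):
:: dry run first: shows what would be dropped, nothing is written
zpaqfranz crop vers.zpaq -until 2
:: then, if the output looks right, -kill really truncates just after version 2 (versions 3 and 4 are gone)
zpaqfranz crop vers.zpaq -until 2 -kill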
On -windate:
zpaqfranz l z:\thearchive.zpaq -windate
will show creation dates (if any)
Checking for bugs in progress...
I tried the last 2 versions of the PAKKA GUI from the site (by clicking "Browse PAKKA builds" in settings) and tried to create an archive with, for example, a single file and the following options: force longpath, store windate, store file hash, default method 1, VSS, Use ADS, force zfs, force Windows, and with the checkmark to make a backup removed. An error is shown indicating that only one file hash method should be chosen. Removing some of the options listed above, the file is created, but in backup mode (a choice I had disabled in the interface).
Can you please explain better what you want to do? Thanks
I was pointing out that PAKKA does not seem to properly apply the options chosen in the GUI for compression: in many cases, despite not having selected multipart backup mode, or having just removed its checkmark, I have noticed that it still creates the archive that way.
I then proceeded via the command line. I would like to archive all the files contained in a drive, say Z:\ of about 1 TB, as if it were a full backup, and later redo the same operation on the same file.zpaq as if the runs were differential backups (thanks to deduplication). Sooner or later, though, too much space would be consumed, so I would like to remove versions to free up space.
Basically this is what you do with a backup system, but with deduplication; unfortunately I have not found straightforward programs for this kind of work on Windows.
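For reference, the kind of loop I have in mind is roughly this (paths are just placeholders):
:: first run creates version 1, the "full" backup of the whole drive
zpaqfranz a d:\backups\drive_z.zpaq z:\ -longpath
:: running exactly the same command later appends a new, deduplicated version
zpaqfranz a d:\backups\drive_z.zpaq z:\ -longpath
:: list the versions accumulated so far
zpaqfranz l d:\backups\drive_z.zpaq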
zpaqfranz a test.zpaq ".\Creation Date 1-1-2016" -windate -force -longpath
Then
Z:\>zpaqfranz l test.zpaq -windate
zpaqfranz v59.4h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-08)
franz:-windate -hw
test.zpaq:
1 versions, 1 files, 1.258 bytes (1.23 KB)
- 2020-01-01 01:02:03 (C) 2016-01-01 01:01:01 94 A Z:/Creation Date 1-1-2016/Test datestamps.txt
94 (94.00 B) of 94 (94.00 B) in 1 files shown
1.258 compressed Ratio 13.383 <<test.zpaq>>
0.016 seconds (000:00:00) (all OK)
And then
Z:\>zpaqfranz x test.zpaq -windate -longpath -to z:\restored
zpaqfranz v59.4h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-08)
franz:-to <<z:/restored>>
franz:-windate -hw -longpath
INFO: setting Windows' long filenames
test.zpaq:
1 versions, 1 files, 1.258 bytes (1.23 KB)
Extract 94 bytes (94.00 B) in 1 files (0 folders) / 32 T
Files to be worked 1 => founded 1 => OK 1
0.032 seconds (000:00:00) (all OK)
If I understand correctly, you want FOLDER creation dates too, not only FILE creation dates. Is that right?
(...) Just a bug (one of many, or better, "an option")
I then proceeded via the command line. I would like to archive all the files contained in a drive, say Z:\ of about 1 TB, as if it were a full backup, and later redo the same operation on the same file.zpaq as if the runs were differential backups (thanks to deduplication). Sooner or later, though, too much space would be consumed, so I would like to remove versions to free up space.
Basically this is what you do with a backup system, but with deduplication; unfortunately I have not found straightforward programs for this kind of work on Windows.
Simply, you can't. The data will stay "forever" inside the archive. One method is to use the -freeze switch, that is, archive away backups that have become too large and start over again.
On -windate:
zpaqfranz l z:\thearchive.zpaq -windate
will show creation dates (if any)
Checking for bugs in progress...
In the example described, that command does report the creation dates, but when extracting, even adding -windate, the created files have only the correct modification date.
zpaqfranz a test.zpaq ".\Creation Date 1-1-2016" -windate -force -longpath
Then
Z:\>zpaqfranz l test.zpaq -windate
zpaqfranz v59.4h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-08)
franz:-windate -hw
test.zpaq:
1 versions, 1 files, 1.258 bytes (1.23 KB)
- 2020-01-01 01:02:03 (C) 2016-01-01 01:01:01 94 A Z:/Creation Date 1-1-2016/Test datestamps.txt
94 (94.00 B) of 94 (94.00 B) in 1 files shown
1.258 compressed Ratio 13.383 <<test.zpaq>>
0.016 seconds (000:00:00) (all OK)
And then
Z:\>zpaqfranz x test.zpaq -windate -longpath -to z:\restored
zpaqfranz v59.4h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-08)
franz:-to <<z:/restored>>
franz:-windate -hw -longpath
INFO: setting Windows' long filenames
test.zpaq:
1 versions, 1 files, 1.258 bytes (1.23 KB)
Extract 94 bytes (94.00 B) in 1 files (0 folders) / 32 T
Files to be worked 1 => founded 1 => OK 1
0.032 seconds (000:00:00) (all OK)
If I understand correctly, you want FOLDER creation dates too, not only FILE creation dates. Is that right?
Yes, I would like to see that preserved as well. But the creation date of the extracted file also doesn't match the initial date. Edit: No, I checked again and the extracted file seems to have the correct dates. Probably neither PAKKA nor my first attempt added -windate to the extraction as well.
This seems OK to me
Folder timestamps require a bit of heuristics.
This seems OK to me
Folder timestamps require a bit of heuristics.
Yes, sorry, I edited it as soon as I noticed.
Probably neither PAKKA nor my first extraction attempt added -windate to the extraction as well.
(...) Just a bug (one of many, or better, "an option")
I then proceeded via the command line. I would like to archive all the files contained in a drive, say Z:\ of about 1 TB, as if it were a full backup, and later redo the same operation on the same file.zpaq as if the runs were differential backups (thanks to deduplication). Sooner or later, though, too much space would be consumed, so I would like to remove versions to free up space. Basically this is what you do with a backup system, but with deduplication; unfortunately I have not found straightforward programs for this kind of work on Windows.
Simply, you can't. The data will stay "forever" inside the archive. One method is to use the -freeze switch, that is, archive away backups that have become too large and start over again.
Would it be possible to use the new drop/crop command to remove early versions instead of late versions? In the described use case of backup chains, this might be enough.
Would it be possible to use the new drop/crop command to remove early versions instead of late versions? In the described use case of backup chains, this might be enough.
No, it is not. You would have to repack. But in reality it is never done; I, at least, don't do it. When an archive gets too big, I move it to external media (like a USB HD) and at the next run it will be recreated automatically.
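Roughly like this (paths and names are only an example):
:: when the archive has grown too much, park it on external media...
move d:\backup\fileserver.zpaq f:\usb_archive\fileserver_old.zpaq
:: ...and the next scheduled run simply recreates a fresh archive from scratch
zpaqfranz a d:\backup\fileserver.zpaq z:\ -longpath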
I checked the issue of the folder creation date. It is simply not information stored in the standard zpaq format. zpaqfranz uses an extended data block, in which it stores the hashes of the files and, with -windate, also the date. But folders carry no hash information, and therefore no creation date either. I could actually handle this date as well, moving it out of the hash block. It is a doable job, but it is not trivial. For now, I'd say let's just gloss over it, assuming there are no other requests from other users.
OK, please try the attached pre-release with -windate. This will restore creation dates on folders too,
IF the folders exist in the archive.
Translation: this will include the folder
zpaqfranz a test "Creation Date 1-1-2016"
This does not
zpaqfranz a test "Creation Date 1-1-2016\*"
In the first case TWO objects will be stored
2020-01-01 01:02:03 (C) 2016-01-01 01:01:01          0 D Creation Date 1-1-2016/
2020-01-01 01:02:03 (C) 2016-01-01 01:01:01         94 A Creation Date 1-1-2016/Test datestamps.txt
In the second, just the file
- 2020-01-01 01:02:03                                94 A Creation Date 1-1-2016/Test datestamps.txt
I will not write a heuristic to automagically solve this situation; it is too complex and lacks real utility. Short version: add folders, not files, to the archive if you want folders to be "touched".
OK, please try the attached pre-release with -windate. This will restore creation dates on folders too,
IF the folders exist in the archive.
Translation: this will include the folder
zpaqfranz a test "Creation Date 1-1-2016"
This does not
zpaqfranz a test "Creation Date 1-1-2016\*"
In the first case TWO objects will be stored
2020-01-01 01:02:03 (C) 2016-01-01 01:01:01          0 D Creation Date 1-1-2016/
2020-01-01 01:02:03 (C) 2016-01-01 01:01:01         94 A Creation Date 1-1-2016/Test datestamps.txt
In the second, just the file
- 2020-01-01 01:02:03                                94 A Creation Date 1-1-2016/Test datestamps.txt
I will not write a heuristic to automagically solve this situation; it is too complex and lacks real utility. Short version: add folders, not files, to the archive if you want folders to be "touched".
I confirm that it works, even for the folder's creation date, IF I don't add -longpath.
(...) Just a bug (one of many, or better, "an option")
I then proceeded via the command line. I would like to archive all the files contained in a drive, say Z:\ of about 1 TB, as if it were a full backup, and later redo the same operation on the same file.zpaq as if the runs were differential backups (thanks to deduplication). Sooner or later, though, too much space would be consumed, so I would like to remove versions to free up space. Basically this is what you do with a backup system, but with deduplication; unfortunately I have not found straightforward programs for this kind of work on Windows.
Simply, you can't. The data will stay "forever" inside the archive. One method is to use the -freeze switch, that is, archive away backups that have become too large and start over again.
In case anyone is looking for a solution for the use case I described: I ended up using the NTFS data deduplication feature (which is normally only present on Windows Server editions, but there are workarounds for other editions) inside a VHDX file container, so it can be mounted as well. This preserves symlinks, hardlinks, junction points, and datestamps, and keeps the flexibility to add and remove files like any folder.
Hopefully someday the new ReFS filesystem will be fully supported on client versions of Windows and you can just use that, given the recent additions of deduplication and post-process compression.
But of course the best option for cross-OS compatibility is an open-source archive format and tool like zpaqfranz :)
In this case I use truecrypt/veracrypt virtual disks, with zpaqfranz (without compression, aka -m0) for backup. In the case of servers I make a sector-level image (-image), but that requires the free sectors to be zeroed.
I considered using veracrypt, but unfortunately it does not have a storage scheme that lets you reclaim unused space if you later delete files inside the container. With a dynamically sized VHDX, instead, you can expand it, manually or automatically, up to the chosen size, and reclaim the space no longer actually used in the container via the Optimize-VHD function (better to defrag the free space first so that it is contiguous). To encrypt the VHDX transparently I can use BitLocker (which keeps the ease of mounting).
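Roughly the sequence I use, run from an elevated PowerShell (paths and drive letter are just examples, and NTFS deduplication must already be enabled on the volume, e.g. with Enable-DedupVolume where available):
# attach the dynamically expanding container; it shows up as a volume (say X:)
Mount-VHD -Path D:\containers\backup.vhdx
# ...copy/remove files on X:, then let the dedup engine optimize the volume
Start-DedupJob -Volume "X:" -Type Optimization -Wait
# consolidate free space so the container can actually shrink
defrag X: /X
# detach and compact the VHDX down to the space really in use
Dismount-VHD -Path D:\containers\backup.vhdx
Optimize-VHD -Path D:\containers\backup.vhdx -Mode Full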
This will f*up the deduplication, due to moved blocks. I have used veracrypt+zpaqfranz for years; it just works.
Are you referring to the Optimize-VHD command? I ran several days of testing with NTFS deduplication (the appropriate commands) + BitLocker + defrag, all inside a VHDX container, and so far it has worked: files remained readable and their hashes matched.
It all started after I read this article: https://www.deploymentresearch.com/beyond-zip-how-to-store-183-gb-of-vms-in-a-19-gb-file-using-powershell/
I have worked with .vmdk and zpaq for 10 years 😄 No need for such complexity: zpaq with -key, that's all.
On zfs systems it is even better
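For example (archive path and key are placeholders):
:: -key encrypts the archive; -m0 stores without compression, fine for data that is already compressed or encrypted
zpaqfranz a z:\backup\vms.zpaq c:\vms -m0 -key mylongpassword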
But, based on the previous conversation, you can't reclaim space from a zpaq archive by removing files from previous versions (except the latest ones), right? I ended up using this system because it is the only one I have found that also allows you to delete any file, effectively reducing the space used, while still having deduplication (and it is well integrated with Windows, unlike ZFS).
It is possible to purge a zpaq archive (with a bit of effort)... but... why? Keeping everything is zpaq's best feature for disaster recovery. Space occupancy is typically minimal, becoming significant only after months or hundreds of versions. Deleting data from a backup is THE no-no.
PS Windows' deduplication is crap. ZFS is just about as crap, but better than Windows. The main use is quite counterintuitive: it serves to minimize WRITING during operational checks of backups, i.e. extraction of the entire contents of an archive.
It is possible to purge a zpaq archive (with a bit of effort)... but... why? Keeping everything is zpaq's best feature for disaster recovery. Space occupancy is typically minimal, becoming significant only after months or hundreds of versions. Deleting data from a backup is THE no-no.
In my case, I wanted to reduce the size of some full HDD backups accumulated over years. Since over the years there are surely many identical files and identical portions of files, with deduplication I could save space. But I intend to continue with full backups for years to come, and I had already almost run out of space on the HDDs where I store them, so sooner or later I would have to remove the older backups. Of course you can always buy more space, but I did not want to go in that direction, since this is a personal rather than a work situation.
Deleting old data is not a practice I recommend; I have kept all my individual files since 1993. Deduplication doesn't make much sense as a space-saving methodology; it makes a lot more sense across different versions, i.e. snapshots. Incidentally, you can quickly estimate the amount of duplicate files (whole files, not parts of files) with the sum command and the (ad hoc) -quick switch.
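Something like this (the path is just a placeholder):
:: quick, approximate pass to estimate duplicate whole files, as said above
zpaqfranz sum z:\backups -quick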
I tried compression before, but a lot of time was wasted for almost no space saved, since almost all the files are incompressible (like video). Since it is very likely that across the various backup chains all those files are repeated several times, the biggest space savings I achieved were through deduplication. It also often happens, to cite the video example again, that there are many clips cut from a longer video. They are still incompressible (because of the video encoders' work) but fully contained in the longer video: another case where deduplication (of partial data) saves space.
Unfortunately, I found no other way. And even with deduplication it is very likely that, at this rate of backups, in a couple of years I will run out of space and will have to remove older backups.
Video cannot be deduplicated very well. You'll have to buy more hardware, or delete something.
Video cannot be deduplicated very well. You'll have to buy more hardware, or delete something.
Yes, that's what I'm saying. In my use case, unfortunately, I have no alternatives, and I don't plan to buy more space, so I will have to remove older files in the future. For this reason I have already worked to find a storage system that will let me do this even when I have, say, only 1 MB left free.