kalkwarf closed this issue 1 year ago
Hi,
I am a bit confused about what kind of issue you are describing here. SWCompression APIs are not aware of such concepts as relative or absolute paths. They just faithfully return the contents of the supplied archive in a convenient form (this includes TarEntryInfo.linkName), and it's up to the user to interpret these values as either kind of path.
I suppose you're talking about the usage of these APIs in swcomp, specifically, this line here. If that's the case, then I should emphasize once again that the swcomp command-line tool is not intended for general use. It serves only two purposes: demonstration of how SWCompression APIs can be used (and not how they should be used), and as an "internal" testing facility. To illustrate this, note that "hard" links are completely ignored in swcomp and rejected as an "unknown" entry type.
On the subject of the actual issue, I do not know which behavior (the current one with absolute paths or the proposed one with relative paths) is more correct. I've skimmed through the various TAR-related material listed in the README to refresh my knowledge on the subject matter and, to my understanding, the absolute/relative path issue is not specified.
I am interested in any other references that may explicitly specify the proper way of handling links, and no, the Mac's Archive Utility's current behavior does not count, as it may change at any point and I do not fully trust them with the correctness of their implementation.
Sorry, yes, I was referring to swcomp, as I was using it as a reference while working on my own project.
The archive that started me down this path is at: https://github.com/Homebrew/brew/tarball/master
Looking at the GitHub repo, I can see that the original symlinks were relative: https://github.com/Homebrew/brew/blob/master/Library/Homebrew/test/support/fixtures/bottles/testball_bottle-0.1.x86_64_linux.bottle.tar.gz
but upon extraction with swcomp, they are written as absolute links.
Dumping the TAR's table of contents, I can see the destination was recorded as relative:
lrwxrwxrwx 0 root root 0 Sep 20 12:36 Homebrew-brew-a6aab4b/Library/Homebrew/test/support/fixtures/bottles/testball_bottle-0.1.aarch64_linux.bottle.tar.gz -> testball_bottle-0.1.yosemite.bottle.tar.gz
While I can't find any documentation that discusses this, it seems like the link target is recorded as relative in the archive itself. 🤷
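One way to check what an archive actually records is to read the link name straight from the entry headers. The following is a minimal sketch using Python's standard tarfile module rather than SWCompression; the entry names and the helpers build_archive and symlink_targets are made up for illustration.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Build a tiny in-memory TAR with one file and one relative symlink (hypothetical names)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"payload"
        target = tarfile.TarInfo("bottles/target.tar.gz")
        target.size = len(data)
        tar.addfile(target, io.BytesIO(data))

        link = tarfile.TarInfo("bottles/alias.tar.gz")
        link.type = tarfile.SYMTYPE
        # Stored exactly as given; here it is relative to the link's own directory.
        link.linkname = "target.tar.gz"
        tar.addfile(link)
    return buf.getvalue()


def symlink_targets(archive: bytes) -> dict:
    """Map each symlink entry name to the link name stored in its header."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return {m.name: m.linkname for m in tar.getmembers() if m.issym()}


targets = symlink_targets(build_archive())
```

The link name comes back unchanged, so whether it is relative or absolute is decided by whoever wrote the archive, and the reader only has to avoid rewriting it.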
After thinking a bit more about this and some experimentation, I am inclined to agree that the current behavior is not ideal. It has been fixed in f191db24948393143f5d62b860475aa708bb02e2.
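For anyone following along, here is a rough sketch of the general idea of writing the recorded link name verbatim during extraction. It is not the code from the commit above; it is written in Python for brevity, and write_symlink, demo, and the paths are hypothetical.

```python
import os
import shutil
import tempfile


def write_symlink(extract_root: str, entry_name: str, link_name: str) -> str:
    """Create the symlink for a TAR entry, passing the stored link name through unchanged."""
    link_path = os.path.join(extract_root, entry_name)
    os.makedirs(os.path.dirname(link_path), exist_ok=True)
    # No joining with extract_root: a relative link name stays relative on disk.
    os.symlink(link_name, link_path)
    return link_path


def demo() -> str:
    """Extract a file and a relative link, relocate the directory, then read through the link."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "symlink-test")
        os.makedirs(root)
        with open(os.path.join(root, "file"), "w") as f:
            f.write("payload")
        write_symlink(root, "link", "file")

        moved = os.path.join(tmp, "moved")
        shutil.move(root, moved)  # relocating the directory keeps the link valid
        with open(os.path.join(moved, "link")) as f:
            return f.read()
```

Because the link name is never combined with the extraction directory, the extracted tree can be moved without breaking the link.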
While working on this I've uncovered a couple more issues with the current TAR implementation:
All of these were fixed in 4.8.3.
In addition, I have also discovered that the Apple-supplied TAR implementation on macOS actually reverses the direction of hard links. Basically, if you have a "link" hardlink that links to a "file" then in the resulting TAR archive it will be the other way around, "file" will be a hardlink to a "link"...
It is possible that I am wrong and missing something here, but generally speaking this is why I do not consider Apple's Archive Utility a reference implementation (GNU Tar behaves the expected way).
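To check which of two names an archive stores as the hard link, the entry headers can be inspected directly. This is a small sketch with Python's standard tarfile module on a hand-built archive; the entry names "file" and "link" mirror the description above, and the helper names are hypothetical.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Build an in-memory TAR where "link" is a hard link entry pointing at "file"."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"payload"
        regular = tarfile.TarInfo("file")
        regular.size = len(data)
        tar.addfile(regular, io.BytesIO(data))

        hard = tarfile.TarInfo("link")
        hard.type = tarfile.LNKTYPE  # hard link entry: no data, only a link name
        hard.linkname = "file"
        tar.addfile(hard)
    return buf.getvalue()


def hard_link_directions(archive: bytes) -> list:
    """Return (entry name, link name) pairs for every hard link entry."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return [(m.name, m.linkname) for m in tar.getmembers() if m.islnk()]


directions = hard_link_directions(build_archive())
```

Running the same helper on archives produced by different tools would show which name each one records as the link and which as the regular file.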
P.S. The last two issues I decided to report to Apple as FB11712450 and FB11712441.
Extracting a tarball containing symlinks results in symlinks with absolute paths. For example:
This makes the directory non-portable, as relocating symlink-test will break the target path.
The solution for this is to see if the source and destination share a prefix, and if so, rewrite the destination to be relative. This will give a relative link, like so:
This also matches the results when using the Mac's Archive Utility to extract the files.
I have a fix coded up, but need to write some tests before I can open a PR.
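The prefix-based rewrite described above could look roughly like the sketch below. It is in Python rather than the fix mentioned here, it assumes both paths are absolute POSIX paths, and relativize_link and the example paths are hypothetical.

```python
import posixpath


def relativize_link(link_path: str, target_path: str) -> str:
    """Rewrite target_path relative to the directory that contains link_path.

    Assumes both arguments are absolute POSIX paths. If the only shared
    prefix is the filesystem root, the target is returned unchanged.
    """
    link_dir = posixpath.dirname(link_path)
    shared = posixpath.commonpath([link_dir, target_path])
    if shared == "/":
        return target_path  # nothing meaningful in common; keep the absolute path
    return posixpath.relpath(target_path, start=link_dir)


same_dir = relativize_link("/tmp/symlink-test/link", "/tmp/symlink-test/file")
parent_dir = relativize_link("/tmp/symlink-test/sub/link", "/tmp/symlink-test/file")
unrelated = relativize_link("/usr/lib/link", "/etc/hosts")
```

Treating a shared root as "no common prefix" is a design choice in this sketch; a stricter variant might only relativize targets that sit under the extraction directory.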