tsolomko / SWCompression

A Swift framework for working with compression, archives and containers.
MIT License

Allow specifying format when creating TAR #24

Closed: LebJe closed this issue 3 years ago

LebJe commented 3 years ago

Hi @tsolomko,

I would like to know if it is possible to add GNU tar support to SWCompression.

I would like this feature because I am trying to build a deb package that will be installed by apt. apt is able to read the first tar file, but the second tar causes apt to crash with this error:

...
corrupted filesystem tarfile in package archive: unsupported PAX tar header type 'x'
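
To see which entries would trigger this error, one option is to list the type flag of every header in the uncompressed tar. The sketch below is only an illustration and is not part of SWCompression: the function name `tarTypeFlags` is hypothetical, and it assumes the standard 512-byte header layout (size field at byte 124, type flag at byte 156) with plain octal sizes. A flag of `x` or `g` indicates a PAX extended header.

```swift
import Foundation

/// Returns the type flag of every header block in an uncompressed TAR archive.
/// Hypothetical helper for illustration; assumes octal size fields only
/// (the base-256 size extension is not handled).
func tarTypeFlags(in archive: [UInt8]) -> [Character] {
    let blockSize = 512
    var flags: [Character] = []
    var offset = 0
    while offset + blockSize <= archive.count {
        let header = archive[offset..<(offset + blockSize)]
        // An all-zero block is part of the end-of-archive marker.
        if header.allSatisfy({ $0 == 0 }) { break }
        // The type flag is a single byte at position 156 of the header.
        flags.append(Character(UnicodeScalar(header[offset + 156])))
        // The size field is 12 bytes at position 124, stored as octal ASCII digits.
        let sizeField = header[(offset + 124)..<(offset + 136)]
        let size = sizeField
            .filter { $0 >= 0x30 && $0 <= 0x37 }
            .reduce(0) { $0 * 8 + Int($1 - 0x30) }
        // Skip the header plus the entry data, rounded up to whole blocks.
        let dataBlocks = (size + blockSize - 1) / blockSize
        offset += blockSize + dataBlocks * blockSize
    }
    return flags
}
```

For a real package, the bytes would come from the decompressed data archive inside the deb, for example via `[UInt8](try Data(contentsOf: url))`.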

termux-create-package fixed the "unsupported PAX tar header type" issue by using the GNU tar format.

NFPM also uses the GNU format for building debs.

Jeff

tsolomko commented 3 years ago

Hi @LebJe,

I have some reservations about this idea that I would like to put on record here. Namely, I believe that the PAX format is superior to the GNU tar format, at least for the following reasons:

  1. In general, the TAR format is very ASCII-biased. This means that all fields, such as file names, must be encoded using ASCII, whereas PAX headers allow UTF-8 encoding. In practice, this is not a big problem for SWCompression, since I ignore this requirement and encode everything using UTF-8 even in "normal" TAR fields, because I believe that forcing ASCII onto users in the 21st century is unacceptable (and also UTF-8 is backwards compatible with ASCII).

  2. By default, numeric fields are quite limited in terms of their maximum values. For example, using TAR without PAX headers you can only store files of up to 2^(3*12)-1 bytes, which is approximately 64 GB. There is an extension for the basic TAR format which allows greater file sizes (up to approx. 7*10^19 GB), but it is a very non-standard extension (SWCompression supports it, though).
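
To make the arithmetic in point 2 concrete, here is a minimal sketch. The function name `maxOctalFieldValue` is hypothetical and not part of SWCompression. Each octal digit carries 3 bits, so a field of n digits holds at most 2^(3n)-1. The 12-digit case matches the figure above; implementations that reserve the last byte of the 12-byte size field for a terminator get 11 digits and a correspondingly lower limit.

```swift
/// Largest value representable with `digits` octal digits: 8^digits - 1 = 2^(3 * digits) - 1.
/// Hypothetical helper for illustration only.
func maxOctalFieldValue(digits: Int) -> UInt64 {
    // 21 digits is the most that fits into 63 bits of a UInt64.
    precondition(digits > 0 && digits <= 21, "digit count out of range")
    return (UInt64(1) << UInt64(3 * digits)) - 1
}

// All 12 bytes of the size field used as digits: 2^36 - 1 bytes.
let twelveDigitLimit = maxOctalFieldValue(digits: 12)  // 68_719_476_735

// One byte reserved for a terminator, leaving 11 digits: 2^33 - 1 bytes.
let elevenDigitLimit = maxOctalFieldValue(digits: 11)  // 8_589_934_591

print(twelveDigitLimit, elevenDigitLimit)
```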

In light of this, in my humble opinion, it would be better if dpkg/apt/termux/nfpm/whoever supported PAX headers instead, as they are the most extensible and future-proof format option available.

That said, I think that there is some value in allowing users to choose which format extensions to use when creating a new TAR archive, so I will try to do something about it. Please note, though, that this is a very non-trivial feature, so I can't provide any timeframe for its implementation.

P.S. I also think that the title of the issue is a bit misleading (SWCompression in fact supports reading the GNU tar format), so I took the liberty of changing it.

LebJe commented 3 years ago

> That said, I think that there is some value in allowing users to choose which format extensions to use when creating a new TAR archive, so I will try to do something about it. Please note, though, that this is a very non-trivial feature, so I can't provide any timeframe for its implementation.

Thank you for deciding to implement this feature. Is there anything I can do to help?

> In light of this, in my humble opinion, it would be better if dpkg/apt/termux/nfpm/whoever supported PAX headers instead, as they are the most extensible and future-proof format option available.

Unfortunately, it seems that dpkg does not support the PAX extended header, which forces anyone who wants to build a deb package to use the GNU format.

tsolomko commented 3 years ago

I don't know how you can help.

I think the problem is that, before adding new features, the TAR code in SWCompression needs some rewriting or restructuring, because it is hard to understand even for me, and I wrote it in the first place. I already know what I would like to do, and it seems relatively straightforward (both the rewrite and this feature), so I only need to find some time to sit down and implement it.

tsolomko commented 3 years ago

For your information, I've just released a pre-release version (4.6.0-test) of a new update where, among other things, I've added a new function, TarContainer.create(from:force:). This function, if you use .gnu as the second argument, should help you do what you want.
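
A minimal sketch of how the new function might be used, assuming SWCompression 4.6.0 or later is available as a dependency. The entry name and contents are made up for illustration, and the exact signatures should be checked against the installed version of the library.

```swift
import Foundation
import SWCompression

// Hypothetical entry, used only for illustration.
let payload = Data("Hello from a GNU-format tar\n".utf8)
var info = TarEntryInfo(name: "./usr/share/doc/example/README", type: .regular)
info.permissions = Permissions(rawValue: 0o644)
info.modificationTime = Date()
let entry = TarEntry(info: info, data: payload)

// Force GNU-style extensions instead of the default PAX headers.
let archive = TarContainer.create(from: [entry], force: .gnu)

// Read the archive back as a sanity check.
do {
    let entries = try TarContainer.open(container: archive)
    print(entries.map { $0.info.name })
    print(try TarContainer.formatOf(container: archive))
} catch {
    print("Unable to read the archive back: \(error)")
}
```

The resulting `archive` is uncompressed TAR data, so any compression expected by the package format still has to be applied separately.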

Feel free to try this version out, and let me know if you encounter any issues, or have any other feedback. Regardless, the final version is planned to be released some time next week.

tsolomko commented 3 years ago

4.6.0 has been released.

LebJe commented 3 years ago

@tsolomko Thank you very much. I have tested the changes and they work perfectly!