megastep / makeself

A self-extracting archiving tool for Unix systems, in 100% shell script.
https://makeself.io
GNU General Public License v2.0

makeself "--posix" argument in tar for increased compatibility actually broke compatibility in Alpine Linux (BusyBox) #240

Open ntx-ben opened 3 years ago

ntx-ben commented 3 years ago

Using makeself v2.4.3 in Alpine Linux 3.12:

Header is 682 lines long
About to compress 301096 KB of data...
Adding files to archive named "<REDACTED>"...

tar: unrecognized option: posix
BusyBox v1.31.1 () multi-call binary.
Usage: tar c|x|t [-ZzJjahmvokO] [-f TARFILE] [-C DIR] [-T FILE] [-X FILE] [--exclude PATTERN]... [FILE]...
Create, extract, or list files from a tar file
    c   Create
    x   Extract
    t   List
    -f FILE Name of TARFILE ('-' for stdin/out)
    -C DIR  Change to DIR before operation
    -v  Verbose
    -O  Extract to stdout
    -m  Don't restore mtime
    -o  Don't restore user:group
    -k  Don't replace existing files
    -Z  (De)compress using compress
    -z  (De)compress using gzip
    -J  (De)compress using xz
    -j  (De)compress using bzip2
    -a  (De)compress using lzma
    -h  Follow symlinks
    -T FILE File with names to include
    -X FILE File with glob patterns to exclude
    --exclude PATTERN   Glob pattern to exclude

Reverting to v2.4.2 for use in Alpine Linux.

realtime-neil commented 3 years ago

Yikes. Okay, I'm remembering #238 and looking at these...

...and I'm wondering what The Right Thing to do is.

realtime-neil commented 3 years ago

This is not the first time the limitations of busybox tools have affected makeself. Readers will recall the problem with busybox dd in #161 .

realtime-neil commented 3 years ago

@megastep this can probably be fixed like you did in #228; i.e., testing that tar supports the --posix flag before attempting to use it. However, this re-introduces the problem in #238 --- platforms like Alpine will create GNU archives and platforms like FreeBSD will be (apparently) unable to extract them.

Edit: changed ustar to GNU, because that's what my busybox tar is producing.
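
A minimal sketch of the feature test described above, assuming a `mktemp -d` is available on the build host. The helper name `tar_supports_flag` and the variable `TAR_EXTRA` are hypothetical and are not taken from makeself; the probe simply tries to create a throwaway archive with the flag and looks at tar's exit status.

```shell
#!/bin/sh
# Hypothetical helper (not makeself code): succeeds if the tar named in $1
# accepts the flag in $2 when creating an archive.
tar_supports_flag() {
    _tar=$1
    _flag=$2
    _dir=$(mktemp -d) || return 1
    : > "$_dir/probe"                      # empty file to archive
    if "$_tar" "$_flag" -cf "$_dir/probe.tar" -C "$_dir" probe >/dev/null 2>&1; then
        _rc=0
    else
        _rc=1                              # e.g. "unrecognized option"
    fi
    rm -rf "$_dir"
    return "$_rc"
}

# Only pass --posix when the local tar understands it.
TAR_EXTRA=""
if tar_supports_flag tar --posix; then
    TAR_EXTRA="--posix"
fi
echo "extra tar flags: ${TAR_EXTRA:-<none>}"
```

As noted above, falling back silently means the archive format then depends on whatever the local tar produces by default.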

ntx-ben commented 3 years ago

Upon further investigation, it seems even previous versions are incompatible with busybox' tar command. The r flag is also invalid.

realtime-neil commented 3 years ago

@ntx-ben you are correct. I guess it would be more appropriate to say makeself on Alpine remains broken since d22a2dafbf6a10ddf45da9f8082929e29d46054e because busybox tar lacks an r option.

I note that the r "operand" is described here: https://pubs.opengroup.org/onlinepubs/007908799/xcu/tar.html

realtime-neil commented 3 years ago

For those wondering why we append rather than create:

The utility named by utility shall be executed one or more times until the end-of-file is reached or the logical end-of-file string is found.

The generated command line length shall be the sum of the size in bytes of the utility name and each argument treated as strings, including a null byte terminator for each of these strings. The xargs utility shall limit the command line length such that when the command line is invoked, the combined argument and environment lists (see the exec family of functions in the System Interfaces volume of POSIX.1-2017) shall not exceed {ARG_MAX}-2048 bytes. Within this constraint, if neither the -n nor the -s option is specified, the default command line length shall be at least {LINE_MAX}.

--- https://pubs.opengroup.org/onlinepubs/9699919799/utilities/xargs.html#tag_20_158_03

tldr; the command length limit is allowed to force multiple xargs tar ... invocations.
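
To make the splitting visible, here is a small sketch in which `echo` stands in for tar and `-n 2` artificially caps each invocation at two file names, mimicking what the command length limit does for a very long file list. The file names and `archive.tar` are placeholders.

```shell
#!/bin/sh
# echo prints the command line that xargs would have run, once per invocation.
invocations=$(printf '%s\n' a.txt b.txt c.txt d.txt e.txt |
    xargs -n 2 echo tar rf archive.tar)
printf '%s\n' "$invocations"
# prints:
#   tar rf archive.tar a.txt b.txt
#   tar rf archive.tar c.txt d.txt
#   tar rf archive.tar e.txt
```

With `c` instead of `r`, every invocation after the first would start a fresh archive and discard the earlier files; with `r`, each invocation appends to the same archive.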

megastep commented 3 years ago

If we never really had it working properly on BusyBox, I'd be OK with keeping the increased POSIX compatibility instead, as it is more important. Also, this is mostly about generating the archives - it's not a big ask to create the archives on something other than BusyBox, as long as they can be extracted there.

rbcrwd commented 3 years ago

I'm still debugging, but this also appears to have broken extraction on Solaris 10 and 11

megastep commented 3 years ago

I'd be surprised if it broke extraction rather than just the creation of archives? Can you confirm? Solaris support is a must for me.

rbcrwd commented 3 years ago

Testing with the following example from #238 executed in the root of the makeself git repo:

./makeself.sh --notemp . makeself.run "Makeself by Stephane Peter" echo "Makeself has extracted itself"

I see the following result on a vanilla Solaris 11 x86 system:

root@solaris11:/tmp# ./makeself.run
+ ./makeself.run
Verifying archive integrity...  100%   MD5 checksums are OK. All good.
Uncompressing Makeself by Stephane Peter  100%   ... Extraction failed.
Terminated

megastep commented 3 years ago

Was the archive also created on the Solaris system?

rbcrwd commented 3 years ago

No, it's created on an OS X system, 10.15.7:

# tar --version
bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6

rbcrwd commented 3 years ago

I'm seeing the same fundamental result when generating the same .run on a Linux VM with GNU tar-1.33.

megastep commented 3 years ago

Mmmh I wonder if some of the other formats might be more compatible, such as ustar, cpio or shar

rbcrwd commented 3 years ago

I've extracted the original tarball (generated on OS X) on Solaris, and have been experimenting with it. It lists just fine, although there are obvious extensions, like the PaxHeader/ directory. However, when extracting the same way you are, I see the following:

# gzip -dc makeself.tar.gz | tar xpvf -
tar: ./.git/PaxHeader/FETCH_HEAD: typeflag 'x' not recognized, converting to regular file
x ./.git/PaxHeader/FETCH_HEAD, 128 bytes, 1 tape blocks
x ./.git/FETCH_HEAD, 211 bytes, 1 tape blocks
tar: ./.git/PaxHeader/HEAD: typeflag 'x' not recognized, converting to regular file
x ./.git/PaxHeader/HEAD, 90 bytes, 1 tape blocks
x ./.git/HEAD, 23 bytes, 1 tape blocks
...

Each file goes on like that; it looks like Solaris doesn't support those type extensions. With the exception of the extra directory, the files look fine, but tar's exit code is 1, signifying that it failed. This triggers the failure branch in the UnTAR() function in makeself-header.sh.
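
The general pattern can be sketched as follows. This is not the actual UnTAR() code from makeself-header.sh; `extract_or_fail` is a hypothetical stand-in, and a small `sh -c` command plays the role of a tar that writes its files but still exits with status 1.

```shell
#!/bin/sh
# Hypothetical stand-in: run the extraction command given as arguments and
# treat any non-zero exit status as a failed extraction.
extract_or_fail() {
    if "$@"; then
        echo "extracted"
    else
        # In the else branch, $? still holds the status of the command above.
        echo "Extraction failed (exit status $?)." >&2
        return 1
    fi
}

# Simulated tar: reports a file as extracted, then exits 1 because of warnings.
extract_or_fail sh -c 'echo "x ./file, 23 bytes"; exit 1' ||
    echo "caller saw a failure even though files were written"
```

Any caller that only looks at the exit status cannot tell "nothing was extracted" apart from "everything was extracted, with complaints about unknown type flags".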

rbcrwd commented 3 years ago

I've tested with release-2.4.2 and it doesn't exhibit the same behavior. Although there was a lot of delta between 2.4.2 and 2.4.3, I, like the OP, suspect it was the --posix tar creation flag that, while adding better BSD support, appears to have broken Solaris.

realtime-neil commented 3 years ago

@rbcrwd What's the tar being used on the Solaris host doing the extraction?

realtime-neil commented 3 years ago

@megastep

Mmmh I wonder if some of the other formats might be more compatible, such as ustar, cpio or shar

The last time I went looking, iirc, there's no easy way to get a generic tar utility to select the archive format.

Recent GNU and FreeBSD tar commands certainly sport the --posix flag:

The OpenSolaris tar man page mentions neither the --posix flag nor the POSIX.1-2001 "pax" format. This might imply support for the ustar format only.

References:

rbcrwd commented 3 years ago

On the extraction side, my test Solaris VMs are using whatever's first in the default $PATH:

[edit] Both are non-production systems, just installed with the bare minimum of "whatever came from a CD installation".

rbcrwd commented 3 years ago

Any thoughts on how to proceed? I've a modified local version that suffices for now, but am curious what next steps are, and how I can help.

I recognize that the purpose behind the --posix flag was to increase compatibility, but for what platforms?

megastep commented 3 years ago

The goal is to have Makeself produce archives that are as cross-platform as possible. So if POSIX doesn't achieve that goal, we might need to investigate what works better in practice, like maybe ustar - I'm not as concerned with the ability to produce archives on all supported platforms, mostly about their ability to be extracted correctly.

rbcrwd commented 3 years ago

Understood on the goal and compatibility, I'm just curious what OSes were targeted (if any particular ones) with the --posix addition. Specifically, I'm wondering how I might help address those, since it appears to have had a moderately opposite effect.

realtime-neil commented 3 years ago

@rbcrwd the OS in question was FreeBSD --- that story is told here: https://github.com/megastep/makeself/issues/238

tldr; tar on some (unnamed) Linux made a ustar archive the FreeBSD tar couldn't extract.

IMHO, busybox has a demonstrated history (ahem, https://github.com/megastep/makeself/pull/228) of twiddling commands into POSIX non-compliance. I'm sure there's good reasons for doing so, but the result is fundamentally at odds with the goals of the Makeself project. @megastep please correct me here if I'm overreaching.

megastep commented 3 years ago

Correct; to be fair we can always find some way to work around weird command syntax in the scripts, however it's more of a problem if the raw tar data is not handled by the OS. So that's the limitation we need to work around here so we can produce tar archives that can be extracted on the wide variety of Unix/Linux/BSD systems targeted, ideally without requiring third-party tools to be installed (such as GNU tar)

rbcrwd commented 3 years ago

Thanks for the context.

What about using --format to specify either ustar (more desirable, acknowledging the prior break) or v7 (more limited format)? I see that both GNU Tar and the Mac BSD tar support both of those options, although they're both coy about what their default is, at least in the man-page. That would hopefully keep an overeager GNU tar from defaulting to even more recent formats (e.g., gnutar).
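
A rough sketch of what that could look like, assuming the local tar accepts `--format=ustar` (the sketch prints a notice and does nothing else if it does not). The header check relies on the ustar layout, where the magic `ustar` sits at byte offset 257 and the version `00` at offset 263; the temporary paths are placeholders.

```shell
#!/bin/sh
work=$(mktemp -d) || exit 1
echo "hello" > "$work/file.txt"

if tar --format=ustar -cf "$work/ustar.tar" -C "$work" file.txt 2>/dev/null; then
    # Read the magic and version fields from the first header block.
    magic=$(dd if="$work/ustar.tar" bs=1 skip=257 count=5 2>/dev/null)
    version=$(dd if="$work/ustar.tar" bs=1 skip=263 count=2 2>/dev/null)
    echo "magic=$magic version=$version"
else
    echo "this tar does not accept --format=ustar"
fi
rm -rf "$work"
```

Naming the format explicitly takes tar's unstated default out of the picture on the creation side.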

Given that the pax format that the --posix flag produces is an extension of the older ustar, it makes sense that the objectively ancient Solaris family doesn't support it. I see this in that I can hand-extract the tarballs, but the extra type flags and metadata files cause the extraction to fail.

Finally, sorry for apparently hijacking a BusyBox ticket. I allowed myself to rush too much.

realtime-neil commented 3 years ago

@rbcrwd how do you feel about a --tar-format flag that populates a TAR_FORMAT_FLAG which defaults to --posix? Makeself would still produce PAX archives by default, but it could be overridden by the creator of a self-extracting archive when targeting recipients lacking a tar with PAX support.
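
A minimal sketch of that proposal. The function name `parse_tar_format` and the mapping of the value onto tar's `--format=` option are assumptions for illustration; this is not released makeself code.

```shell
#!/bin/sh
# Hypothetical option handling: --tar-format VALUE overrides the default flag.
parse_tar_format() {
    TAR_FORMAT_FLAG="--posix"              # default suggested in this comment
    while [ $# -gt 0 ]; do
        case "$1" in
            --tar-format)
                [ $# -ge 2 ] || { echo "--tar-format needs a value" >&2; return 1; }
                TAR_FORMAT_FLAG="--format=$2"
                shift 2
                ;;
            *)
                shift                      # ignore everything else in this sketch
                ;;
        esac
    done
}

parse_tar_format --tar-format ustar
echo "$TAR_FORMAT_FLAG"    # prints: --format=ustar
parse_tar_format
echo "$TAR_FORMAT_FLAG"    # prints: --posix
```

Because the default lives in a single assignment, switching it to another format later is a one-line change.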

rbcrwd commented 3 years ago

I could work with specifying the tar format, but unless you have outside context, the details in #238 don't seem to support the assertion that a ustar format archive was to blame on BSD (or even where @nunotexbsd generated their tarball that failed on BSD). I'd be quite surprised if ustar was to blame, as it's pretty much universal in all tar implementations I'm aware of (even some hoary old private C++ white-box implementations I work with).

For that matter, since at least 2003 GNU tar has defaulted to the GNU format. That seems to imply that Linux->BSD .run was just silently broken since then, but that's another rabbit-hole I have yet to dig through. Given that and BusyBox's similar default, that seems to stand a much higher likelihood of being the cause of #238.

All that to say: sure, I'm good with a --tar-format flag, but think it should default to the more-compatible ustar rather than pax. I don't think makeself is even exposing the extended attributes that the pax format adds, so am unsure what benefits it would provide.

realtime-neil commented 3 years ago

@rbcrwd point taken; ustar archives are our legacy and it makes sense to choose that format by default. I'm going to start work on a --tar-format flag.

I'm also going to put the test units through my FreeBSD VM --- I'd be similarly surprised if the libarchive tar shipping with such an established unix didn't support ustar.

megastep commented 3 years ago

As a side note, we might be able to run CI tests on FreeBSD via this action: https://github.com/marketplace/actions/freebsd-vm

One of the things I'd like to test for is portability between OSes, so maybe export a test archive as an artifact on Linux and try to extract it on a FreeBSD VM.