MetricMike / asdf-awscli

MIT License
54 stars 17 forks source link

Bundled installs fail with coreutils `cp` #27

Closed oeuftete closed 1 year ago

oeuftete commented 1 year ago

Describe the bug

If coreutils cp has been set up as the preferred command for cp, asdf installs fail. This maybe is the problem of the user who chose to do this, but possibly there's an agnostic use of cp that could work for both cases, or the script added in #21 could try to detect which cp is in use.

Steps to reproduce

❯ which -a cp
/usr/local/opt/coreutils/libexec/gnubin/cp
/bin/cp

❯ asdf install awscli 2.11.18
asdf-awscli: Expected /Users/me/.asdf/installs/awscli/2.11.18/bin/aws to be executable.
asdf-awscli: An error ocurred while installing awscli 2.11.18.

❯ export PATH=/bin:$PATH
❯ which -a cp
/bin/cp
/usr/local/opt/coreutils/libexec/gnubin/cp
/bin/cp

❯ asdf install awscli 2.11.18
mkdir: /Users/me/.asdf/downloads/awscli/2.11.18: File exists
asdf-awscli: asdf-awscli 2.11.18 installation was successful!

Expected behavior

Expected this to work.

Additional context

The issue is with different behaviour of cp -a. With coreutils, the directory b/ is preserved in the destination when doing a cp -a a/b/ c, with the system cp on MacOS, it is not. Example.

❯ ls -1 bar/foo/test.txt 
bar/foo/test.txt

❯ mkdir baz quux
❯ which -a cp
/usr/local/opt/coreutils/libexec/gnubin/cp
/bin/cp

❯ cp -a bar/foo/ baz
❯ /bin/cp -a bar/foo/ quux

❯ ls -1 baz/
foo
❯ ls -1 baz/foo/
test.txt

❯ ls -1 quux/
test.txt
MetricMike commented 1 year ago

ah yikes.

Could you post the results of cp --version or the very last line of man cp? That'll help narrow down exactly which version of cp we need to guard against.

I mostly use cp -a as shorthand instead of cp -r, so the short term fix can probably swap them out. I suspect this is because of a very old version of coreutils and we should also tag https://github.com/asdf-vm/asdf/issues/1435 with what we find out :)

kr3cj commented 1 year ago

This is affecting me as well.

whence cp
/opt/homebrew/opt/coreutils/libexec/gnubin/cp
cp --version
cp (GNU coreutils) 9.3
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund, David MacKenzie, and Jim Meyering.
elementalvoid commented 1 year ago

^^ Ditto. @kr3cj and I are on the same version. Here's the last line of manpage (/opt/homebrew/opt/coreutils/share/man/man1/gcp.1):

GNU coreutils 9.3          April 2023           CP(1)

The manpage has this to say about -a:

       -a, --archive
              same as -dR --preserve=all
       -d     same as --no-dereference --preserve=links
       -R, -r, --recursive
              copy directories recursively
       --preserve[=ATTR_LIST]
              preserve the specified attributes

I suspect this is because of a very old version of coreutils

Here is the relevant parts of the brew installation info for coreutils:

❯ brew info coreutils
==> coreutils: stable 9.3 (bottled), HEAD
GNU File, Shell, and Text utilities
https://www.gnu.org/software/coreutils
/opt/homebrew/Cellar/coreutils/9.3 (476 files, 13.4MB) *
  Poured from bottle using the formulae.brew.sh API on 2023-04-20 at 16:49:06
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/coreutils.rb

I'm on the latest available version.

MetricMike commented 1 year ago

Can you try with asdf plugin update awscli macos-coreutils-pr-27? That switches from cp -a to cp -r.

The coreutils changelog looks like it has some macos specific implementation (e.g., https://github.com/coreutils/coreutils/blob/master/NEWS#L407-L410) but I'm not sure exactly where the difference is.

elementalvoid commented 1 year ago

I tried the new branch but it didn't work. I dug in a bit deeper and found the real problem!

It's the symlink that's no good...

I added ls -l and file on ${install_path}/aws:

@@ -1,4 +1,5 @@
 #!/usr/bin/env bash
+set -x

 set -euo pipefail

@@ -112,6 +113,8 @@ install_v2_macos_bundled_installer() {
        cp -r "${download_path}/tmp-awscliv2/aws-cli.pkg/Payload/aws-cli/" "${install_path}"
        ln -snf "${install_path}/aws" "${install_path}/bin/aws"
        ln -snf "${install_path}/aws_completer" "${install_path}/bin/aws_completer"
+       ls -l "${install_path}/bin/aws"
+       file "${install_path}/bin/aws"
        rm -rf "${download_path}/tmp-awscliv2"
 }

With this we can see:

+ file /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws
/Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws: broken symbolic link to /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/aws

Edit: Striking this as I realized after posting that the bin dir is for the plugin install dir, not from the package source dir. ~It looks like they ditched the bin dir:~


Below are some logs with more context...

+ major_version=2
+ [[ 2 = \1 ]]
+ [[ 2 = \2 ]]
+ [[ Darwin = \L\i\n\u\x ]]
+ [[ Darwin = \D\a\r\w\i\n ]]
+ install_v2_macos_bundled_installer /Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25 /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25
+ local download_path install_path
+ download_path=/Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25
+ install_path=/Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25
+ mkdir -p /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin
+ pkgutil --expand-full /Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25/AWSCLIV2.pkg /Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25/tmp-awscliv2
+ cp -r /Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25/tmp-awscliv2/aws-cli.pkg/Payload/aws-cli/ /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25
+ ln -snf /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/aws /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws
+ ln -snf /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/aws_completer /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws_completer
+ ls -l /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws
lrwxr-xr-x 1 matt.klich staff 62 2023-06-07 11:05 /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws -> /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/aws
+ file /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws
/Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws: broken symbolic link to /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/aws
+ rm -rf /Users/matt.klich/.local/share/rtx/downloads/awscli/2.11.25/tmp-awscliv2
+ local tool_cmd
++ echo 'aws --version'
++ cut '-d ' -f1
+ tool_cmd=aws
+ test -x /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws
+ fail 'Expected /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws to be executable.'
+ echo -e 'asdf-awscli: Expected /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws to be executable.'
asdf-awscli: Expected /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25/bin/aws to be executable.
+ exit 1
+ rm -rf /Users/matt.klich/.local/share/rtx/installs/awscli/2.11.25
elementalvoid commented 1 year ago

I didn't have time to do much more during the day. But I've been looking tonight. bisect narrowed it down to #21 being when the install on (at least ARM) Mac started failing.

I did reproduce the original report which is that cp from coreutils fails while the default OSX cp works. I even figured out why (ugh, gross).

But, that said, this isn't the only tool that I've used (or built) that has issues with one over the other. It might be best to force the plugin to use the OSX default cp... Doing so would avoid this particular issue and future potential gotchas. Up to you of course.

It's worth pointing out that coreutils comes with lots of "duplicate" binaries. Any of which may act differently than the system version.

❯ ls -1 /opt/homebrew/opt/coreutils/libexec/gnubin | wc -l
106

So, why does gnu / coreutils cp fail? Because it behaves differently depending on whether the destination directory exists when it is run.

# let's make sure we start clean
❯ rm -rf gnu-dir-exists gnu-dir-does-not-exist

# let's force a directory to exist
❯ mkdir gnu-dir-exists

# now copy files to it
❯ /opt/homebrew/opt/coreutils/libexec/gnubin/cp -a foo/aws-cli.pkg/Payload/aws-cli/ gnu-dir-exists

# now, copy the same files using the same args to a directory that _does not_ exist
❯ /opt/homebrew/opt/coreutils/libexec/gnubin/cp -a foo/aws-cli.pkg/Payload/aws-cli/ gnu-dir-does-not-exist

# we can see these two copies result in _DIFFERENT_ results!!!
❯ ls -l gnu-dir-exists gnu-dir-does-not-exist
gnu-dir-exists:
Permissions Size User       Group Date Modified    Name
drwxr-xr-x     - matt.klich staff 2023-06-02 11:19  aws-cli/

gnu-dir-does-not-exist:
Permissions Size User       Group Date Modified    Name
drwxr-xr-x     - matt.klich staff 2023-06-02 11:19 awscli/
drwxr-xr-x     - matt.klich staff 2023-06-02 11:19 cryptography/
drwxr-xr-x     - matt.klich staff 2023-06-02 11:19 docutils/
drwxr-xr-x     - matt.klich staff 2023-06-02 11:19 lib-dynload/
.rwxr-xr-x  1.8M matt.klich staff 2023-06-02 11:19 _awscrt.cpython-311-darwin.so*
.rwxr-xr-x  225k matt.klich staff 2023-06-02 11:19 _cffi_backend.cpython-311-darwin.so*
.rwxr-xr-x  421k matt.klich staff 2023-06-02 11:19 _ruamel_yaml.cpython-311-darwin.so*
.rwxr-xr-x  6.7M matt.klich staff 2023-06-02 11:19 aws*
.rwxr-xr-x  6.7M matt.klich staff 2023-06-02 11:19 aws_completer*
.rwxr-xr-x  1.8M matt.klich staff 2023-06-02 11:19 base_library.zip*
.rwxr-xr-x  2.9M matt.klich staff 2023-06-02 11:19 libcrypto.1.1.dylib*
.rwxr-xr-x  594k matt.klich staff 2023-06-02 11:19 libssl.1.1.dylib*
.rwxr-xr-x  7.2M matt.klich staff 2023-06-02 11:19 Python*

When the target directory already exists, GNU cp copies the source directory itself into the target. However, if the target does not exist it copes the source directory's contents into the target.

This is what causes the symlink to be bad.

kr3cj commented 1 year ago

That is unexpected behavior. Seems like a simple fix would be to just replace cp with /bin/cp when uname is Darwin

MetricMike commented 1 year ago

I disagree with the notion that it's unexpected behavior. The docs for macOS's /bin/cp are quite clear - it's not unexpected so much as unstandard behavior.

Given that macos has been steadily removing their BSD internals, I'm very uncomfortable hardcoding behavior to their system /bin/cp, when that may drastically change with macOS15. I have zero visibility into when or why macos will touch /bin/cp and that's not a topic I want to tackle.

If that's not something you/your company are in favor of, I hope this is easy enough to fork against and I welcome PRs that make that easier going forward! Avoiding darwin-specific behavior is going to be my favored response given how weird macOS is lately and the prohibitive cost of testing the various flavors. It is extraordinarily frustrating having to deal with edge cases from a platform I can't test without spending $50 dollars a day on a test instance with AWS because I don't happen to own a macbook.

I'm happy to expand my dev/testing if you and/or your company is willing to sponsor this repo.

There's also no guarantee that this behavior won't hit Linux-based OSes (both Fedora and Debian upstream have coreutils 9, and while the current tests don't show it, I wouldn't be surprised if a similar error showed up next year on a CentOS/RockyLinux/Ubuntu). One of the most compelling arguments to me on the main asdf repo for POSIX compatibility is that it's just wildly hard to try to maintain compatibility across consumer systems that have no incentive to maintain that same compatibility and it's a folly to try, so the only sensible action is to target the POSIX standard. That macOS explicitly breaks POSIX compatibility because /bin/cp has been frozen since 2003 isn't reasonable for me to defend against.

The most recent GH-Action runs suggest that coreutils-9 isn't the actual cause, so versionlocking wouldn't help either.

MetricMike commented 1 year ago

To be clear, I'm still working on this and hope to have a solution soon

elementalvoid commented 1 year ago

Given that macos has been steadily removing their BSD internals, I'm very uncomfortable hardcoding behavior to their system /bin/cp, when that may drastically change with macOS15. I have zero visibility into when or why macos will touch /bin/cp and that's not a topic I want to tackle.

Fair point. I suggested the opposite because I've been bitten by a lack of consistency with regard to Mac users having coreutils installed. Personally, I prefer the coreutils versions for the very reasons you state but, as they aren't always available, I've chosen to use the Mac versions.

Avoiding darwin-specific behavior is going to be my favored response given how weird macOS is lately

Thanks for sharing your thoughts on what you'd prefer in the repository. Knowing this can help ease contributions.

elementalvoid commented 1 year ago

What about mv instead of cp?

The following works locally (tested both with and without coreutils):

diff --git a/bin/install b/bin/install
index 3966c98..956aed4 100755
--- a/bin/install
+++ b/bin/install
@@ -109,9 +109,9 @@ install_v2_macos_bundled_installer() {

        mkdir -p "${install_path}/bin"
        pkgutil --expand-full "${download_path}/AWSCLIV2.pkg" "${download_path}/tmp-awscliv2"
-       cp -a "${download_path}/tmp-awscliv2/aws-cli.pkg/Payload/aws-cli/" "${install_path}"
-       ln -snf "${install_path}/aws" "${install_path}/bin/aws"
-       ln -snf "${install_path}/aws_completer" "${install_path}/bin/aws_completer"
+       mv "${download_path}/tmp-awscliv2/aws-cli.pkg/Payload/aws-cli" "${install_path}"
+       ln -snf "${install_path}/aws-cli/aws" "${install_path}/bin/aws"
+       ln -snf "${install_path}/aws-cli/aws_completer" "${install_path}/bin/aws_completer"
        rm -rf "${download_path}/tmp-awscliv2"
 }
MetricMike commented 1 year ago

29 has working test coverage for this now, and 1.0.1 has been released with the update