kubernetes-sigs / krew

📦 Find and install kubectl plugins
https://krew.sigs.k8s.io
Apache License 2.0

Krew stops working after the same plugin is installed and uninstalled twice #735

Closed donorp closed 2 years ago

donorp commented 3 years ago

Hi, I am trying to automate the kubectl plugin release process in a custom krew-index repo, so I need to test plugin installation as part of the automation to make sure everything works.

Environment

Host OS: ubuntu-20.04 (github actions)

Krew version:

+ /opt/hostedtoolcache/krew/v0.4.2/amd64/krew version
OPTION            VALUE
GitTag            v0.4.2
GitCommit         6fcdb79
IndexURI          https://github.com/kubernetes-sigs/krew-index.git
BasePath          /home/runner/.krew
IndexPath         /home/runner/.krew/index/default
InstallPath       /home/runner/.krew/store
BinPath           /home/runner/.krew/bin
DetectedPlatform  linux/amd64

Could not reproduce this on local macOS.

What you did?

Run krew install and krew uninstall for every OS/arch combination of a kubectl plugin (kubectl-convert in the following example):

krew uninstall convert >/dev/null 2>&1 || true

KREW_OS="linux" \
KREW_ARCH="amd64" \
krew install \
  --manifest="plugins/convert.yaml" \
  --archive="build/archive/kubectl-convert.linux.amd64.tar.gz"

krew uninstall convert
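
The reported steps can be wrapped in a loop over target platforms. The sketch below is only an illustration: the platform list uses the combinations that appear in this thread, while the archive naming for platforms other than linux/amd64 and all function names are assumptions, not part of the original pipeline.

```sh
#!/bin/sh
# Platforms seen in this thread; extend as needed.
PLATFORMS="windows/amd64 linux/arm64 linux/amd64"

# archive_for OS ARCH: archive path, following the linux/amd64 name in the report.
# Assumption: other platforms use the same naming scheme.
archive_for() {
  printf 'build/archive/kubectl-convert.%s.%s.tar.gz\n' "$1" "$2"
}

# test_platform OS ARCH: one install/uninstall cycle for one target platform.
# As in the report, KREW_OS/KREW_ARCH are set for install only.
test_platform() {
  KREW_OS="$1" KREW_ARCH="$2" krew install \
    --manifest="plugins/convert.yaml" \
    --archive="$(archive_for "$1" "$2")" &&
    krew uninstall convert
}

# test_all_platforms: run the cycles sequentially, stopping at the first failure.
test_all_platforms() {
  for p in $PLATFORMS; do
    os=${p%/*}
    arch=${p#*/}
    echo "--- $os/$arch"
    test_platform "$os" "$arch" || return 1
  done
}
```

Calling `test_all_platforms` from the repository root runs one cycle per platform, in order.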

Expected behavior

krew should raise an error only when the manifest or plugin package is invalid.

What went wrong?

krew stopped working after two install/uninstall cycles of the same plugin:

--- :local: [  ] { arch: amd64, kernel: windows }
WARNING: Detected stdin, but discarding it because of --manifest or args
Installing plugin: convert
Installed plugin: convert
\
 | Use this plugin:
 |  kubectl convert
 | Documentation:
 |  https://github.com/kubernetes/kubernetes
/
Uninstalled plugin: convert
DONE :local: [  ] { arch: amd64, kernel: windows }
--- :local: [  ] { arch: arm64, kernel: linux }
WARNING: Detected stdin, but discarding it because of --manifest or args
Installing plugin: convert
Installed plugin: convert
\
 | Use this plugin:
 |  kubectl convert
 | Documentation:
 |  https://github.com/kubernetes/kubernetes
/
Uninstalled plugin: convert
This version of Krew is not supported anymore. Please manually migrate:
1. Uninstall Krew: https://krew.sigs.k8s.io/docs/user-guide/setup/uninstall/
2. Install latest Krew: https://krew.sigs.k8s.io/docs/user-guide/setup/install/
3. Install the plugins you used
F1031 09:02:58.332421   50318 root.go:79] krew home outdated
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc0000b2001, 0xc0000abdc0, 0x3d, 0x40)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:1021 +0xb9
k8s.io/klog/v2.(*loggingT).output(0xea5d20, 0xc000000003, 0x0, 0x0, 0xc00017bce0, 0xc12ae0, 0x7, 0x4f, 0x40e000)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:970 +0x191
k8s.io/klog/v2.(*loggingT).printDepth(0xea5d20, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0x1, 0xc000178420, 0x1, 0x1)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:733 +0x16f
k8s.io/klog/v2.(*loggingT).print(...)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:715
k8s.io/klog/v2.Fatal(...)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:1489
sigs.k8s.io/krew/cmd/krew/cmd.Execute()
    /home/runner/work/krew/krew/cmd/krew/cmd/root.go:79 +0x254
main.main()
    /home/runner/work/krew/krew/cmd/krew/main.go:25 +0x45

goroutine 18 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0xea5d20)
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:1164 +0x8b
created by k8s.io/klog/v2.init.0
    /home/runner/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:418 +0xdf

goroutine 19 [runnable]:
vendor/golang.org/x/net/http/httpproxy.(*config).init(0xc000214000)
    /opt/hostedtoolcache/go/1.16.8/x64/src/vendor/golang.org/x/net/http/httpproxy/proxy.go:208 +0x839
vendor/golang.org/x/net/http/httpproxy.(*Config).ProxyFunc(...)
    /opt/hostedtoolcache/go/1.16.8/x64/src/vendor/golang.org/x/net/http/httpproxy/proxy.go:123
net/http.envProxyFunc.func1()
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/transport.go:815 +0x85
sync.(*Once).doSlow(0xed4da8, 0xac9740)
    /opt/hostedtoolcache/go/1.16.8/x64/src/sync/once.go:68 +0xec
sync.(*Once).Do(...)
    /opt/hostedtoolcache/go/1.16.8/x64/src/sync/once.go:59
net/http.envProxyFunc(0xaab719)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/transport.go:814 +0x59
net/http.ProxyFromEnvironment(0xc000074100, 0xc0000262d0, 0x12, 0xa7c300)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/transport.go:438 +0x25
net/http.(*Transport).connectMethodForRequest(0xe9b200, 0xc00001e080, 0x0, 0xaab711, 0x5, 0xc0000262d0, 0x12, 0x0, 0x0, 0x0)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/transport.go:830 +0xe4
net/http.(*Transport).roundTrip(0xe9b200, 0xc000074100, 0x30, 0xa5e2e0, 0x7fce16acff00)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/transport.go:569 +0x733
net/http.(*Transport).RoundTrip(0xe9b200, 0xc000074100, 0xe9b200, 0x0, 0x0)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/roundtrip.go:17 +0x35
net/http.send(0xc000074100, 0xb4e9c0, 0xe9b200, 0x0, 0x0, 0x0, 0xc00000e020, 0x203000, 0x1, 0x0)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:251 +0x454
net/http.(*Client).send(0xea54e0, 0xc000074100, 0x0, 0x0, 0x0, 0xc00000e020, 0x0, 0x1, 0xc000074100)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:175 +0xff
net/http.(*Client).do(0xea54e0, 0xc000074100, 0x0, 0x0, 0x0)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:717 +0x45f
net/http.(*Client).Do(...)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:585
net/http.(*Client).Get(0xea54e0, 0xaab711, 0x41, 0x0, 0x0, 0x0)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:474 +0xbe
net/http.Get(...)
    /opt/hostedtoolcache/go/1.16.8/x64/src/net/http/client.go:446
sigs.k8s.io/krew/cmd/krew/cmd/internal.FetchLatestTag(0x0, 0x0, 0x0, 0x0)
    /home/runner/work/krew/krew/cmd/krew/cmd/internal/fetch_tag.go:35 +0xf2
sigs.k8s.io/krew/cmd/krew/cmd.preRun.func1()
    /home/runner/work/krew/krew/cmd/krew/cmd/root.go:133 +0x1cf
created by sigs.k8s.io/krew/cmd/krew/cmd.preRun
    /home/runner/work/krew/krew/cmd/krew/cmd/root.go:124 +0x445

full logs: logs_22.zip

ahmetb commented 3 years ago

I've looked at the logs (specifically 9_Build and Push kubectl-convert.txt) and nothing jumps out at me. I've also tested this locally as follows:

cd krew-index/plugins
KREW_OS="linux" KREW_ARCH="amd64" kubectl krew install --manifest=ctx.yaml
kubectl krew uninstall ctx
KREW_OS="linux" KREW_ARCH="amd64" kubectl krew install --manifest=ctx.yaml
kubectl krew uninstall ctx
KREW_OS="linux" KREW_ARCH="amd64" kubectl krew install --manifest=ctx.yaml
kubectl krew uninstall ctx

and it does not reproduce for me. Does it happen only in your build environment, or can you also reproduce it locally? I don't have a theory about the cause, other than something interfering with the filesystem.

donorp commented 3 years ago

I cannot reproduce this on my local machine either; only GitHub Actions has this issue.

Wild guess: one possible cause is timing jitter in the GitHub virtual environment, which in my very limited experience is far more significant than usual.

ahmetb commented 3 years ago

Are you installing/uninstalling them in parallel? I can't see why a sequential install/uninstall would cause this.

donorp commented 3 years ago

I wish I could say yes, but there are no parallel jobs in my pipeline. To make sure the commands run sequentially, I created a GitHub Action that runs krew install/uninstall in a script, but still no luck: https://github.com/arhat-dev/krew-index/runs/4170194680

Action logs:

10_Test krew redundent installuninstall.txt

ahmetb commented 3 years ago

I think I have a theory.

https://github.com/kubernetes-sigs/krew/blob/master/internal/receiptsmigration/migration.go

If there are plugins installed (maybe krew itself!) but there are no installation receipts, we give that "krew home outdated" error.

Is it possible to run "tree $HOME/.krew" before the failing command on GitHub actions? That would greatly help us debug how your setup got into that situation.

/kind bug /priority P3
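
The condition described above (plugins present, but no receipts) can be checked directly on the runner before the failing command. This is a rough sketch, not krew's actual code: it assumes that "plugins installed" means there are entries under bin/, and the function names are made up for illustration. The real check lives in the migration.go file linked above and may differ.

```sh
#!/bin/sh
# Sketch only: approximates the condition from the theory above.
# Assumption: "plugins installed" is judged by entries under bin/.

# dir_has_entries DIR: succeeds if DIR exists and contains at least one entry.
dir_has_entries() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# krew_home_looks_outdated [ROOT]: succeeds if ROOT has links in bin/ but no receipts.
krew_home_looks_outdated() {
  root=${1:-${KREW_ROOT:-$HOME/.krew}}
  dir_has_entries "$root/bin" && ! dir_has_entries "$root/receipts"
}

# report_krew_home [ROOT]: list the directories relevant to the check.
report_krew_home() {
  root=${1:-${KREW_ROOT:-$HOME/.krew}}
  for d in bin receipts store; do
    echo "== $root/$d"
    ls -lA "$root/$d" 2>/dev/null
  done
  if krew_home_looks_outdated "$root"; then
    echo "state: links in bin/ but no receipts"
  else
    echo "state: consistent"
  fi
}
```

Running `report_krew_home` before each krew command in the workflow would show at which step bin/ and receipts/ start to disagree.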

donorp commented 3 years ago

You are right: the receipts dir is always empty before and after the crash, and the plugin is not removed after krew uninstall.

+ krew -v 10 uninstall convert
I1111 05:05:55.291830   21668 root.go:221] Ensure creating dir: "/home/runner/.krew"
I1111 05:05:55.291873   21668 root.go:221] Ensure creating dir: "/home/runner/.krew/store"
I1111 05:05:55.291879   21668 root.go:221] Ensure creating dir: "/home/runner/.krew/bin"
I1111 05:05:55.291883   21668 root.go:221] Ensure creating dir: "/home/runner/.krew/index"
I1111 05:05:55.291905   21668 root.go:221] Ensure creating dir: "/home/runner/.krew/receipts"
I1111 05:05:55.291998   21668 migration.go:30] Checking if index migration is needed.
I1111 05:05:55.292006   21668 migration.go:33] Index already migrated.
I1111 05:05:55.292024   21668 uninstall.go:47] Going to uninstall plugin convert
I1111 05:05:55.292052   21668 install.go:163] Finding installed version to delete
I1111 05:05:55.293208   21668 root.go:128] skipping upgrade check
I1111 05:05:55.293370   21668 install.go:172] Deleting plugin convert
I1111 05:05:55.293511   21668 install.go:175] Unlink "/home/runner/.krew/bin/kubectl-convert"
I1111 05:05:55.293531   21668 install.go:215] No file found at "/home/runner/.krew/bin/kubectl-convert"
I1111 05:05:55.293538   21668 install.go:181] Deleting path "/home/runner/.krew/store/convert"
I1111 05:05:55.302120   21668 install.go:186] Deleting plugin receipt "/home/runner/.krew/receipts/convert.yaml"
Uninstalled plugin: convert
I1111 05:05:55.302378   21668 root.go:176] Upgrade check was skipped or has not finished
+ krew -v 10 uninstall convert
+ tree /home/runner/.krew
/home/runner/.krew
├── bin
│   └── kubectl-convert.exe -> /home/runner/.krew/store/convert/v1.22.2/kubectl-convert
├── index
│   └── default
│       ├── CONTRIBUTING.md
│       ├── LICENSE
│       ├── OWNERS
│       ├── OWNERS_ALIASES
│       ├── README.md
│       ├── SECURITY_CONTACTS
│       ├── code-of-conduct.md
│       ├── plugins
... omitted ...
│       │   └── whoami.yaml
│       └── plugins.md
├── receipts
└── store

I tried removing the Windows installation test (on Linux) and everything worked (for linux and darwin). I also tried testing the Windows installation on Windows, and that worked as well.

So I guess it's related to the KREW_OS env, and is probably caused by the cleanup behavior that only happens on the Windows platform: https://github.com/kubernetes-sigs/krew/blob/master/cmd/krew/cmd/root.go#L165 .

Maybe a quick fix would be to use runtime.GOOS rather than the KREW_OS env to determine the host OS in this case?
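
The log and tree output above suggest how the state goes wrong: the link was created as kubectl-convert.exe (install ran with KREW_OS=windows), while uninstall looked for kubectl-convert and found nothing, then deleted the store path and the receipt. The following simulation reproduces that sequence with plain files. It does not run krew; the ".exe" naming rule and all function names are assumptions inferred from the output above.

```sh
#!/bin/sh
# Simulation only: mimics the steps in the -v 10 log with plain files.

# link_name PLUGIN OS: link expected under bin/ for the given target OS.
# Assumption: windows targets get an ".exe" suffix, as the tree output suggests.
link_name() {
  case "$2" in
    windows) echo "kubectl-$1.exe" ;;
    *)       echo "kubectl-$1" ;;
  esac
}

# fake_install ROOT PLUGIN OS: create the store dir, the bin link and a receipt.
fake_install() {
  mkdir -p "$1/bin" "$1/store/$2" "$1/receipts"
  ln -s "$1/store/$2" "$1/bin/$(link_name "$2" "$3")"
  : > "$1/receipts/$2.yaml"
}

# fake_uninstall ROOT PLUGIN OS: unlink, delete the store path, delete the receipt.
fake_uninstall() {
  rm -f "$1/bin/$(link_name "$2" "$3")"
  rm -rf "$1/store/$2"
  rm -f "$1/receipts/$2.yaml"
}

demo() {
  root=$(mktemp -d)
  fake_install "$root" convert windows   # install with KREW_OS=windows
  fake_uninstall "$root" convert linux   # uninstall without KREW_OS on a linux host
  ls -A "$root/bin"                      # kubectl-convert.exe is left behind
  ls -A "$root/receipts"                 # nothing: the receipt is gone
  rm -rf "$root"
}
demo
```

In this simulation, bin/ ends up non-empty while receipts/ is empty, which matches the condition behind the "krew home outdated" error. Using the same OS value for both steps removes the link as expected.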

ahmetb commented 3 years ago

I agree with your findings, although we need to look closely into the implications (we might be relying on KREW_OS for testing, etc.).
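
The trade-off discussed here is between two notions of "OS": the one krew actually runs on, and the one plugins are selected for. A shell analogue of that split is sketched below, with `uname -s` standing in for Go's runtime.GOOS. The function names are hypothetical and this is not how krew is structured.

```sh
#!/bin/sh
# host_os: OS the script is actually running on; never consults KREW_OS.
host_os() {
  case "$(uname -s)" in
    Linux)                echo linux ;;
    Darwin)               echo darwin ;;
    MINGW*|MSYS*|CYGWIN*) echo windows ;;
    *)                    uname -s | tr '[:upper:]' '[:lower:]' ;;
  esac
}

# target_os: OS to select plugin artifacts for; KREW_OS overrides the host value.
target_os() {
  echo "${KREW_OS:-$(host_os)}"
}
```

With this split, host-specific behavior such as cleanup would key off `host_os`, while tests could still override `target_os` through KREW_OS.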

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

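This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
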
Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle rotten

k8s-triage-robot commented 2 years ago

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/krew/issues/735#issuecomment-1100761324):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues and PRs according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue or PR with `/reopen`
>- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.