maggie44 closed this issue 11 months ago
Seems pretty cut and dried:
INFO[0000] Checking if S3 bucket k3-staging-etcd exists
WARN[0000] Unable to initialize S3 client: Access Denied.
Ideally we wouldn't crash if we do not have valid credentials or the correct permissions, but it seems like something that you can work around easily enough for now by correcting your configuration.
Absolutely, I am not worried about the access denied issue, just the panic.
You mentioned you were still trying to narrow down the cause; I don't think there's any mystery there - the access-denied error is causing the panic because we don't handle it properly.
Until we address the panic, fix the credentials or disable S3.
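To make "not handled properly" concrete, here is a minimal sketch of the suspected pattern (names are illustrative, not the actual k3s source): the initialization failure is only logged as a warning, so execution continues and a later call dereferences the nil client, matching the `(*S3).snapshotRetention` frame in the traces below.

```go
package main

import (
	"errors"
	"log"
)

type s3Client struct {
	bucket string
}

// snapshotRetention panics when called with a nil receiver, analogous to
// (*S3).snapshotRetention at s3.go:284 in the stack trace further down.
func (c *s3Client) snapshotRetention() {
	log.Printf("pruning snapshots in bucket %q", c.bucket) // nil dereference here
}

func newS3Client() (*s3Client, error) {
	// Stands in for the failed bucket-access check ("Access Denied.").
	return nil, errors.New("Access Denied.")
}

func main() {
	client, err := newS3Client()
	if err != nil {
		// Warning only; execution continues with client == nil.
		log.Printf("WARN Unable to initialize S3 client: %v", err)
	}
	client.snapshotRetention() // panic: invalid memory address or nil pointer dereference
}
```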
I meant the cause of the panic (i.e. why the error isn't handled), not the scenario that leads to a panic. But it seems there's no need; I'm not used to such snappy responses. Happy to leave you to it, and I'll just get back to fixing my credentials.
Thanks.
This can also be reproduced just by setting --etcd-s3-insecure=true to use HTTP when the endpoint is using HTTPS, or vice versa. Basically, any failure to validate access to the bucket will trigger it:
WARN[0004] Unable to initialize S3 client: Head "https://localhost:9090/test/": http: server gave HTTP response to HTTPS client
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x41f4339]
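The fix direction seems equally cut and dried; a hedged sketch (illustrative only, not the actual patch): return the initialization error instead of continuing, and/or guard the nil receiver so any bucket-validation failure surfaces as an error rather than a panic.

```go
package main

import (
	"errors"
	"log"
)

type s3Client struct {
	bucket string
}

func newS3Client() (*s3Client, error) {
	return nil, errors.New("Access Denied.")
}

// snapshotRetention returns an error instead of panicking when the client
// was never initialized.
func (c *s3Client) snapshotRetention() error {
	if c == nil {
		return errors.New("S3 client was not initialized")
	}
	log.Printf("pruning snapshots in bucket %q", c.bucket)
	return nil
}

func main() {
	client, err := newS3Client()
	if err != nil {
		// Fail fast rather than continuing with a nil client.
		log.Fatalf("unable to initialize S3 client: %v", err)
	}
	if err := client.snapshotRetention(); err != nil {
		log.Printf("skipping snapshot retention: %v", err)
	}
}
```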
root@k3s-server-1:/# k3s etcd-snapshot save --s3 --s3-endpoint=s3.example.com --s3-access-key=k3s --s3-secret-key=invalid --s3-bucket=invalid
INFO[0000] Saving etcd snapshot to /var/lib/rancher/k3s/server/db/snapshots/on-demand-k3s-server-1-1700603492
{"level":"info","ts":"2023-11-21T21:51:31.835391Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/lib/rancher/k3s/server/db/snapshots/on-demand-k3s-server-1-1700603492.part"}
{"level":"info","ts":"2023-11-21T21:51:31.837266Z","logger":"client","caller":"v3@v3.5.9-k3s1/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2023-11-21T21:51:31.837328Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":"2023-11-21T21:51:31.854271Z","logger":"client","caller":"v3@v3.5.9-k3s1/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2023-11-21T21:51:31.863942Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"3.3 MB","took":"now"}
{"level":"info","ts":"2023-11-21T21:51:31.864014Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/rancher/k3s/server/db/snapshots/on-demand-k3s-server-1-1700603492"}
INFO[0000] Checking if S3 bucket invalid exists
WARN[0000] Unable to initialize S3 client: Access Denied.
INFO[0000] Reconciling ETCDSnapshotFile resources
INFO[0000] Checking if S3 bucket invalid exists
WARN[0000] Unable to initialize S3 client: Access Denied.
INFO[0000] Reconciliation of ETCDSnapshotFile resources complete
FATA[0000] Access Denied.
root@k3s-server-1:/# kubectl get etcdsnapshotfile s3-on-demand-k3s-server-1-1700603492-41242b -o yaml
apiVersion: k3s.cattle.io/v1
kind: ETCDSnapshotFile
metadata:
  creationTimestamp: "2023-11-21T21:51:31Z"
  finalizers:
  - wrangler.cattle.io/managed-etcd-snapshots-controller
  generation: 1
  labels:
    etcd.k3s.cattle.io/snapshot-storage-node: s3
  name: s3-on-demand-k3s-server-1-1700603492-41242b
  resourceVersion: "1553"
  uid: 2c2f54d4-f0db-400c-82b6-831c4908059a
spec:
  location: ""
  nodeName: s3
  s3:
    bucket: invalid
    endpoint: s3.example.com
    region: us-east-1
  snapshotName: on-demand-k3s-server-1-1700603492
status:
  creationTime: "2023-11-21T21:51:32Z"
  error:
    message: Access Denied.
    time: "2023-11-21T21:51:32Z"
  readyToUse: false
  size: "0"
Infrastructure
Node(s) CPU architecture, OS, and Version:
Ubuntu 22.04
Cluster Configuration:
1 server
Config.yaml:
write-kubeconfig-mode: 644
token: <token>
cluster-init: true
node-name: server1
$ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
Run k3s etcd-snapshot save against S3 with invalid S3 properties:
sudo k3s etcd-snapshot save --s3 --s3-endpoint="invalid" --s3-bucket="invalid" --s3-folder="invalid" --s3-access-key="invalid" --s3-secret-key="invalid" --s3-region="invalid"
Replication Results:
k3s version v1.28.3+k3s2 (bbafb86e)
go version go1.20.10
...
WARN[0000] Unable to initialize S3 client: Access Denied.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x41be1b9]
goroutine 1 [running]:
github.com/k3s-io/k3s/pkg/etcd.(*S3).snapshotRetention(0xc00121a349?, {0x6565e08?, 0xc000a09770?})
    /go/src/github.com/k3s-io/k3s/pkg/etcd/s3.go:284 +0x59
github.com/k3s-io/k3s/pkg/etcd.(*ETCD).Snapshot(0xc000a097c0, {0x6565e08, 0xc000a09770})
    /go/src/github.com/k3s-io/k3s/pkg/etcd/snapshot.go:375 +0x13ca
github.com/k3s-io/k3s/pkg/cli/etcdsnapshot.save(0xc00088f340, 0xc000c1d970?)
    /go/src/github.com/k3s-io/k3s/pkg/cli/etcdsnapshot/etcd_snapshot.go:121 +0x92
github.com/k3s-io/k3s/pkg/cli/etcdsnapshot.Save(0xc00088f340?)
    /go/src/github.com/k3s-io/k3s/pkg/cli/etcdsnapshot/etcd_snapshot.go:104 +0x45
github.com/urfave/cli.HandleAction({0x4ec5ea0?, 0x5e149a0?}, 0x4?)
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/app.go:524 +0x50
github.com/urfave/cli.Command.Run({{0x597fc1c, 0x4}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x5a2f781, 0x22}, {0x0, ...}, ...}, ...)
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/command.go:175 +0x67b
github.com/urfave/cli.(*App).RunAsSubcommand(0xc0006ae540, 0xc00088f080)
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/app.go:405 +0xe87
github.com/urfave/cli.Command.startApp({{0x59a2a52, 0xd}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...)
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/command.go:380 +0xb7f
github.com/urfave/cli.Command.Run({{0x59a2a52, 0xd}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...)
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/command.go:103 +0x845
github.com/urfave/cli.(*App).Run(0xc0006ae380, {0xc00088ef20, 0xd, 0x16})
    /go/pkg/mod/github.com/urfave/cli@v1.22.14/app.go:277 +0xb87
main.main()
    /go/src/github.com/k3s-io/k3s/cmd/server/main.go:81 +0xc1e
**Validation Results:**
- k3s version used for validation:
k3s version v1.28.4-rc1+k3s1 (3f237230)
go version go1.20.11
- Invalid S3 properties (access key/bucket name):
...
{"level":"info","ts":"2023-11-22T22:05:47.709325Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/rancher/k3s/server/db/snapshots/on-demand-server1-1700690748"}
INFO[0000] Checking if S3 bucket
$ kubectl get etcdsnapshotfile | grep s3-on-demand-server1-1700690748
s3-on-demand-server1-1700690748-41242b   on-demand-server1-1700690748   s3   0   2023-11-22T22:05:48Z
$ kubectl get etcdsnapshotfile s3-on-demand-server1-1700690748-41242b -o yaml
apiVersion: k3s.cattle.io/v1
kind: ETCDSnapshotFile
metadata:
  creationTimestamp: "2023-11-22T22:05:47Z"
  finalizers:
I am seeing a panic at midnight every night that is making my cluster fall over. I can reproduce the panic by running k3s etcd-snapshot save. Still trying to narrow down the cause and reproduce in different environments; I will be sure to keep the ticket up to date, and any input or ideas are welcome.
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration: Single control plane, used as both a control plane and a node, for development
Describe the bug:
Relevant config file entries: