etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0
47.38k stars 9.72k forks source link

etcdctl outputs regular log messages on /dev/stderr #17464

Open StCyr opened 6 months ago

StCyr commented 6 months ago

Bug report criteria

What happened?

I run

etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /var/lib/etcd/backups/etcd-$(date +"%Y-%m-%d").db >/dev/null

I get this:

{"level":"info","ts":"2024-02-20T12:47:41.926Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/lib/etcd/backups/etcd-2024-02-20.db.part"}
{"level":"info","ts":"2024-02-20T12:47:41.937Z","logger":"client","caller":"v3@v3.5.7/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2024-02-20T12:47:41.937Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://[127.0.0.1]:2379"}
{"level":"info","ts":"2024-02-20T12:47:42.239Z","logger":"client","caller":"v3@v3.5.7/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2024-02-20T12:47:42.351Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://[127.0.0.1]:2379","size":"22 MB","took":"now"}
{"level":"info","ts":"2024-02-20T12:47:42.356Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/etcd/backups/etcd-2024-02-20.db"}
Snapshot saved at /var/lib/etcd/backups/etcd-2024-02-20.db

What did you expect to happen?

I was expecting no output

How can we reproduce it (as minimally and precisely as possible)?

run

etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /var/lib/etcd/backups/etcd-$(date +"%Y-%m-%d").db >/dev/null

Anything else we need to know?

This is problematic because in order to have a quiet run I need to run:

etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /var/lib/etcd/backups/etcd-$(date +"%Y-%m-%d").db >/dev/null 2>&1

which will suppress error messages too.

I believe programs should adhere to the following:

  1. no redirection: ouptputs standard logs and error logs
  2. /dev/null: outputs error logs

  3. /dev/null 2>&1: outputs nothing

Now, etcdctl does the following:

  1. no redirection: ouptputs standard logs and error logs
  2. /dev/null: ouptputs standard logs and error logs

  3. /dev/null 2>&1: outputs nothing

This is problematic when you run etcdctl with cron and you want to receive a mail from the cronjob only when an error occured.

Etcd version (please run commands below)

```console $ etcd --version etcd Version: 3.5.7 Git SHA: 215b53cf3 Go Version: go1.17.13 Go OS/Arch: linux/amd64 $ etcdctl version etcdctl version: 3.5.7 API version: 3.5 ```

Etcd configuration (command line flags or environment variables)

# paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

```console $ etcdctl member list -w table # paste output here $ etcdctl --endpoints= endpoint status -w table # paste output here ```

Relevant log output

No response

tjungblu commented 6 months ago

This is problematic when you run etcdctl with cron and you want to receive a mail from the cronjob only when an error occured.

I agree with you on the stderr aspect. If you're interested in some workaround to make the admin live easier, etcdutl snapshot status <file of previously taken snapshot> adheres to that better.

Just picking some random invalid file:

$ etcdutl snapshot status 5ac.snap >/dev/null Error: invalid database $ echo $? 130

It will also not output any info logs on success:

$ etcdutl snapshot status ddump/member/snap/db >/dev/null $ echo $? 0

That's what we do in our backup scripts in OpenShift at least, take a snapshot, then gate the success on the status output.

siyuanfoundation commented 4 months ago

cc @moficodes

ivanvc commented 4 months ago

17510 solves the redirection issue for this specific command. However, after reviewing that PR, IMO solving this on a per-command basis doesn't sound clever. Do we want to redirect <= info level logs to stdout and > info level logs to stderr for all of etcdctl commands?

ivanvc commented 4 months ago

Now, #17510 is aiming to address the issue for all of the etcdctl commands. I suggest reaching a conclusion on what we want to do here before we move forward with the PR.