Closed chanzxxx closed 2 months ago
@iam-veeramalla There are some frustrating issues in this project.
The internal grpc interceptor seems to have to rely on the logrus implementation. Many functions utilize logrus' global logger for log output capture (which is almost impossible to dynamically modify the output of the global logger in zap). Some test cases rely on logrus' log detection and test modules.
I can't quite understand why some source code needs to be designed this way. These modules may need to be rewritten and replaced by dependent third-party library implementations. https://github.com/argoproj/argo-cd/blob/309654cece2292fb09959055e8602b9b6dc68077/util/grpc/logging.go#L73-L93
The internal grpc interceptor seems to have to rely on the logrus implementation. Many functions utilize logrus' global logger for log output capture (which is almost impossible to dynamically modify the output of the global logger in zap). Some test cases rely on logrus' log detection and test modules.
@edoger We are aware and discussed this in the contributors when @iam-veeramalla presented this issue. This will require a huge refactor in Argo CD. We will need help coordinating all this work. I believe that it is better to map every package that needs to be migrated and provide smaller PRs. This way we will be able to gradually migrate to the new log approach and refactor everything.
@iam-veeramalla would you be willing to take this governance task to map and track all work that needs to be done in order to get this 100% implemented?
The internal grpc interceptor seems to have to rely on the logrus implementation. Many functions utilize logrus' global logger for log output capture (which is almost impossible to dynamically modify the output of the global logger in zap). Some test cases rely on logrus' log detection and test modules.
@edoger We are aware and discussed this in the contributors when @iam-veeramalla presented this issue. This will require a huge refactor in Argo CD. We will need help coordinating all this work. I believe that it is better to map every package that needs to be migrated and provide smaller PRs. This way we will be able to gradually migrate to the new log approach and refactor everything.
@iam-veeramalla would you be willing to take this governance task to map and track all work that needs to be done in order to get this 100% implemented?
I can do that but I will need to get on a call to understand the actual efforts and how we can get it done together 🙂.
How did it go with this issue ? Did you get started on it? I currently get 450k log lines an hour on stderr which is a bit problematic.
Any update on this? We are looking at hundreds of thousands of log entries per hour in GCP. This costs us and our customers more and more resources and money.
Any update on this,please. We are in the same boat.
Did anybody found a workaround ? We can exclude them from cloud logging, but this is quite annoying
Quick note that the proposed go-native structured, leveled logging library is available as an experimental package: https://pkg.go.dev/golang.org/x/exp/slog
The change of complete refactoring is very huge, (╯﹏╰)b, I suggest forking logrus and adding LevelOutput function (I once submitted a function patch, but the author seems not very interested in new functions), and then in go mod replace the package.
The change of complete refactoring is very huge, (╯﹏╰)b, I suggest forking logrus and adding LevelOutput function (I once submitted a function patch, but the author seems not very interested in new functions), and then in go mod replace the package.
Apparently not...
Is there any chance this logging issue could be resolved?
@edoger I'm not completely opposed to the idea of a fork. Fully replacing the logger is a huge undertaking, and I don't think anyone is currently doing it. If logrus isn't accepting new features, then maintaining a fork should be relatively low-effort.
Unless someone wants to take on the full-replacement task. :-)
Just wanted to point out that in Argo CD Image Updater, we use vanilla logrus with being able to log anything with a level less severe than error to stdout, and error and fatal severities to stderr: https://github.com/argoproj-labs/argocd-image-updater/blob/master/pkg/log/log.go
So, it's possible, even if there is some slight overhead through a wrapper.
Seems like a reasonable opt-in approach, for folks who aren't worried about a little CPU overhead. Maybe even fine as a default, idk how expensive the wrapper is.
I haven't done any performance tests, but the major part of the overhead would be calling two functions instead of one. There's not much compute involved in that, I guess.
If it would be acceptable to just log everything (including levels error and fatal) to stdout, it's just a matter of calling logrus' SetOutput()
function before starting to log.
If it would be acceptable to just log everything (including levels error and fatal) to stdout, it's just a matter of calling logrus'
SetOutput()
function before starting to log.
I think this is OK for JSON logging. Theoretically , if you're doing structured logging, you can just filter by the value of the level
key.
Should definitely be fine as an opt-in.
We still have some external dependencies doing their own logging, such as argoproj/pkg
and potentially others.
The go standard library package log/slog
has been released as part of go 1.21. Is this the way to go to fix this issue ?
If anyone is interested in a workaround in the meantime (thanks to @smaftoul 🎉 ), we configured argocd to log as json with the following config :
- applicationsetcontroller.log.format=json
- controller.log.format=json
- notificationscontroller.log.format=json
- reposerver.log.format=json
- server.log.format=json
That way logs are correctly parsed by GCP even if going to stderr.
Getting back here to point out that logs should indeed keep going to stderr
. There are many log libraries that do that by default (zap, logrus, etc) for a reason. The reason is because stdout
must only contain the output from the executed command. For example, if we run a command like argocd app list
we should only get the list of applications and not getting it mixed with logs. Doing so, would break scripts that depend on Argo CD CLI with no easy path to fix.
Even if it is a bit counter intuitive, diagnostics output like logs do belong to stderr
.
See: https://www.gnu.org/software/libc/manual/html_node/Standard-Streams.html
Summary
Normal logs should not go to stderr.
Motivation
Some people including me detect critical issues on k8s by watching pods' stderr logs. But argocd applications are printing their all logs to stderr, and I think this is confused.
Proposal
Basically we should respect the purpose of each stream. Errors go stderr, and normal activity logs go stdout.