Closed lukas-vlcek closed 3 years ago
/retest
/test cluster-logging-operator-e2e
@lukas-vlcek do we even need to keep 16? Can we make this even lower? What do we gain from keeping so many historical logs vs just the most recent 8 or even 4?
/test elastic-operator-e2e
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jcantrill, lukas-vlcek
Description
As of writing, we use the -Xlog:gc option to tell the JVM to produce and manage logs containing detailed information about JVM garbage collections. See https://github.com/openshift/origin-aggregated-logging/blob/c4a31147e4af9883d2ad749f4060a3c70e641816/elasticsearch/run.sh#L95
The question is whether we really need these files, and whether it is acceptable for them to take space from the PVC, which is primarily used for index data.
First, the gc.log files (and their rotations) are not managed by log4j2 but directly by the JVM. See the JVM Unified Logging Framework: https://openjdk.java.net/jeps/158
Second, according to some resources, GC logging (-Xlog) has a very low performance impact. See https://dzone.com/articles/enabling-and-analysing-the-garbage-collection-log
Third, in our case the total space taken by gc.log files grows to at most about 2GB, after which rotation replaces the oldest files. This is not a large amount of data, given that some other log files (these managed by log4j2) can grow larger than that (for example the ES deprecation log).
If anything is questionable, it is not the current size of the gc.log files but rather their usefulness. When the Elasticsearch JVM runs heavy and expensive GC cycles, this is logged in the ES log files anyway (see https://discuss.elastic.co/t/change-gc-log-thresholds/155132 for a relevant discussion and how it can be tuned). I assume that the JVM-managed GC logs are mostly useful to Elasticsearch developers, because they can help uncover specific memory leaks or troubleshoot other memory issues. But IMO they add little value when supporting specific customer cases (again, if heavy GCs are running we will still see them in the regular es.log files).
I am considering either turning off the -Xlog:gc config entirely (for the reasons explained above) or halving the number of gc.log files, from 32 down to 16, which saves about 1GB of disk space. Right now I am inclined to do the latter (keep only the 16 most recent GC logs).
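For reference, here is a rough sketch of the arithmetic behind the numbers above. It assumes a per-file size of 64 MiB, which is inferred from "32 files, about 2GB total" and is not quoted from run.sh. The helper names, the relative gc.log path, and the decorator list are illustrative only; the filecount and filesize output options are those described in JEP 158. The estimate ignores the file currently being written.

```python
def gc_log_cap_mib(filecount: int, filesize_mib: int) -> int:
    """Approximate upper bound on disk used by rotated gc.log files, in MiB."""
    return filecount * filesize_mib


def xlog_option(path: str, filecount: int, filesize_mib: int) -> str:
    """Build a -Xlog:gc option string (hypothetical helper, not part of run.sh).

    The decorators (utctime,pid,tags) are an assumption made for illustration.
    """
    return (
        f"-Xlog:gc:file={path}:utctime,pid,tags:"
        f"filecount={filecount},filesize={filesize_mib}m"
    )


ASSUMED_FILESIZE_MIB = 64  # assumption: 2 GiB / 32 files

current = gc_log_cap_mib(32, ASSUMED_FILESIZE_MIB)   # 2048 MiB, about 2GB
proposed = gc_log_cap_mib(16, ASSUMED_FILESIZE_MIB)  # 1024 MiB, about 1GB
saved = current - proposed                           # 1024 MiB, about 1GB

print(f"current cap:  {current} MiB")
print(f"proposed cap: {proposed} MiB (saves {saved} MiB)")
print(xlog_option("gc.log", 16, ASSUMED_FILESIZE_MIB))
```

If the real per-file size differs, only ASSUMED_FILESIZE_MIB needs to change. Halving the file count still halves the cap.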
/cc @jcantrill /assign @ewolinetz
Note: I do not think we actually need to consider back-porting this to earlier releases.