Closed vincebrannon closed 1 year ago
Hi @vincebrannon, is there a JIRA ticket assigned to this issue as well? Also, could you elaborate on the specifics of the user's use case leading up to the probe failure, such as what workflows they are running and how many logs are being emitted per second? This would help us give better recommendations.
Cannot find the references that led to this issue being logged anymore. It was from an old tracking system that has since been deleted, and was logged by another person. Closing.
Let me add the Jira reference here, although this is already closed: SURE-4294. It would be great to get the documentation improved, as the default values seem to have quite an impact (OOM, out of disk, etc.).
@jtravee Reopened this issue; found the original case 00302119: https://suse.lightning.force.com/lightning/r/Case/5001i00000f2YL3AAM/view Also, per @ravarga, SURE-4294 relates to this as well.
This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 90 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 30 days. Thank you for your contributions.
@vincebrannon I know this is an old issue, but I recently joined the team and saw the Jira in our backlog.
Since I can't view the linked Salesforce page, could you add the information here in a comment?
Thanks!
Page this concerns: https://ranchermanager.docs.rancher.com/pages-for-subheaders/logging
@martyav Well, the information is spread over many entries in the ticket and will need to be curated; if you don't have access, that will be hard.
But here you go; this is for the recent Logging v2.
Documentation on Logging v2 in Rancher v2.5 can be found at https://rancher.com/docs/rancher/v2.x/en/logging/v2.5/ and details on migrating from the Logging in v2.4 and earlier can be found at https://rancher.com/docs/rancher/v2.x/en/logging/v2.5/migrating/
By default, a buffer is not configured on outputs in Logging v2, which uses the Banzai Cloud Logging Operator. However, users can configure a buffer plugin, as well as all available buffer options, per the documentation at https://banzaicloud.com/docs/one-eye/logging-operator/configuration/plugins/outputs/buffer/. The fluentd documentation on buffering can be found at https://docs.fluentd.org/buffer
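For illustration, a buffer section on an output might look like the sketch below. The parameter values are examples only, not recommendations; `total_limit_size`, `chunk_limit_size`, `flush_interval`, and the retry settings are standard fluentd buffer parameters exposed by the Logging Operator buffer plugin:

```yaml
# Illustrative buffer section for a Logging Operator output (values are examples).
buffer:
  type: file              # persist chunks to disk instead of holding them in memory
  total_limit_size: 2GB   # hard cap on the total buffered data
  chunk_limit_size: 8MB   # chunks are flushed once they reach this size
  flush_interval: 30s     # flush at least this often
  retry_forever: false
  retry_max_times: 10     # give up on a chunk after this many retries
```

A file buffer trades memory pressure for disk usage, so `total_limit_size` should be sized against the volume backing `/buffers`.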
Having disabled the old Logging configuration and enabled the new Logging in v2.5, the ClusterOutput and ClusterFlow resources below show an example Elasticsearch logging configuration with a maximum 2GB simple file buffer (file buffers in Logging v2 are written to an emptyDir volume mounted in the fluentd container at /buffers).
The Elasticsearch ClusterOutput would require some alteration to align it with your own Elasticsearch configuration. Please note also that, due to minor format differences in output compared with the earlier logging, you will need to use a new index or wait until the index date rollover for logs to forward to Elasticsearch successfully with the new configuration.
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "cluster-es"
  namespace: "cattle-logging-system"
spec:
  elasticsearch:
    host: 10.131.132.63
    port: 9200
    include_tag_key: true
    reload_connections: false
    reconnect_on_error: true
    reload_on_failure: true
    scheme: http
    logstash_prefix: custom
    logstash_format: true
    logstash_dateformat: "%Y-%m-%d"
    type_name: container_log
    buffer:
      type: file
      total_limit_size: 2GB
```
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: "all-logs"
  namespace: "cattle-logging-system"
spec:
  globalOutputRefs:
```
Request Summary: Improve documentation regarding buffer size and its effect on memory usage.
Details: The current fluentd documentation lacks sufficient information on memory usage effects and on buffer configuration. Liveness probes fail because memory grows too quickly and the pod is overloaded when the buffer is too large.
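To make the memory-growth failure mode concrete: a memory-type buffer with no limits can grow until the fluentd container exceeds its memory limit and its liveness probe fails. A hedged sketch of bounding it (values are illustrative and must be tuned against the pod's actual memory request/limit; `overflow_action` is a standard fluentd buffer parameter):

```yaml
# Illustrative sketch: bounding a memory buffer so fluentd stays within pod limits.
buffer:
  type: memory
  total_limit_size: 256MB   # keep well below the fluentd container memory limit
  chunk_limit_size: 4MB
  overflow_action: block    # apply backpressure instead of growing unbounded
```

Switching to `type: file` instead moves the pressure to disk, which is why the documentation should cover both the memory impact of buffer sizing and the disk impact of file buffers.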