rancher / rancher-docs

Rancher Documentation
https://ranchermanager.docs.rancher.com/
Apache License 2.0

[Rancher2] Request to improve the doc about fluentd buffer config and memory usage #45

Closed vincebrannon closed 1 year ago

vincebrannon commented 2 years ago

Request Summary: Improve documentation regarding the buffer size and its effect on memory usage.

Details: The current fluentd documentation lacks sufficient information about buffer configuration and its effect on memory usage. The liveness probes fail because memory grows too fast and the pod becomes overloaded when the buffer is too large.

jtravee commented 2 years ago

Hi @vincebrannon, is there a JIRA ticket assigned to this issue as well? Also, could you elaborate on the specifics of the user's use case leading up to the probe failure, such as what workflows they are running, how many logs are being emitted per second, etc.? This would help us give better recommendations.

vincebrannon commented 2 years ago

I can no longer find the references that led to this issue being logged. It came from an old tracking system that has since been deleted, and it was logged by another person. Closing.

ravarga commented 2 years ago

Let me add the Jira reference here, although this is already closed: SURE-4294. It would be great to get the documentation improved, as the default values seem to have quite an impact (OOM, out of disk...).

vincebrannon commented 2 years ago

@jtravee Reopened this issue. I found the original case, 00302119: https://suse.lightning.force.com/lightning/r/Case/5001i00000f2YL3AAM/view Also, as per @ravarga, SURE-4294 relates to this as well.

github-actions[bot] commented 1 year ago

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 90 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 30 days. Thank you for your contributions.

martyav commented 1 year ago

@vincebrannon I know this is an old issue, but I recently joined the team and saw the Jira in our backlog.

Since I can't view the linked Salesforce page, could you add the information here in a comment?

Thanks!

martyav commented 1 year ago

Page this concerns: https://ranchermanager.docs.rancher.com/pages-for-subheaders/logging

vincebrannon commented 1 year ago

@martyav Well, the information is spread over many entries in the ticket and will need to be curated. If you don't have access, that will be hard.

But here you go. This is for the recent Logging v2.

Documentation on Logging v2 in Rancher v2.5 can be found at https://rancher.com/docs/rancher/v2.x/en/logging/v2.5/ and details on migrating from the Logging in <=v2.4 to this can be found at https://rancher.com/docs/rancher/v2.x/en/logging/v2.5/migrating/

By default a buffer is not configured on outputs in Logging v2, which uses the Banzai Cloud Logging Operator. However, users are able to configure a buffer plugin, as well as all available buffer options, per the documentation at https://banzaicloud.com/docs/one-eye/logging-operator/configuration/plugins/outputs/buffer/ The fluentd documentation on buffering can be found at https://docs.fluentd.org/buffer

Having disabled the old Logging configuration and enabled the new Logging in v2.5, the below ClusterOutput and ClusterFlow resources show an example of an elasticsearch logging configuration, with a maximum 2GB simple file buffer (file buffers in Logging v2 are written to an empty-dir volume mounted in the fluentd container at /buffers).

The elasticsearch ClusterOutput would require some alteration to align this with your own Elasticsearch configuration. Please note also that due to minor format differences in output with the earlier logging, you will need to use a new index or wait until the index date rollover for logs to forward to Elasticsearch successfully with the new configuration.

apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "cluster-es"
  namespace: "cattle-logging-system"
spec:
  elasticsearch:
    host: 10.131.132.63
    port: 9200
    include_tag_key: true
    reload_connections: false
    reconnect_on_error: true
    reload_on_failure: true
    scheme: http
    logstash_prefix: custom
    logstash_format: true
    logstash_dateformat: "%Y-%m-%d"
    type_name: container_log
    buffer:
      type: file
      total_limit_size: 2GB
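Because this issue is about how buffer size affects memory and disk usage, it may help to see how the buffer section of the ClusterOutput above could be bounded more tightly. The following is only a sketch: `chunk_limit_size`, `flush_interval`, and `overflow_action` are standard fluentd buffer parameters exposed by the Logging Operator buffer plugin, but the values shown are illustrative assumptions and not recommendations taken from the original case. Tune them against your own log volume.

```yaml
# Hypothetical sketch: the buffer values below are illustrative assumptions,
# not settings from the original support case.
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: "cluster-es"
  namespace: "cattle-logging-system"
spec:
  elasticsearch:
    host: 10.131.132.63
    port: 9200
    scheme: http
    buffer:
      type: file                          # chunks are written to the /buffers volume, not held in memory
      total_limit_size: 2GB               # upper bound on the whole buffer, as in the example above
      chunk_limit_size: 8MB               # assumed value: maximum size of a single chunk
      flush_interval: 10s                 # assumed value: how often chunks are flushed to the output
      overflow_action: drop_oldest_chunk  # assumed choice: what to do once total_limit_size is reached
```

With a file buffer, `total_limit_size` mainly bounds disk usage on the empty-dir volume, so the volume and node disk need to accommodate it. The `overflow_action` setting decides whether fluentd raises an error, blocks, or drops data once that limit is hit.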


apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: "all-logs"
  namespace: "cattle-logging-system"
spec:
  globalOutputRefs: