splunk / splunk-connect-for-kubernetes

Helm charts associated with kubernetes plug-ins
Apache License 2.0

CPU and memory Usage - Requirements and Guidelines #481

Closed wrahmann closed 2 years ago

wrahmann commented 4 years ago

Hi,

For Splunk Connect for Kubernetes, the default values for CPU and memory allocation are as follows:

resources:
  limits:
    #  cpu: 100m
    #  memory: 200Mi
  requests:
    cpu: 100m
    memory: 200Mi

However, the documentation does not give any guidance on when these values should be changed. What logging throughput can be supported with these defaults?

Also, please let me know how the buffer settings are impacted if we change the CPU and memory:

buffer: "@type": memory total_limit_size: 600m chunk_limit_size: 20m chunk_limit_records: 100000 flush_interval: 5s flush_thread_count: 1 overflow_action: block retry_max_times: 5 retry_type: periodic

Wajih
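
The two knobs discussed above (the pod `resources` and the fluentd `buffer` section) sit side by side in the logging chart's values.yaml. A rough sizing sketch, not an official guideline: if you raise the memory request/limit, keep the in-memory buffer's `total_limit_size` comfortably below the container memory limit, since a `@type memory` buffer lives inside the fluentd process. The numbers below are assumptions for illustration, nested under `splunk-kubernetes-logging` as in an umbrella-chart values file:

```yaml
# Illustrative sizing only -- the numbers are assumptions, not tested guidance.
splunk-kubernetes-logging:
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi            # container memory ceiling
  buffer:
    "@type": memory
    total_limit_size: 600m   # keep well under the 1Gi memory limit above
    chunk_limit_size: 20m
    flush_interval: 5s
    flush_thread_count: 1
    overflow_action: block
```

With `overflow_action: block`, a buffer that fills up back-pressures the tail input instead of dropping events, so an undersized buffer tends to show up as delayed logs rather than OOM kills.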

dengalebr commented 4 years ago

We are also seeing high CPU usage in our environment, but it never goes beyond 1 core. I have read that this is fluentd, which cannot use more than 1 CPU. Our consumption is always around 1 CPU during peaks with the settings below, even though no limits are configured and the pod could consume as much as it needs.

resources:
  requests:
    cpu: 1
    memory: 1

We do see issues when that full CPU is consumed: events are delayed for hours. We wanted to check whether we can provide more resources that fluentd can actually use to speed up processing. Is there any way to configure multiple worker processes? (I have read that the tail input does not support this, but is there any sample SCK configuration that can consume more than one CPU?)
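
Not a confirmed fix, but the values.yaml knobs that touch this are the CPU request/limit and the buffer's `flush_thread_count`. A rough sketch, under the assumption that fluentd's tail input and record parsing stay single-threaded, so extra flush threads mainly help push already-queued chunks to HEC faster:

```yaml
# Hypothetical tuning sketch, not tested guidance. Parsing still runs in one
# Ruby process, so CPU much beyond ~1 core mostly benefits the flush threads.
splunk-kubernetes-logging:
  resources:
    requests:
      cpu: 1
      memory: 1Gi
    limits:
      cpu: 2
      memory: 2Gi
  buffer:
    "@type": memory
    flush_interval: 5s
    flush_thread_count: 4   # default shown earlier in this thread is 1
```

If the bottleneck really is parsing rather than flushing, more threads will not help; the usual lever in this chart is to reduce what each DaemonSet pod has to tail in the first place (e.g. via `fluentd.exclude_path`).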

TonyBrookILSTU commented 2 years ago

I'm wondering the same thing. We're firing up SCK in our test environment for the first time, and its resource usage immediately jumped higher than that of any other project we have on the cluster. We're using a combination of the default values.yaml and the OCP4 example here: https://github.com/splunk/splunk-connect-for-kubernetes/blob/develop/helm-chart/splunk-connect-for-kubernetes/examples/openshift4-logging-only.yaml


TonyBrookILSTU commented 2 years ago

Here is our values.yaml in its messy glory.

VERY raw values.yaml file inside...

````yaml
# Splunk Connect for Kubernetes is a umbraller chart for three charts
# * splunk-kubernetes-logging
# * splunk-kubernetes-objects
# * splunk-kubernetes-metrics

# Use global configurations for shared configurations between sub-charts.
# Supported global configurations:
# Values defined here are the default values.
global:
  logLevel: info
  splunk:
    hec:
      # host is required and should be provided by user
      host:
      # port to HEC, optional, default 8088
      port:
      # token is required and should be provided by user
      token:
      # protocol has two options: "http" and "https", default is "https"
      # For self signed certficate leave this field blank
      protocol:
      # indexName tells which index to use, this is optional. If it's not present, will use "main".
      indexName: test_ocp_logs
      # insecureSSL is a boolean, it indicates should it allow insecure SSL connection (when protocol is "https"). Default is false.
      # For a self signed certficate this value should be true
      insecureSSL: true
      # The PEM-format CA certificate for this client.
      # NOTE: The content of the certificate itself should be used here, not the file path.
      # The certificate will be stored as a secret in kubernetes.
      clientCert:
      # The private key for this client.
      # NOTE: The content of the key itself should be used here, not the file path.
      # The key will be stored as a secret in kubernetes.
      clientKey:
      # The PEM-format CA certificate file.
      # NOTE: The content of the file itself should be used here, not the file path.
      # The file will be stored as a secret in kubernetes.
      # For self signed certficate leave this field blank
      caFile:
      # For object and metrics
      indexRouting:
  kubernetes:
    # The cluster name used to tag logs. Default is cluster_name
    clusterName:
  prometheus_enabled:
  monitoring_agent_enabled:
  monitoring_agent_index_name:
  # deploy a ServiceMonitor object for usage of the PrometheusOperator
  serviceMonitor:
    enabled: false
    metricsPort: 24231
    interval: ""
    scrapeTimeout: "10s"
    additionalLabels: { }

## Enabling splunk-kubernetes-logging will install the `splunk-kubernetes-logging` chart to a kubernetes
## cluster to collect logs generated in the cluster to a Splunk indexer/indexer cluster.
splunk-kubernetes-logging:
  enabled: true
  # logLevel is to set log level of the Splunk log collector. Avaiable values are:
  # * trace
  # * debug
  # * info (default)
  # * warn
  # * error
  logLevel:

  # This is can be used to exclude verbose logs including various system and Helm/Tiller related logs.
  fluentd:
    # path of logfiles, default /var/log/containers/*.log
    path: /var/log/containers/*.log
    # paths of logfiles to exclude. object type is array as per fluentd specification:
    # https://docs.fluentd.org/input/tail#exclude_path
    exclude_path:
    #  - /var/log/containers/kube-svc-redirect*.log
    #  - /var/log/containers/tiller*.log
    #  - /var/log/containers/*_kube-system_*.log (to exclude `kube-system` namespace)

  # Configurations for container logs
  containers:
    # Path to root directory of container logs
    path: /var/log
    # Final volume destination of container log symlinks
    pathDest: /var/lib/docker/containers
    # Log format type, "json" or "cri"
    logFormatType: json
    # Specify the logFormat for "cri" logFormatType - provide time format
    # For example "%Y-%m-%dT%H:%M:%S.%N%:z" for openshift, "%Y-%m-%dT%H:%M:%S.%NZ" for IBM IKS
    # Default for "cri": "%Y-%m-%dT%H:%M:%S.%N%:z"
    # For "json", the log format cannot be changed: "%Y-%m-%dT%H:%M:%S.%NZ"
    logFormat:
    # Specify the interval of refreshing the list of watch file.
    refreshInterval:

  # Enriches log record with kubernetes data
  k8sMetadata:
    # Pod labels to collect
    podLabels:
      - app
      - k8s-app
      - release
    watch: true
    cache_ttl: 3600

  sourcetypePrefix: "kube"

  rbac:
    # Specifies whether RBAC resources should be created.
    # This should be set to `false` if either:
    # a) RBAC is not enabled in the cluster, or
    # b) you want to create RBAC resources by yourself.
    create: true
    # If you are on OpenShift and you want to run the a privileged pod
    # you need to have a ClusterRoleBinding for the system:openshift:scc:privileged
    # ClusterRole. Set to `true` to create the ClusterRoleBinding resource
    # for the ServiceAccount.
    openshiftPrivilegedSccBinding: true

  serviceAccount:
    # Specifies whether a ServiceAccount should be created
    create: true
    # The name of the ServiceAccount to use.
    # If not set and create is true, a name is generated using the fullname template
    name: splunkconnect

  podSecurityPolicy:
    # Specifies whether Pod Security Policy resources should be created.
    # This should be set to `false` if either:
    # a) Pod Security Policies is not enabled in the cluster, or
    # b) you want to create Pod Security Policy resources by yourself.
    create: false
    # Specifies whether AppArmor profile should be applied.
    # if set to true, this will add two annotations to PodSecurityPolicy:
    # apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
    # set to false if AppArmor is not available
    apparmor_security: true
    # apiGroup can be set to "extensions" for Kubernetes < 1.10.
    # apiGroup: policy

  # Local splunk configurations
  splunk:
    # Configurations for HEC (HTTP Event Collector)
    hec:
      # host is required and should be provided by user
      host:
      # port to HEC, optional, default 8088
      port:
      # token is required and should be provided by user
      token:
      # protocol has two options: "http" and "https", default is "https"
      # For self signed certficate leave this field blank
      protocol:
      # indexName tells which index to use, this is optional. If it's not present, will use "main".
      indexName:
      # insecureSSL is a boolean, it indicates should it allow insecure SSL connection (when protocol is "https"). Default is false.
      # For a self signed certficate this value should be true
      insecureSSL:
      # The PEM-format CA certificate for this client.
      # NOTE: The content of the certificate itself should be used here, not the file path.
      # The certificate will be stored as a secret in kubernetes.
      clientCert:
      # The private key for this client.
      # NOTE: The content of the key itself should be used here, not the file path.
      # The key will be stored as a secret in kubernetes.
      clientKey:
      # The PEM-format CA certificate file.
      # NOTE: The content of the file itself should be used here, not the file path.
      # The file will be stored as a secret in kubernetes.
      caFile:
    # Configurations for Ingest API
    # ingest_api:
    #   # serviceClientIdentifier is a string, the client identifier is used to make requests to the ingest API with authorization.
    #   serviceClientIdentifier:
    #   # serviceClientSecretKey is a string, the client identifier is used to make requests to the ingest API with authorization.
    #   serviceClientSecretKey:
    #   # tokenEndpoint is a string, it indicates which endpoint should be used to get the authorization token used to make requests to the ingest API.
    #   tokenEndpoint:
    #   # ingestAuthHost is a string, it indicates which url/hostname should be used to make token auth requests to the ingest API.
    #   ingestAuthHost:
    #   # ingestAPIHost is a string, it indicates which url/hostname should be used to make requests to the ingest API.
    #   ingestAPIHost:
    #   # tenant is a string, it indicates which tenant should be used to make requests to the ingest API.
    #   tenant:
    #   # eventsEndpoint is a string, it indicates which endpoint should be used to make requests to the ingest API.
    #   eventsEndpoint:
    #   # debugIngestAPI is a boolean, it indicates whether user wants to debug requests and responses to ingest API. Default is false.
    #   debugIngestAPI:

  # Create or use existing secret if name is empty default name is used
  # secret:
  #   create: true
  #   name:

  # Directory where to read journald logs.
  journalLogPath: /run/log/journal

  # Set to true, to change the encoding of all strings to utf-8.
  #
  # By default fluentd uses ASCII-8BIT encoding. If you have 2-byte chars in your logs
  # you need to set the encoding to UTF-8 instead.
  # charEncodingUtf8: false

  # `logs` defines the source of logs, multiline support, and their sourcetypes.
  #
  # The scheme to define a log is:
  #
  # ```
  # :
  #   from:
  #
  #   timestampExtraction:
  #     regexp: ""
  #     format: ""
  #   multiline:
  #     firstline: ""
  #     flushInterval 5
  #   sourcetype: ""
  # ```
  #
  # = =
  # It supports 3 kinds of sources: journald, file, and container.
  # For `journald` logs, `unit` is required for filtering using _SYSTEMD_UNIT, example:
  # ```
  # docker:
  #   from:
  #     journald:
  #       unit: docker.service
  # ```
  #
  # For `file` logs, `path` is required for specifying where is the log files. Log files are expected in `/var/log`, example:
  # ```
  # docker:
  #   from:
  #     file:
  #       path: /var/log/docker.log
  # ```
  #
  # For `container` logs, pod name is required. You can also provide the container name, if it's not provided, the name of this source will be used as the container name:
  # ```
  # kube-apiserver:
  #   from:
  #     pod: kube-apiserver
  #
  # etcd:
  #   from:
  #     pod: etcd-server
  #     container: etcd-container
  # ```
  #
  # = timestamp =
  # `timestampExtraction` defines how to extract timestamp from logs. This *only* works for `file` source.
  # To use `timestampExtraction` you need to define both:
  # - `regexp`: the Regular Expression used to find the timestamp from a log entry.
  #   The timestamp part must be in a `time` named group. E.g.
  #   (?
````
TonyBrookILSTU commented 2 years ago

Ah, we found a possible fix. It appears that using the JSON log format with OpenShift is a bad idea(tm) and causes all sorts of noise. When we switched the logging format to CRI, it started working as we expected.
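
For anyone hitting the same thing, a minimal sketch of that change, based on the `containers` section of the values.yaml posted above (the OpenShift time format is the one suggested in the chart's own comments):

```yaml
# Switch container log parsing from docker JSON to CRI format; assumes
# OpenShift 4, where CRI-O writes container logs in CRI format rather than JSON.
splunk-kubernetes-logging:
  containers:
    logFormatType: cri
    # Time format the chart's comments recommend for OpenShift
    logFormat: "%Y-%m-%dT%H:%M:%S.%N%:z"
```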

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 years ago

This issue was closed because it has been inactive for 14 days since being marked as stale.