SumoLogic / sumologic-kubernetes-collection

Sumo Logic collection solution for Kubernetes
Apache License 2.0

How to increase max threads for collector via Helm #3686

Closed mskhor closed 4 months ago

mskhor commented 5 months ago

Helm Version: "3.9.0"

I'm facing a delay in ingestion and am looking to increase the collector's thread count, as described here: https://help.sumologic.com/docs/send-data/collector-faq/#increase-max-threads-for-collector

Is this supported via Helm values?

swiatekm commented 4 months ago

That part of the documentation describes Sumo's own collector, which isn't used in this Chart. The Chart uses the OpenTelemetry Collector instead, and while there is a way to increase the sender thread count, in a Kubernetes environment it's better to scale horizontally. Have you considered enabling autoscaling in the Chart's values.yaml instead?

For reference, the Helm Chart documentation can be found here: https://help.sumologic.com/docs/send-data/kubernetes/.
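
As a rough illustration of the autoscaling suggestion above, a values.yaml fragment might look like the sketch below. The `metadata.logs.autoscaling` key path and all numbers are assumptions based on the v3 chart layout and are not confirmed in this thread, so check them against the values.yaml of your chart version. As far as I understand, this scales the otelcol-logs metadata Pods, not the log-collector DaemonSet.

```yaml
# Sketch only: key names are assumed from the v3 chart layout; verify before use.
metadata:
  logs:
    autoscaling:
      enabled: true                        # let an HPA manage the otelcol-logs replicas
      minReplicas: 3                       # illustrative value
      maxReplicas: 10                      # illustrative value
      targetCPUUtilizationPercentage: 50   # illustrative value
```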

mskhor commented 4 months ago

@swiatekm-sumo The collectors are running as a DaemonSet, so I believe autoscaling is not applicable: https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-collector/values.yaml#L498

swiatekm commented 4 months ago

To answer the question directly, for otel exporters which use the sending queue, you can set num_consumers: https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/exporterhelper.

I'm not sure this is really what you want, though. What exactly is the problem you're facing? Does it involve the log-collector DaemonSet this Chart creates?
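
For readers looking for the shape of that setting, here is a minimal sketch of an OpenTelemetry Collector exporter with a tuned sending queue. The exporter name `otlphttp`, the placeholder endpoint, and the numbers are assumptions for illustration; `sending_queue` and `num_consumers` are the exporterhelper options referred to above. How this fragment is merged into the Chart's generated collector config depends on the chart version and is not shown here.

```yaml
# Sketch only: exporter name, endpoint, and values are illustrative assumptions.
exporters:
  otlphttp:
    endpoint: https://collector.example.invalid   # placeholder, not a real endpoint
    sending_queue:
      enabled: true
      num_consumers: 20   # number of consumers draining the queue in parallel
      queue_size: 1000    # maximum number of batches held in the queue
```

Raising `num_consumers` only helps if the exporter is the bottleneck; it will not fix delays that originate elsewhere in the pipeline.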

mskhor commented 4 months ago

@swiatekm-sumo The issue I'm facing is a delay in ingestion, and the link I pasted initially describes the same behaviour. It involves otellogs, which is deployed via the Helm chart: https://artifacthub.io/packages/helm/sumologic/sumologic/3.9.0?modal=values&path=otellogs

swiatekm commented 4 months ago

Are you confident the delay is the log collectors' fault? Indications of that would be CPU throttling, and the solution would be to increase the CPU request and limit here: https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/5d08ba7ac78186329a381ac46643a14e0bbcdfcb/deploy/helm/sumologic/values.yaml#L2058.

What does the resource usage of Pods created by the Chart look like in your cluster?
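
If throttling did turn out to be the cause, the change would be a values.yaml override along the lines of the sketch below. The `otellogs.daemonset.resources` key path is an assumption about what the linked values.yaml line refers to, and the numbers are illustrative only; confirm both against your chart version.

```yaml
# Sketch only: key path assumed from the v3 chart; values are illustrative.
otellogs:
  daemonset:
    resources:
      requests:
        cpu: 200m       # raise if the log collectors are being CPU throttled
        memory: 64Mi
      limits:
        cpu: 2000m
        memory: 1Gi
```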

mskhor commented 4 months ago

Resource usage is within limits:

NAME                                                    CPU(cores)   MEMORY(bytes)
sumologic-sumologic-otelcol-events-0                    1m           41Mi
sumologic-sumologic-otelcol-instrumentation-0           5m           211Mi
sumologic-sumologic-otelcol-instrumentation-1           1m           204Mi
sumologic-sumologic-otelcol-instrumentation-2           1m           247Mi
sumologic-sumologic-otelcol-logs-0                      3m           194Mi
sumologic-sumologic-otelcol-logs-1                      2m           194Mi
sumologic-sumologic-otelcol-logs-2                      12m          260Mi
sumologic-sumologic-otelcol-logs-collector-2nt9g        17m          49Mi
sumologic-sumologic-otelcol-logs-collector-dlmd2        37m          53Mi
sumologic-sumologic-otelcol-logs-collector-jq9p9        67m          56Mi
sumologic-sumologic-otelcol-logs-collector-l9wrg        25m          49Mi
sumologic-sumologic-otelcol-logs-collector-mzpnf        18m          47Mi
sumologic-sumologic-otelcol-logs-collector-qtz9c        47m          56Mi
sumologic-sumologic-otelcol-logs-collector-rgsfv        13m          48Mi
sumologic-sumologic-otelcol-logs-collector-rjznp        27m          49Mi
sumologic-sumologic-otelcol-logs-collector-tt5xg        68m          52Mi
sumologic-sumologic-otelcol-logs-collector-zw2mf        14m          46Mi
sumologic-sumologic-otelcol-metrics-0                   1m           99Mi
sumologic-sumologic-otelcol-metrics-1                   2m           97Mi
sumologic-sumologic-otelcol-metrics-2                   1m           97Mi
sumologic-sumologic-remote-write-proxy-cdf5cd75c-5gx5h  1m           1Mi
sumologic-sumologic-remote-write-proxy-cdf5cd75c-7fjt2  1m           2Mi
sumologic-sumologic-remote-write-proxy-cdf5cd75c-wm5dx  1m           2Mi
sumologic-sumologic-traces-gateway-79d5d6dd86-fcspp     1m           31Mi
sumologic-sumologic-traces-sampler-59bcdbc7c5-9xpfx     1m           44Mi
sumologic-tailing-sidecar-operator-746fd8c7d8-txxdd     2m           22Mi

swiatekm commented 4 months ago

It does not look like you're resource constrained, so the ingestion delay must be caused by something else. Could you open a support ticket with Sumo, provide all the necessary information, and link this issue?

mskhor commented 4 months ago

Thanks @swiatekm-sumo