vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Vector not releasing memory #10372

Open sam-mcbr opened 2 years ago

sam-mcbr commented 2 years ago

Vector Version

0.18.0

Vector Configuration File

vector:
          role: Stateless-Aggregator
          fullnameOverride: {{ .Values.vectorApplicationName}}
          image:
            repository: <repo>
            tag: 0.18.1-debian
          replicas: {{ int .Values.vectorNumReplicas }}
          resources: {{- toYaml .Values.vectorResources | nindent 12 }}
          serviceAccount:
            create: true
            annotations:
# ...
            automountToken: true
          securityContext:
            runAsUser: 1000
            runAsGroup: 3000
          tolerations: []
          podMonitor:
            enabled: true
            port: prometheus
            additionalLabels:
# ...
            metricRelabelings:
              {{- toYaml .Values.vectorMetricRelabelings | nindent 14 }}
          autoscaling:
            enabled: true
            minReplicas: {{ int .Values.vectorNumReplicas }}
            maxReplicas: 15
            targetCPUUtilizationPercentage: 60
            targetMemoryUtilizationPercentage: 70
          customConfig:
            data_dir: /vector-data-dir
            api:
              enabled: true
              address: 0.0.0.0:8686
              playground: false
            sources:
              internal_metrics:
                type: internal_metrics
              {{- toYaml .Values.vectorSources | nindent 14 }}
            transforms:
# ...
            sinks:
              prometheus:
                type: prometheus
                inputs: [internal_metrics]
                address: 0.0.0.0:9090
              loki:
                type: loki
                inputs: [<from_our_transform>]
                endpoint: <endpoint>
                encoding:
                  codec: json
                batch:
                  max_bytes: 400000
                out_of_order_action: rewrite_timestamp
                labels:
# ...
                remove_label_fields: true

Expected Behavior

We have a metric that tracks events in minus events out. A large increase in events in led to an increase in our events-in-memory metric:

This caused an increase in the memory used:

We expected the memory footprint to decrease after the events had finished processing.
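
For readers unfamiliar with the "events in minus events out" metric described above, here is a minimal sketch of how such a value could be derived from two cumulative counters. The function name and the sample numbers are hypothetical and only illustrate the arithmetic; they are not taken from our deployment or from Vector's metric names.

```python
from typing import List, Sequence


def events_in_flight(events_in: Sequence[int], events_out: Sequence[int]) -> List[int]:
    """Return (events in - events out) for each paired sample of two cumulative counters."""
    if len(events_in) != len(events_out):
        raise ValueError("counter series must have the same number of samples")
    return [received - sent for received, sent in zip(events_in, events_out)]


# Hypothetical scrape samples: a burst arrives at the third scrape and drains afterwards.
events_in = [100, 200, 900, 1000, 1100]
events_out = [100, 200, 400, 800, 1100]

backlog = events_in_flight(events_in, events_out)
print(backlog)  # [0, 0, 500, 200, 0]
```

In this sketch the backlog returns to zero once the burst has drained, which is the point at which we expected the memory footprint to drop as well.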

Actual Behavior

As seen in the graph above, the memory footprint remained larger than expected until we restarted the deployment.

Additional Context

Discord discussion.

jszwedko commented 1 year ago

@sam-mcbr in 0.19.0 we switched the allocator to jemalloc, which could impact the behavior here. Would you mind retrying with 0.19.0 or higher? Apologies for the long delay in response here.