SumoLogic / sumologic-kubernetes-collection

Sumo Logic collection solution for Kubernetes
Apache License 2.0

Send Kubernetes OS logs to Sumologic #3163

Closed · captainfalcon23 closed this 6 months ago

captainfalcon23 commented 1 year ago

I am trying to send Kubernetes OS logs (e.g. /var/log/messages) to Sumo Logic. I think I am almost there with my config, but I still can't seem to get logs into Sumo. Can anyone assist and/or link to the relevant documentation? I based what I did on https://help.sumologic.com/docs/send-data/opentelemetry-collector/data-source-configurations/collect-logs/

I can see the following log in the otelcol-logs-collector, so I assume the directory is mounted in correctly (I can't even exec into the container, as it seems to have no shell at all):

2023-07-25T08:41:05.275Z info fileconsumer/file.go:192 Started watching file {"kind": "receiver", "name": "filelog/custom_files", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/messages"}

I can't use/create an exporter with my own name, as it throws an error, and using "sumologic" results in a different error.

In my values-override.yml:

otellogs:
  daemonset:
    extraVolumes:
      - hostPath:
          path: /var/log
          type: ""
        name: varlog

    extraVolumeMounts:
      - name: varlog
        mountPath: /var/log
        readOnly: true
  config:
    merge:
      receivers:
        # https://help.sumologic.com/docs/send-data/opentelemetry-collector/data-source-configurations/collect-logs/#collecting-logs-from-local-files
        filelog/custom_files:
          include:
            - /var/log/messages
            - /var/log/audit/audit.log
          include_file_name: true
          include_file_path_resolved: true
          storage: file_storage

      processors:
        groupbyattrs/custom_files:
          keys:
            - log.file.path_resolved
        resource/custom_files:
          attributes:
            - key: _sourceCategory
              value: linux/system
              action: insert
        memory_limiter:
        # ref: https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/main/deploy/helm/sumologic/values.yaml
          check_interval: 5s
          limit_percentage: 10
          ## Maximum spike expected between the measurements of memory usage, in %.
          spike_limit_percentage: 5

        resourcedetection/system:
          detectors: [system]
          override: false
          timeout: 10s

      # exporters:
      #   sumologic:
      #     endpoint: ${SUMO_ENDPOINT_DEFAULT_LOGS_SOURCE}
      #     json_logs:
      #       add_timestamp: false
      #     log_format: json
      #     sending_queue:
      #       enabled: true
      #       num_consumers: 10
      #       queue_size: 10000
      #       storage: file_storage

      service:
        pipelines:
          logs/custom_files:
            receivers:
              - filelog/custom_files
            processors:
              # - memory_limiter
              - groupbyattrs/custom_files
              - resource/custom_files
              - resourcedetection/system
              - batch
            exporters:
              - otlphttp
sumo-drosiek commented 1 year ago

I can't use/create an exporter with my own name, as it throws some error, and using "sumologic" results in some other error.

What errors?

You may add the logging exporter to see whether logs are being read correctly from the node.
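
For example, a minimal sketch using the same merge mechanism as in your values file (pipeline and receiver names taken from your config above; the logging exporter just prints received records to the collector's own stdout):

otellogs:
  config:
    merge:
      exporters:
        # prints received log records to stdout for debugging
        logging: {}
      service:
        pipelines:
          logs/custom_files:
            receivers: [filelog/custom_files]
            exporters: [logging]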

captainfalcon23 commented 1 year ago

@sumo-drosiek So, following the guidance from https://help.sumologic.com/docs/send-data/opentelemetry-collector/data-source-configurations/collect-logs/#collecting-logs-from-local-files

This config:

otellogs:
  config:
    merge:
      service:
        pipelines:
          logs/custom_files:
            receivers:
              - filelog/custom_files
            processors:
              # - memory_limiter
              - groupbyattrs/custom_files
              - resource/custom_files
              - resourcedetection/system
              - batch
            exporters:
              - sumologic

Results in:

Defaulted container "otelcol" out of: otelcol, changeowner (init)
Error: invalid configuration: service::pipeline::logs/custom_files: references exporter "sumologic" which is not configured
2023/07/25 23:54:03 collector server run finished with error: invalid configuration: service::pipeline::logs/custom_files: references exporter "sumologic" which is not configured

This makes no sense to me, as it IS defined in the default config; otherwise nothing would work, right?

If I try defining "sumologic" like this (a copy/paste from the existing config):

otellogs:
  config:
    merge:
      exporters:
        sumologic:
          endpoint: ${SUMO_ENDPOINT_DEFAULT_LOGS_SOURCE}
          json_logs:
            add_timestamp: false
          log_format: json
          sending_queue:
            enabled: true
            num_consumers: 10
            queue_size: 10000
            storage: file_storage

      service:
        pipelines:
          logs/custom_files:
            receivers:
              - filelog/custom_files
            processors:
              # - memory_limiter
              - groupbyattrs/custom_files
              - resource/custom_files
              - resourcedetection/system
              - batch
            exporters:
              - sumologic

I get the below, and the pod crashes:

Defaulted container "otelcol" out of: otelcol, changeowner (init)
2023-07-25T23:57:51.799Z        info    service/telemetry.go:104        Setting up own telemetry...
2023-07-25T23:57:51.799Z        info    service/telemetry.go:127        Serving Prometheus metrics      {"address": ":8888", "level": "Basic"}
2023-07-25T23:57:51.799Z        info    processor/processor.go:289      Development component. May change in the future.        {"kind": "processor", "name": "logstransform/systemd", "pipeline": "logs/systemd"}
2023-07-25T23:57:51.800Z        info    sumologicexporter@v0.0.0-00010101000000-000000000000/exporter.go:92     Sumo Logic Exporter configured  {"kind": "exporter", "data_type": "logs", "name": "sumologic", "log_format": "json", "metric_format": "otlp", "trace_format": "otlp"}
2023-07-25T23:57:51.801Z        info    service/service.go:131  Starting otelcol-sumo...        {"Version": "v0.79.0-sumo-0", "NumCPU": 16}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:30     Starting extensions...
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:33     Extension is starting...        {"kind": "extension", "name": "health_check"}
2023-07-25T23:57:51.801Z        info    healthcheckextension@v0.79.0/healthcheckextension.go:34 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2023-07-25T23:57:51.801Z        warn    internal/warning.go:40  Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks        {"kind": "extension", "name": "health_check", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:37     Extension started.      {"kind": "extension", "name": "health_check"}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:33     Extension is starting...        {"kind": "extension", "name": "file_storage"}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:37     Extension started.      {"kind": "extension", "name": "file_storage"}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:33     Extension is starting...        {"kind": "extension", "name": "pprof"}
2023-07-25T23:57:51.801Z        info    pprofextension@v0.79.0/pprofextension.go:60     Starting net/http/pprof server  {"kind": "extension", "name": "pprof", "config": {"TCPAddr":{"Endpoint":"localhost:1777"},"BlockProfileFraction":0,"MutexProfileFraction":0,"SaveToFile":""}}
2023-07-25T23:57:51.801Z        info    extensions/extensions.go:37     Extension started.      {"kind": "extension", "name": "pprof"}
2023-07-25T23:57:51.802Z        info    service/service.go:157  Starting shutdown...
2023-07-25T23:57:51.802Z        info    healthcheck/handler.go:129      Health Check state change       {"kind": "extension", "name": "health_check", "status": "unavailable"}
2023-07-25T23:57:51.802Z        info    logstransformprocessor@v0.79.0/processor.go:66  Stopping logs transform processor       {"kind": "processor", "name": "logstransform/systemd", "pipeline": "logs/systemd"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2a8d774]

goroutine 1 [running]:
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter.(*Converter).Stop(0xc0011e93e0?)
        github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.79.0/adapter/converter.go:121 +0x14
github.com/open-telemetry/opentelemetry-collector-contrib/processor/logstransformprocessor.(*logsTransformProcessor).Shutdown(0xc0011d2780, {0x74b2c01?, 0xc0011d24e0?})
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/logstransformprocessor@v0.79.0/processor.go:68 +0x65
go.opentelemetry.io/collector/service/internal/graph.(*Graph).ShutdownAll(0x0?, {0x74b2ca0, 0xc000126000})
        go.opentelemetry.io/collector@v0.79.0/service/internal/graph/graph.go:308 +0xc9
go.opentelemetry.io/collector/service.(*Service).Shutdown(0xc00061aa00, {0x74b2ca0, 0xc000126000})
        go.opentelemetry.io/collector@v0.79.0/service/service.go:163 +0xd4
go.opentelemetry.io/collector/otelcol.(*Collector).setupConfigurationComponents(0xc00099ca80, {0x74b2ca0, 0xc000126000})
        go.opentelemetry.io/collector@v0.79.0/otelcol/collector.go:174 +0x60a
go.opentelemetry.io/collector/otelcol.(*Collector).Run(0xc00099ca80, {0x74b2ca0, 0xc000126000})
        go.opentelemetry.io/collector@v0.79.0/otelcol/collector.go:198 +0x65
go.opentelemetry.io/collector/otelcol.NewCommand.func1(0xc000624300, {0x6800b31?, 0x1?, 0x1?})
        go.opentelemetry.io/collector@v0.79.0/otelcol/command.go:27 +0x96
github.com/spf13/cobra.(*Command).execute(0xc000624300, {0xc000136010, 0x1, 0x1})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc000624300)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(0xc000b995f0?)
        github.com/spf13/cobra@v1.7.0/command.go:992 +0x19
main.runInteractive({{0xc000b995f0, 0xc0015a67b0, 0xc000b99950, 0xc000b99260, 0xc0015a6870}, {{0x681c2b9, 0xc}, {0x69497fd, 0x2f}, {0x682b657, ...}}, ...})
        github.com/SumoLogic/sumologic-otel-collector/main.go:36 +0x65
main.run(...)
        github.com/SumoLogic/sumologic-otel-collector/main_others.go:11
main.main()
        github.com/SumoLogic/sumologic-otel-collector/main.go:24 +0x1d9

If I use a custom name like:

otellogs:
  config:
    merge:
      exporters:
        mysumologicexporter:
          endpoint: ${SUMO_ENDPOINT_DEFAULT_LOGS_SOURCE}
          json_logs:
            add_timestamp: false
          log_format: json
          sending_queue:
            enabled: true
            num_consumers: 10
            queue_size: 10000
            storage: file_storage

      service:
        pipelines:
          logs/custom_files:
            receivers:
              - filelog/custom_files
            processors:
              # - memory_limiter
              - groupbyattrs/custom_files
              - resource/custom_files
              - resourcedetection/system
              - batch
            exporters:
              - mysumologicexporter

I get:

Defaulted container "otelcol" out of: otelcol, changeowner (init)
Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'exporters': unknown type: "mysumologicexporter" for id: "mysumologicexporter" (valid values: [file kafka sumologic syslog logging otlp otlphttp awss3 carbon loadbalancing prometheus])
2023/07/25 23:59:33 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'exporters': unknown type: "mysumologicexporter" for id: "mysumologicexporter" (valid values: [file kafka sumologic syslog logging otlp otlphttp awss3 carbon loadbalancing prometheus])

The last one makes sense, but the others I can't figure out.

I have also added the logging exporter as below, and used the logger command to write some logs to /var/log/messages, but I can't see any reference to these logs in any of the pod logs:

      exporters:
        logging:
          sampling_initial: 500
          sampling_thereafter: 1

      service:
        pipelines:
          logs/custom_files:
            receivers:
              - filelog/custom_files
            processors:
              # - memory_limiter
              - groupbyattrs/custom_files
              - resource/custom_files
              - resourcedetection/system
              - batch
            exporters:
              - logging
sumo-drosiek commented 1 year ago

@sumo-drosiek So, following the guidance from https://help.sumologic.com/docs/send-data/opentelemetry-collector/data-source-configurations/collect-logs/#collecting-logs-from-local-files

That documentation is not related to the Kubernetes collection. We are not using a standalone collector; the architecture is described here: https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/e761743666a5a5cbd4dd2d10c87df771d7df0ef3/docs/README.md#solution-overview

In short, the otellogs DaemonSet forwards logs via otlphttp to the log metadata StatefulSet, and the sumologic exporter is configured there, not in the DaemonSet's collector, which is why your pipeline reports it as "not configured".

In order to collect custom files from the node, please see the following documentation: https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/e761743666a5a5cbd4dd2d10c87df771d7df0ef3/docs/best-practices.md#collect-logs-from-additional-files-on-the-node

captainfalcon23 commented 1 year ago

Thanks @sumo-drosiek. I followed the link and successfully got custom logs into Sumo. Note that I ran into an issue as described here: https://github.com/elastic/apm-server/issues/9036. I had to remove compression, and then it worked.

However, I still can't figure out how to add additional enrichment, such as the hostname the log came from, the source name, the cluster, etc. Could you please direct me to the documentation or an example config?

sumo-drosiek commented 1 year ago

If the hostname is not part of the log, you will probably have to use environment variables.

In order to add custom fields, you may use the filelog receiver's operators section: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/572dbbaf1287492e5f3de535613364bd570ec184/receiver/filelogreceiver#operators

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/README.md
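
For example, a filelog operator that stamps a static attribute on every entry could look like this (a sketch; the attribute name and value are illustrative):

      receivers:
        filelog/custom_files:
          include:
            - /var/log/messages
          operators:
            # `add` writes a static value into each entry's attributes
            - type: add
              field: attributes.environment
              value: production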

You can also use the transformprocessor.
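
For instance, a rough equivalent with the transform processor (a sketch, not part of the default config; the attribute is illustrative, and the processor still needs to be added to your logs pipeline):

      processors:
        transform/custom_fields:
          log_statements:
            - context: log
              statements:
                # set a static attribute on every log record
                - set(attributes["environment"], "production")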

Please see the following examples:

https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/7c1008b7e8c68f3d27c1b7dd40cbe1b92adf8825/deploy/helm/sumologic/conf/logs/collector/otelcol/config.yaml#L200-L203

https://github.com/SumoLogic/sumologic-kubernetes-collection/blob/7c1008b7e8c68f3d27c1b7dd40cbe1b92adf8825/docs/collecting-container-logs.md#adding-custom-fields

Note that for the last example you cannot use extraProcessors; you need to modify the OpenTelemetry configuration directly (using merge).

captainfalcon23 commented 1 year ago

Some log files contain the hostname (for example) and others don't. Isn't there a simple way to get the hostname of the collector that sent the log to the enricher? Since I am completely new to OpenTelemetry, I am a bit confused about even where to add some of the configuration.

Is there any configuration creator tool I might be able to reference which will build me a working config?

sumo-drosiek commented 1 year ago

Is there any configuration creator tool I might be able to reference which will build me a working config?

I'm not aware of any. You may try running the collection inside Vagrant and playing with the configuration.

Isn't there a simple way to get the hostname of the collector which sent the log to the enricher?

It sounds like a pretty common use case, so I am going to evaluate it and add it to the documentation, but I cannot provide an ETA for that :/

captainfalcon23 commented 1 year ago

OK, I have gotten this working using the below:

## Configure log collection with Otelcol
otellogs:
  config:
    merge:
      receivers:
        filelog/extrafiles:
          include: 
            - /var/log/messages
            - /var/log/audit/audit.log
            - /var/log/secure
            - /var/log/cron
          include_file_path_resolved: true
          operators:
            - type: add
              field: attributes.host
              value: 'EXPR(env("MY_NODE_NAME"))'
      exporters:
        otlphttp/extrafiles:
          endpoint: http://${LOGS_METADATA_SVC}.${NAMESPACE}.svc.cluster.local.:4319
          compression: none
      service:
        pipelines:
          logs/extrafiles:
            receivers: [filelog/extrafiles]
            exporters: [otlphttp/extrafiles]
  daemonset:
    extraEnvVars:
    - name: MY_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName

It would be great if, out of the box, we could quickly append all the same metadata we are adding for everything else :) But at least I have a way to add attributes to the logs.
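
For reference, the same operators section should also be able to carry static enrichment; a sketch I haven't verified end to end (field syntax per the stanza operator docs, values illustrative):

          operators:
            - type: add
              field: attributes.host
              value: 'EXPR(env("MY_NODE_NAME"))'
            # hypothetical: set a Sumo Logic source category as a resource attribute
            - type: add
              field: resource["_sourceCategory"]
              value: linux/system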