open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[exporter/syslog] not sending data of journald receiver in correct format #34373

Open swetasgit opened 1 month ago

swetasgit commented 1 month ago

Component(s)

exporter/syslog

What happened?

Description

Sending logs collected by the journald receiver from the host system to a syslog-ng server using the syslog exporter.

Steps to Reproduce

Expected Result

System journald logs are collected and sent to syslog-ng server using syslog exporter

Actual Result

No useful logs collected at syslog server

Collector version

0.95.0

Environment information

Environment

OS: Debian Bookworm
k3s version: v1.29.6+k3s2

OpenTelemetry Collector configuration

exporters:
  debug:
    sampling_initial: 5
    sampling_thereafter: 200
    verbosity: detailed
  syslog:
    endpoint: <SYSLOG_service>
    network: tcp
    port: 601
    tls:
      insecure: true
    enable_octet_counting: true
    protocol: rfc5424
    retry_on_failure:
      enabled: true
      initial_interval: 10s
      max_elapsed_time: 150s
      max_interval: 40s
    sending_queue:
      enabled: false
      num_consumers: 20
      queue_size: 10000
    timeout: 1s
extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
processors:
  attributes:
    actions:
      - key: "ClusterName"
        value: k3s-cluster
        action: INSERT
  batch:
    send_batch_size: 1024
    timeout: 5s
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  journald:
    directory: /var/log/journal
    type: journald_input
service:
  extensions:
  - health_check
  pipelines:
    logs:
      exporters:
      - syslog
      processors:
      - attributes
      - batch
      receivers:
      - journald
  telemetry:
    metrics:
      address: ${env:MY_POD_IP}:8888

Log output

2024-08-01T07:06:39.031Z    info    service@v0.95.0/telemetry.go:55 Setting up own telemetry...
2024-08-01T12:36:39.032001349+05:30 2024-08-01T07:06:39.031Z    info    service@v0.95.0/telemetry.go:97 Serving metrics {"address": "10.42.0.12:8888", "level": "Basic"}
2024-08-01T12:36:39.032062661+05:30 2024-08-01T07:06:39.031Z    info    syslogexporter@v0.95.0/exporter.go:42   Syslog Exporter configured  {"kind": "exporter", "data_type": "logs", "name": "syslog", "endpoint": "10.43.158.127", "protocol": "rfc5424", "port": 601}
2024-08-01T12:36:39.033486498+05:30 2024-08-01T07:06:39.033Z    info    service@v0.95.0/service.go:143  Starting otelcol-contrib... {"Version": "0.95.0", "NumCPU": 4}
2024-08-01T12:36:39.033509456+05:30 2024-08-01T07:06:39.033Z    info    extensions/extensions.go:34 Starting extensions...
2024-08-01T12:36:39.033512804+05:30 2024-08-01T07:06:39.033Z    info    extensions/extensions.go:37 Extension is starting...    {"kind": "extension", "name": "health_check"}
2024-08-01T12:36:39.033516490+05:30 2024-08-01T07:06:39.033Z    info    healthcheckextension@v0.95.0/healthcheckextension.go:35 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"10.42.0.12:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2024-08-01T12:36:39.033716425+05:30 2024-08-01T07:06:39.033Z    info    extensions/extensions.go:52 Extension started.  {"kind": "extension", "name": "health_check"}
2024-08-01T12:36:39.033725106+05:30 2024-08-01T07:06:39.033Z    info    adapter/receiver.go:45  Starting stanza receiver    {"kind": "receiver", "name": "journald", "data_type": "logs"}
2024-08-01T12:36:40.040789398+05:30 2024-08-01T07:06:40.035Z    info    healthcheck/handler.go:132  Health Check state change   {"kind": "extension", "name": "health_check", "status": "ready"}
2024-08-01T07:06:40.035Z    info    service@v0.95.0/service.go:169  Everything is ready. Begin running and processing data.
2024-08-01T07:06:40.035Z    warn    localhostgate/featuregate.go:63 The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default.   {"feature gate ID": "component.UseLocalHostAsDefaultHost"}

Additional context

No response

swetasgit commented 1 month ago

When I add a transform processor, I start getting messages, but strangely only the su messages. I haven't added any filter on the journald receiver side; when I use the file exporter I get all journald logs, but not on the syslog server:

transform:
  log_statements:
    - context: log
      statements:
        - set(attributes["message"], body)

Logs on syslog:

Aug  1 14:51:25 192.168.1.8 1072 <165>1 2024-08-01T09:21:20.85347Z - - - - - {"MESSAGE":"(to root) sweta on pts/1","PRIORITY":"5","SYSLOG_FACILITY":"4","SYSLOG_IDENTIFIER":"su","SYSLOG_PID":"6816","SYSLOG_TIMESTAMP":"Aug  1 14:51:20 ","_AUDIT_LOGINUID":"1000","_AUDIT_SESSION":"5","_BOOT_ID":"a4aae1ad370f427586f59b9f11140774","_CAP_EFFECTIVE":"1ffffffffff","_CMDLINE":"su -","_COMM":"su","_EXE":"/usr/bin/su","_GID":"1000","_HOSTNAME":"debian","_MACHINE_ID":"97f0f0c26cfa4e04bbb07a29a1800ab1","_PID":"6816","_RUNTIME_SCOPE":"system","_SELINUX_CONTEXT":"unconfined\n","_SOURCE_REALTIME_TIMESTAMP":"1722504080853417","_SYSTEMD_CGROUP":"/user.slice/user-1000.slice/session-5.scope","_SYSTEMD_INVOCATION_ID":"c4af78074a364d0491d562993d2d4f50","_SYSTEMD_OWNER_UID":"1000","_SYSTEMD_SESSION":"5","_SYSTEMD_SLICE":"user-1000.slice","_SYSTEMD_UNIT":"session-5.scope","_SYSTEMD_USER_SLICE":"-.slice","_TRANSPORT":"syslog","_UID":"1000","__CURSOR":"s=5e91832a0ef440de854b15cc2fd5f3fc;i=1034068;b=a4aae1ad370f427586f59b9f11140774;m=588fb0bf;t=61e9bbbdd09de;x=13b360bea81c1800","__MONOTONIC_TIMESTAMP":"1485811903"}
Aug  1 14:51:25 192.168.1.8 1128 <165>1 2024-08-01T09:21:20.856469Z - - - - - {"MESSAGE":"pam_unix(su-l:session): session opened for user root(uid=0) by sweta(uid=1000)","PRIORITY":"6","SYSLOG_FACILITY":"10","SYSLOG_IDENTIFIER":"su","SYSLOG_PID":"6816","SYSLOG_TIMESTAMP":"Aug  1 14:51:20 ","_AUDIT_LOGINUID":"1000","_AUDIT_SESSION":"5","_BOOT_ID":"a4aae1ad370f427586f59b9f11140774","_CAP_EFFECTIVE":"1ffffffffff","_CMDLINE":"su -","_COMM":"su","_EXE":"/usr/bin/su","_GID":"1000","_HOSTNAME":"debian","_MACHINE_ID":"97f0f0c26cfa4e04bbb07a29a1800ab1","_PID":"6816","_RUNTIME_SCOPE":"system","_SELINUX_CONTEXT":"unconfined\n","_SOURCE_REALTIME_TIMESTAMP":"1722504080853814","_SYSTEMD_CGROUP":"/user.slice/user-1000.slice/session-5.scope","_SYSTEMD_INVOCATION_ID":"c4af78074a364d0491d562993d2d4f50","_SYSTEMD_OWNER_UID":"1000","_SYSTEMD_SESSION":"5","_SYSTEMD_SLICE":"user-1000.slice","_SYSTEMD_UNIT":"session-5.scope","_SYSTEMD_USER_SLICE":"-.slice","_TRANSPORT":"syslog","_UID":"1000","__CURSOR":"s=5e91832a0ef440de854b15cc2fd5f3fc;i=1034069;b=a4aae1ad370f427586f59b9f11140774;m=588fbc76;t=61e9bbbdd1595;x=70242952deb0161c","__MONOTONIC_TIMESTAMP":"1485814902"}
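
For reading the lines above: everything after the server's own `Aug  1 14:51:25 192.168.1.8` prefix looks like an octet-counted RFC 5424 frame, that is, a byte count followed by `<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG`. Below is a minimal Python sketch for splitting such a frame. The `parse_frame` helper and its regex are illustrative assumptions, not part of the collector or syslog-ng, and the structured-data pattern is deliberately simplistic.

```python
import re

# Octet-counted RFC 5424 frame:
#   "<length> <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG"
# Simplification (assumption): structured data is either "-" or one [...] element.
FRAME = re.compile(
    r"^(?P<length>\d+) <(?P<pri>\d{1,3})>(?P<version>\d+) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<appname>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<sd>-|\[.*?\]) (?P<msg>.*)$",
    re.DOTALL,
)


def parse_frame(frame: str) -> dict:
    """Split one octet-counted RFC 5424 frame into its header fields."""
    m = FRAME.match(frame)
    if m is None:
        raise ValueError("not an octet-counted RFC 5424 frame")
    fields = m.groupdict()
    for key in ("length", "pri", "version"):
        fields[key] = int(fields[key])
    # PRI encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(fields["pri"], 8)
    return fields
```

Applied to the frames above, the PRI of 165 decodes to facility 20 (local4) and severity 5 (notice), and the hostname, app name, process ID and message ID are all `-`, so the server has nothing to route or filter on except the raw JSON body.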

Please suggest whether there is any other way to export journald logs to a syslog server in the correct format, or do we have to use transform to set each attribute for RFC5424 (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/syslogexporter/README.md#rfc5424)?

swetasgit commented 1 month ago

I was able to get some values using the transform below:

transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(attributes["appname"], body["_COMM"])
          - set(attributes["hostname"], body["_HOSTNAME"])
          - set(attributes["message"], body["MESSAGE"])
          - set(attributes["priority"], body["PRIORITY"])

But can't we have some feature to send journald logs to the syslog server in an easier way?

djaglowski commented 1 month ago

Based on discussion on CNCF slack I believe this is an issue with the exporter and not the receiver.

To generalize the issue a bit, does the exporter have any requirements for the format of the logs it consumes? In other words, does it fail if a certain attribute is missing, etc?

If so, this is unlike how users typically expect to work with exporters. It would be much better if any plog.Logs can be consumed in a generic way, even if some fields of the syslog format must be given default values or remain unutilized.

If this is already the intention of the exporter, then I think this might be a bug.

swetasgit commented 1 month ago

@djaglowski Yes, you are right. The issue is with the exporter: it does not consume logs from the journald receiver as they are, only when the expected attributes are already present or are added manually.

djaglowski commented 1 month ago

The exporter documentation says: "If an attribute is missing, the default value is used. The log's timestamp field is used for the syslog message's time."

I think this is a pretty clear bug given that we are not handling missing attributes as stated in the documentation. I'll remove the triage label.
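
To make the expected behaviour concrete, here is a rough Python sketch of formatting a log record with defaults for missing attributes. It is not the exporter's Go code: the function name `format_rfc5424`, the attribute names `proc_id` and `msg_id`, and the fallback from a missing `message` attribute to the log body are assumptions for illustration. The defaults (`165`, version `1`, `-` for the rest) are taken from the output earlier in this thread.

```python
from datetime import datetime, timezone

# Defaults inferred from the "<165>1 ... - - - - -" output earlier in the thread.
DEFAULTS = {
    "priority": 165,
    "version": 1,
    "hostname": "-",
    "appname": "-",
    "proc_id": "-",   # attribute name is an assumption
    "msg_id": "-",    # attribute name is an assumption
}


def format_rfc5424(attributes: dict, body: str, timestamp: datetime,
                   octet_counting: bool = False) -> str:
    """Format one record as RFC 5424, using defaults for missing attributes."""
    fields = dict(DEFAULTS)
    for key in DEFAULTS:
        value = attributes.get(key)
        if value not in (None, ""):
            fields[key] = value
    # Assumption: fall back to the log body when no "message" attribute is set.
    message = attributes.get("message", body)
    # Expects a timezone-aware datetime; rendered in UTC with a "Z" suffix.
    ts = (timestamp.astimezone(timezone.utc)
          .isoformat(timespec="microseconds")
          .replace("+00:00", "Z"))
    # Structured data is always emitted as "-" in this sketch.
    line = (f"<{int(fields['priority'])}>{fields['version']} {ts} "
            f"{fields['hostname']} {fields['appname']} "
            f"{fields['proc_id']} {fields['msg_id']} - {message}")
    if octet_counting:
        # Octet counting prefixes the frame with its length in bytes.
        line = f"{len(line.encode('utf-8'))} {line}"
    return line
```

With no attributes at all, a record would still produce a usable line such as `<165>1 <timestamp> - - - - - <body>` rather than an empty message, which is the generic handling discussed above.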