open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[receiver/syslog] Syslog Receiver fails to parse long messages, even with a `max_log_size` set #33182

Closed sinkingpoint closed 3 months ago

sinkingpoint commented 6 months ago

Component(s)

receiver/syslog

What happened?

Description

When using the syslog receiver, we can only parse messages up to the default maximum length (8192 octets), even with a max_log_size set much higher.

Steps to Reproduce

  1. Create a receiver with the provided config (note the max_log_size of 100000000 bytes, roughly 100 MB)
  2. Send a message longer than 8192 octets
  3. Observe an error: message too long to parse. was size 40366, max length 8192

Expected Result

The message should parse properly

Actual Result

The message fails to parse

Collector version

v0.100.0

Environment information

Environment

OS: Debian Bookworm

OpenTelemetry Collector configuration

receivers:
  syslog:
    protocol: rfc5424
    enable_octet_counting: true
    tcp:
      listen_address: :4278
      max_log_size: 100000000 # 100 MB (about 95 MiB)
exporters:
  debug:
service:
  pipelines:
    logs:
      receivers: [syslog]
      exporters: [debug]

Log output

{"level":"error","ts":1716356887.9432147,"caller":"helper/transformer.go:101","msg":"Failed to process entry","kind":"receiver","name":"syslog/db","data_type":"logs","operator_id":"syslog_input_internal_parser","operator_type":"syslog_parser","error":"message too long to parse. was size 40366, max length 8192","action":"send","stacktrace":"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*TransformerOperator).HandleEntryError\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/helper/transformer.go:101\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ParseWith\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/helper/parser.go:140\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ProcessWithCallback\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/helper/parser.go:112\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/syslog.(*Parser).Process\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/parser/syslog/parser.go:54\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*WriterOperator).Write\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/helper/writer.go:53\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp.(*Input).handleMessage\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/input/tcp/input.go:191\\ngithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp.(*Input).goHandleMessages.func1\\n\\tgithub.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.99.0/operator/input/tcp/input.go:152"}

Additional context

This seems to be because we aren't passing a maximum length to the parser here: https://github.com/influxdata/go-syslog/blob/66067a10754ae90b9540d5312989ae685413c4fe/octetcounting/parser.go#L46, so we get stuck with the default limit.

frzifus commented 6 months ago

As far as I understand that part, the syslog operator has no option to pass the maxSize information to any of the underlying parsers?

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/902d846079474a316334ddb2a37ffaa84c3c5462/pkg/stanza/operator/parser/syslog/parser.go#L29-L36

Looking at this construction code, none of those take a max size into account.

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/902d846079474a316334ddb2a37ffaa84c3c5462/pkg/stanza/operator/parser/syslog/parser.go#L88-L113

The version of github.com/influxdata/go-syslog/v3/rfc5424 in use doesn't even offer an option that can be set.

djaglowski commented 6 months ago

max_log_size is a feature of the TCP input component, but it doesn't apply to syslog.

> The used version of github.com/influxdata/go-syslog/v3/rfc5424 doesnt even offer an option that can be set.

I looked into this further and found that go-syslog justifies the hard limit based on RFC 5425 Section 4.3.1. My reading of that section is that it is the minimum which the library should support but it is not prescriptive about it being a maximum.

sinkingpoint commented 6 months ago

@djaglowski considering that that repo has been archived, would it make sense to fork it here?

djaglowski commented 6 months ago

Actually I'm happy to see that the original author has recently created a fork and is making updates again! We should definitely switch in my opinion. https://github.com/leodido/go-syslog.

andrzej-stencel commented 6 months ago

If I'm reading this correctly, the v4 release from leodido/go-syslog allows us to fix this issue, as it contains the WithMaxMessageLength function introduced in https://github.com/influxdata/go-syslog/pull/39 that we can call when instantiating the parser. Is my thinking correct?
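
As a rough sketch of the pattern being discussed, the snippet below shows how a functional option can override a parser's default limit at construction time. Everything here is a local imitation for illustration: the `parser` type, `newParser`, `accepts`, and this `WithMaxMessageLength` are defined in the example itself and are not the library's implementation or signatures.

```go
package main

import "fmt"

// parser is a hypothetical type standing in for a syslog parser.
type parser struct {
	maxMessageLength int
}

// Option configures a parser at construction time.
type Option func(*parser)

// WithMaxMessageLength is a local imitation of the option named in the
// comment above; it only sets the limit on this example's parser type.
func WithMaxMessageLength(n int) Option {
	return func(p *parser) { p.maxMessageLength = n }
}

// newParser applies the default limit first, then any caller options.
func newParser(opts ...Option) *parser {
	p := &parser{maxMessageLength: 8192}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

// accepts reports whether a message of n octets fits within the limit.
func (p *parser) accepts(n int) bool {
	return n <= p.maxMessageLength
}

func main() {
	fmt.Println(newParser().accepts(40366))                                // false
	fmt.Println(newParser(WithMaxMessageLength(100000000)).accepts(40366)) // true
}
```

In the receiver, the value passed to such an option would presumably come from configuration, which is the open design question for the follow-up PR.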

andrzej-stencel commented 6 months ago

PR switching the dependency to the fork: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33205.

bacherfl commented 5 months ago

Hi! I would like to pick this issue up if still available

bacherfl commented 5 months ago

@djaglowski I went ahead and created a draft PR that makes use of the new option in the updated library. I have some open questions, which I have added to the PR description. I'd appreciate any feedback there.

github-actions[bot] commented 3 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

djaglowski commented 3 months ago

Resolved by #33777.