grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Alloy go_routine leak in UDP collector. #282

Open tristanmorgan opened 9 months ago

tristanmorgan commented 9 months ago

What's wrong?

There seems to be a goroutine leak when running a UDP syslog collector. Related to promtail#9726.

Steps to reproduce

  1. Start Loki (2.9.3).
  2. Start Grafana Agent (0.38.1, arm64).
  3. Configure a syslog collector (UDP).
  4. Wait for goroutines to build up.
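
One way to watch the build-up in step 4 is to poll the goroutine profile and log its total. The sketch below is not part of the original report. It assumes the Agent's HTTP server exposes `/debug/pprof/goroutine?debug=1` (the endpoint the logs in this issue were taken from) and that you pass the full URL yourself. The helper name `parseGoroutineTotal` and the 30-second interval are illustrative choices.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// parseGoroutineTotal extracts N from the "goroutine profile: total N"
// header that the debug=1 goroutine profile prints on its first line.
// (Hypothetical helper, written for this example.)
func parseGoroutineTotal(profile string) (int, error) {
	var total int
	if _, err := fmt.Sscanf(profile, "goroutine profile: total %d", &total); err != nil {
		return 0, fmt.Errorf("unexpected profile header: %w", err)
	}
	return total, nil
}

// fetchTotal downloads the profile at url and returns its goroutine total.
func fetchTotal(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	return parseGoroutineTotal(string(body))
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: goroutine-watch <url of /debug/pprof/goroutine?debug=1>")
		return
	}
	// Poll forever; a steadily rising total suggests a leak.
	for {
		total, err := fetchTotal(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, "fetch failed:", err)
		} else {
			fmt.Println(time.Now().Format(time.RFC3339), "goroutines:", total)
		}
		time.Sleep(30 * time.Second)
	}
}
```

A total that keeps climbing while the syslog traffic stays steady is the symptom described here.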

System information

Linux 6.1.21-v8+ aarch64

Software version

Grafana Agent 0.38.1

Configuration

logs:
  configs:
  - name: default
    positions:
      filename: /tmp/positions.yaml
    clients:
      - url: http://10.0.0.64:8080/loki/api/v1/push
    scrape_configs:
      - job_name: syslog-udp
        syslog:
          listen_address: 0.0.0.0:514
          listen_protocol: udp
          idle_timeout: 60s
          label_structured_data: yes
          labels:
            job: "syslog"
        relabel_configs:
          - source_labels: ['__syslog_message_hostname']
            target_label: 'host'
          - source_labels: ['__syslog_message_app_name']
            target_label: 'appname'
          - source_labels: ["__syslog_message_severity"]
            target_label: "severity"
          - source_labels: ["__syslog_message_facility"]
            target_label: "facility"

Logs

# from /debug/pprof/goroutine?debug=1
goroutine profile: total 257
73 @ 0x43f618 0x450a38 0x483664 0x2d070f0 0x6567d4 0x6562a8 0x656ec0 0x6572b4 0x6574bc 0x2cfe3ec 0x2cfe990 0x2cff6fc 0x2d01550 0x2d06330 0x474264
#   0x483663    io.(*pipe).read+0x83                                        /usr/local/go/src/io/pipe.go:57
#   0x2d070ef   io.(*PipeReader).Read+0x2f                                  /usr/local/go/src/io/pipe.go:136
#   0x6567d3    bufio.(*Reader).Read+0x103                                  /usr/local/go/src/bufio/bufio.go:230
#   0x6562a7    bufio.(*Reader).fill+0xf7                                   /usr/local/go/src/bufio/bufio.go:113
#   0x656ebf    bufio.(*Reader).ReadSlice+0x2f                                  /usr/local/go/src/bufio/bufio.go:379
#   0x6572b3    bufio.(*Reader).collectFragments+0x63                               /usr/local/go/src/bufio/bufio.go:454
#   0x6574bb    bufio.(*Reader).ReadBytes+0x1b                                  /usr/local/go/src/bufio/bufio.go:481
#   0x2cfe3eb   github.com/leodido/ragel-machinery/parser.(*DelimitedReader).Read+0xbb              /go/pkg/mod/github.com/leodido/ragel-machinery@v0.0.0-20181214104525-299bdde78165/parser/arbitrary_reader.go:63
#   0x2cfe98f   github.com/leodido/ragel-machinery/parser.(*Parser).Parse+0x4f                  /go/pkg/mod/github.com/leodido/ragel-machinery@v0.0.0-20181214104525-299bdde78165/parser/parser.go:71
#   0x2cff6fb   github.com/influxdata/go-syslog/v3/nontransparent.(*machine).Parse+0x1fb            /go/pkg/mod/github.com/influxdata/go-syslog/v3@v3.0.1-0.20210608084020-ac565dc76ba6/nontransparent/parser.go:259
#   0x2d0154f   github.com/grafana/loki/clients/pkg/promtail/targets/syslog/syslogparser.ParseStream+0x2ff  /go/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/syslog/syslogparser/syslogparser.go:27
#   0x2d0632f   github.com/grafana/loki/clients/pkg/promtail/targets/syslog.(*UDPTransport).handleRcv+0x13f /go/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/syslog/transport.go:383

tpaschalis commented 8 months ago

For anyone looking at this: we can take some inspiration from the fix originally proposed on the Promtail side: https://github.com/grafana/loki/pull/9743

github-actions[bot] commented 7 months ago

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it. If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue. The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity. Thank you for your contributions!

rfratto commented 5 months ago

Hi there :wave:

On April 9, 2024, Grafana Labs announced Grafana Alloy, the spiritual successor to Grafana Agent and the final form of Grafana Agent flow mode. As a result, Grafana Agent has been deprecated and will only be receiving bug and security fixes until its end-of-life around November 1, 2025.

To make things easier for maintainers, we're in the process of migrating all issues tagged variant/flow to the Grafana Alloy repository to have a single home for tracking issues. This issue is likely something we'll want to address in both Grafana Alloy and Grafana Agent, so just because it's being moved doesn't mean we won't address the issue in Grafana Agent :)

tristanmorgan commented 5 months ago

Goroutines from Alloy v1.0.0:

goroutine profile: total 318
71 @ 0x442618 0x455bc8 0x4caa64 0x3ba2cb0 0x3ba2c91 0x47c9c4
#   0x4caa63    io.(*pipe).read+0x83                                                    /usr/local/go/src/io/pipe.go:57
#   0x3ba2caf   io.(*PipeReader).Read+0xff                                              /usr/local/go/src/io/pipe.go:134
#   0x3ba2c90   github.com/grafana/alloy/internal/component/loki/source/syslog/internal/syslogtarget.(*UDPTransport).handleRcv+0xe0 /src/alloy/internal/component/loki/source/syslog/internal/syslogtarget/transport.go:437