influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

xpath_protobuf input throws: invalid memory address or nil pointer dereference #14024

Closed by Nirwx 1 year ago

Nirwx commented 1 year ago

Relevant telegraf.conf

[agent]
  interval = "5s" 
  metric_batch_size = 5000
  metric_buffer_limit = 30000
  collection_jitter = "0s"
  flush_interval = "5s"
  flush_jitter = "1s"
  precision = ""
  hostname = ""
  omit_hostname = true
  debug = true

[[inputs.file]]
  files = ["/opt/samples/message-3.proto.dump"]
  data_format = "xpath_protobuf"
  xpath_protobuf_file = "sci-message.proto"
  xpath_protobuf_import_paths = [".", "/opt/proto"]
  xpath_protobuf_type = "sci.SciMessage"

[[outputs.file]]
  files = ["/tmp/output.txt"]

Logs from Telegraf

2023-09-29T15:02:49Z I! Loading config: /etc/telegraf/telegraf.conf
2023-09-29T15:02:49Z I! Starting Telegraf 1.26.3
2023-09-29T15:02:49Z I! Available plugins: 235 inputs, 9 aggregators, 27 processors, 22 parsers, 57 outputs, 2 secret-stores
2023-09-29T15:02:49Z I! Loaded inputs: file
2023-09-29T15:02:49Z I! Loaded aggregators: 
2023-09-29T15:02:49Z I! Loaded processors: 
2023-09-29T15:02:49Z I! Loaded secretstores: 
2023-09-29T15:02:49Z I! Loaded outputs: file
2023-09-29T15:02:49Z I! Tags enabled: 
2023-09-29T15:02:49Z I! [agent] Config: Interval:5s, Quiet:false, Hostname:"", Flush Interval:5s
2023-09-29T15:02:49Z D! [agent] Initializing plugins
2023-09-29T15:02:49Z D! [agent] Connecting outputs
2023-09-29T15:02:49Z D! [agent] Attempting connection to [outputs.file]
2023-09-29T15:02:49Z D! [agent] Successfully connected to outputs.file
2023-09-29T15:02:49Z D! [agent] Starting service inputs
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x3cffe24]

goroutine 35 [running]:
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive({0x6764520, 0x400000c708}, {0x4001cdc000, 0x20}, 0x4000cb70e8?, 0x3c9f3a4?, 0x0?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:390 +0x144
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive.func1(0x4000736240, 0x4001cde960, 0x400013fe00, {0x6764520, 0x400000c708}, 0x1d0?, 0x400073e2d0?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:402 +0x16c
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive({0x6764520, 0x400000c708}, {0x4001c1fbc0, 0xc}, 0x4000cb7278?, 0x3c9f3a4?, 0x0?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:403 +0x1d0
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive.func1(0x4000736240, 0x400013fdb0, 0x400013f2c0, {0x6764520, 0x400000c708}, 0xb3?, 0x40001a8319?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:402 +0x16c
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive({0x6764520, 0x400000c708}, {0x400073e2d0, 0xf}, 0x4000cb7408?, 0x3c9f3a4?, 0x0?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:403 +0x1d0
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive.func1(0x4000736240, 0x4000655630, 0x4000654af0, {0x6764520, 0x400000c708}, 0x58a1880?, 0x40004ac301?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:402 +0x16c
github.com/jhump/protoreflect/desc/protoparse.parseToProtoRecursive({0x6764520, 0x400000c708}, {0x40001a8319, 0x11}, 0x400073c100?, 0x0?, 0x400072f5b8?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:403 +0x1d0
github.com/jhump/protoreflect/desc/protoparse.parseToProtosRecursive({0x6764520, 0x400000c708}, {0x4000cb7848, 0x1, 0x0?}, 0x0?, 0x0?)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:366 +0x80
github.com/jhump/protoreflect/desc/protoparse.Parser.ParseFiles({{0x40004ac360, 0x2, 0x2}, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...)
        /go/pkg/mod/github.com/jhump/protoreflect@v1.15.1/desc/protoparse/parser.go:153 +0x194
github.com/influxdata/telegraf/plugins/parsers/xpath.(*protobufDocument).Init(0x400073a120)
        /go/src/github.com/influxdata/telegraf/plugins/parsers/xpath/protocolbuffer_document.go:44 +0xc4
github.com/influxdata/telegraf/plugins/parsers/xpath.(*Parser).Init(0x40006705a0)
        /go/src/github.com/influxdata/telegraf/plugins/parsers/xpath/parser.go:103 +0x5f0
github.com/influxdata/telegraf/models.(*RunningParser).Init(0x67a4710?)
        /go/src/github.com/influxdata/telegraf/models/running_parsers.go:63 +0x3c
github.com/influxdata/telegraf/config.(*Config).addParser(0x40003adce0, {0x5bb58d4, 0x6}, {0x4000a539b0, 0x4}, 0x40004af590)
        /go/src/github.com/influxdata/telegraf/config/config.go:961 +0x4d0
github.com/influxdata/telegraf/config.(*Config).addInput.func1()
        /go/src/github.com/influxdata/telegraf/config/config.go:1174 +0x38
github.com/influxdata/telegraf/plugins/inputs/file.(*File).readMetric(0x40009cdec0, {0x4000752180, 0x21})
        /go/src/github.com/influxdata/telegraf/plugins/inputs/file/file.go:97 +0x288
github.com/influxdata/telegraf/plugins/inputs/file.(*File).Gather(0x40009cdec0, {0x67d8ae0, 0x40001441c0})
        /go/src/github.com/influxdata/telegraf/plugins/inputs/file/file.go:48 +0x90
github.com/influxdata/telegraf/models.(*RunningInput).Gather(0x40004af950, {0x67d8ae0, 0x40001441c0})
        /go/src/github.com/influxdata/telegraf/models/running_input.go:126 +0x54
github.com/influxdata/telegraf/agent.(*Agent).gatherOnce.func1()
        /go/src/github.com/influxdata/telegraf/agent/agent.go:576 +0x30
created by github.com/influxdata/telegraf/agent.(*Agent).gatherOnce
        /go/src/github.com/influxdata/telegraf/agent/agent.go:575 +0xe4

System info

telegraf container version 1.26 to 1.28.1

Docker

FROM telegraf:1.26
COPY telegraf.conf /etc/telegraf/telegraf.conf
COPY proto/base /opt/proto
COPY samples/mqtt_log/dumps /opt/samples
WORKDIR /opt/proto

Steps to reproduce

  1. sudo docker build -t telegraf:test .
  2. sudo docker run -it telegraf:test telegraf

Expected behavior

Protobuf message written to ["/tmp/output.txt"]

Actual behavior

"Invalid Memory Address"

2023-09-29T15:08:04Z D! [agent] Starting service inputs
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x3cffe24]

Additional info

No response

powersj commented 1 year ago

Hi,

This looks to be triggered during the xpath parser's Init(), when we pass the files to the github.com/jhump/protoreflect/desc/protoparse library for parsing. Would you be willing to share both sci-message.proto and /opt/samples/message-3.proto.dump? If you don't want to share them publicly, you can email me at jpowers at influxdata.com
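For readers following the trace: the panic is raised inside a library call reached from Init(), so it takes down the whole process instead of surfacing as a plugin error. Below is a minimal sketch of how a caller could, in principle, turn such a panic into an ordinary error with defer/recover. The names safeInit and brokenParse are hypothetical stand-ins for illustration only; this is not Telegraf's or protoreflect's actual code, and it is not the fix discussed in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// safeInit runs parse and converts any panic it raises into an error.
// safeInit is a hypothetical helper, not part of Telegraf or protoreflect.
func safeInit(parse func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// The named return value lets the deferred function set the error.
			err = fmt.Errorf("parser init panicked: %v", r)
		}
	}()
	return parse()
}

// brokenParse simulates a library call that dereferences a nil pointer.
func brokenParse() error {
	var p *struct{ name string }
	_ = p.name // nil pointer dereference: triggers a runtime panic
	return nil
}

func main() {
	// A panicking call is reported as an error instead of crashing.
	if err := safeInit(brokenParse); err != nil {
		fmt.Println("error:", err)
	}
	// An ordinary error is passed through unchanged.
	if err := safeInit(func() error { return errors.New("proto: not found") }); err != nil {
		fmt.Println("error:", err)
	}
}
```

A guard like this only hides the symptom; the underlying nil dereference still has to be fixed in the library that panics.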

Nirwx commented 1 year ago

Hi, thanks. I've sent an email.

powersj commented 1 year ago

Thank you! Will take a look shortly.

powersj commented 1 year ago

Alright, with the files you sent me I built the docker image as follows:

FROM telegraf:1.26
COPY telegraf.conf /etc/telegraf/telegraf.conf
COPY proto/ /opt/proto
COPY message-3.proto.dump /opt/samples/
WORKDIR /opt/proto

The only differences from your Dockerfile are the last two COPY commands. The first copies a directory containing the proto files you sent me. The second copies the dump file you sent me directly.

And when I run:

❯ docker run -it telegraf:test telegraf
2023-09-29T16:11:34Z I! Loading config: /etc/telegraf/telegraf.conf
2023-09-29T16:11:34Z I! Starting Telegraf 1.26.3
2023-09-29T16:11:34Z I! Available plugins: 235 inputs, 9 aggregators, 27 processors, 22 parsers, 57 outputs, 2 secret-stores
2023-09-29T16:11:34Z I! Loaded inputs: file
2023-09-29T16:11:34Z I! Loaded aggregators: 
2023-09-29T16:11:34Z I! Loaded processors: 
2023-09-29T16:11:34Z I! Loaded secretstores: 
2023-09-29T16:11:34Z I! Loaded outputs: file
2023-09-29T16:11:34Z I! Tags enabled: 
2023-09-29T16:11:34Z I! [agent] Config: Interval:5s, Quiet:false, Hostname:"", Flush Interval:5s
2023-09-29T16:11:34Z D! [agent] Initializing plugins
2023-09-29T16:11:34Z D! [agent] Connecting outputs
2023-09-29T16:11:34Z D! [agent] Attempting connection to [outputs.file]
2023-09-29T16:11:34Z D! [agent] Successfully connected to outputs.file
2023-09-29T16:11:34Z D! [agent] Starting service inputs
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file] Could not find "sci.SciMessage"... Known messages:
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   ScgSessMgrPubIpc
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   com.ruckuswireless.scg.protobuf.icx
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   com.ruckuswireless.scg.protobuf.icx
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   com.ruckuswireless.scg.protobuf.storage
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   com.ruckuswireless.scg.protobuf.storage
2023-09-29T16:11:35Z I! [parsers.xpath_protobuf::file]   google.protobuf
2023-09-29T16:11:35Z E! [inputs.file] Error in plugin: could not instantiate parser: proto: not found
^C2023-09-29T16:11:38Z D! [agent] Stopping service inputs
2023-09-29T16:11:38Z D! [agent] Input channel closed
2023-09-29T16:11:38Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-09-29T16:11:38Z D! [outputs.file] Buffer fullness: 0 / 30000 metrics
2023-09-29T16:11:38Z I! [agent] Stopping running outputs
2023-09-29T16:11:38Z D! [agent] Stopped Successfully

That error appears because xpath_protobuf_type = "sci.SciMessage" needs to be changed to xpath_protobuf_type = "SciMessage".
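As background on why the type name matters: in protobuf, the fully-qualified name of a top-level message is its name prefixed with the file's package, if the file declares one. One plausible reading (an assumption, since the proto files were shared privately and are not shown in this thread) is that SciMessage is declared in a file without a `package sci;` statement, so the prefix has to be dropped. A small sketch of that naming rule follows; qualifiedName and the sample file contents are hypothetical, and the regular expression is a simplification that ignores comments and nested messages.

```go
package main

import (
	"fmt"
	"regexp"
)

// packageRe matches a `package foo.bar;` statement at the start of a line
// in .proto source. It is a simplification and does not skip comments.
var packageRe = regexp.MustCompile(`(?m)^\s*package\s+([A-Za-z_][A-Za-z0-9_.]*)\s*;`)

// qualifiedName returns the fully-qualified name of the top-level message msg:
// "<package>.<msg>" when the source declares a package, otherwise just msg.
// qualifiedName is a hypothetical helper written for this illustration.
func qualifiedName(protoSrc, msg string) string {
	if m := packageRe.FindStringSubmatch(protoSrc); m != nil {
		return m[1] + "." + msg
	}
	return msg
}

func main() {
	// Hypothetical file contents; the real sci-message.proto was not posted.
	withPkg := "syntax = \"proto3\";\npackage sci;\nmessage SciMessage {}\n"
	noPkg := "syntax = \"proto3\";\nmessage SciMessage {}\n"

	fmt.Println(qualifiedName(withPkg, "SciMessage")) // sci.SciMessage
	fmt.Println(qualifiedName(noPkg, "SciMessage"))   // SciMessage
}
```

Under this rule, the value given to xpath_protobuf_type has to match the name the parsed file actually registers, which is why "sci.SciMessage" was reported as not found.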

Could you take Docker builds out of the equation and run it locally as well? Ideally with v1.28.1 and the following config:

[agent]
  debug = true

[[inputs.file]]
  files = ["samples/mqtt_log/dumps/message-3.proto.dump"]
  data_format = "xpath_protobuf"
  xpath_protobuf_import_paths = [".", "proto/base"]
  xpath_protobuf_file = "sci-message.proto"
  xpath_protobuf_type = "SciMessage"
  xpath_print_document = true

[[outputs.file]]

With that, I was able to get the XML document to print, and you should then be ready to start writing your config to parse the document and generate metrics.

powersj commented 1 year ago

Looking at the upstream library, the trace looks very much like https://github.com/jhump/protoreflect/issues/572

Nirwx commented 1 year ago

Hi, I still run into the same issue with the Docker container. I've also tried v1.28.1 locally using the Telegraf config you shared, and I get the same error.

Note that I'm running this on a Mac M1 (Sonoma Version 14.0).

srebhan commented 1 year ago

@powersj can you please try to update protoreflect to https://github.com/jhump/protoreflect/pull/574 and see if the issue goes away?

srebhan commented 1 year ago

Need https://github.com/jhump/protoreflect/issues/572 to be fixed...

srebhan commented 1 year ago

@Nirwx just to be really sure: Did you set xpath_protobuf_type = "SciMessage" as @powersj suggested? That is the correct message name...

srebhan commented 1 year ago

@Nirwx please check whether PR #14085 fixes your issue, as it contains a fixed version of github.com/jhump/protoreflect. A binary should be available in the PR once CI has finished all tests successfully.