influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Panic on string fields with unescaped quotes: Slice bounds out of range #3326

Closed: kevinpreynolds closed this issue 6 years ago

kevinpreynolds commented 7 years ago

Bug report

C:\telegraf-1.4.0>telegraf.exe -config telegraf.conf
panic: runtime error: slice bounds out of range

goroutine 73 [running]:
github.com/influxdata/telegraf/metric.(*metric).Fields(0xc04261f780, 0xc042635580)
    /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/metric/metric.go:356 +0x5cf
github.com/influxdata/telegraf/plugins/inputs/mqtt_consumer.(*MQTTConsumer).receiver(0xc0423a4b00)
    /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/mqtt_consumer/mqtt_consumer.go:176 +0x2e2
created by github.com/influxdata/telegraf/plugins/inputs/mqtt_consumer.(*MQTTConsumer).Start
    /home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/mqtt_consumer/mqtt_consumer.go:131 +0x225

Relevant telegraf.conf:

[[inputs.mqtt_consumer]]
  servers = ["localhost:1883"]
  ## MQTT QoS, must be 0, 1, or 2
  qos = 0

  ## Topics to subscribe to
  topics = ["mytopic"]

  # if true, messages that can't be delivered while the subscriber is offline
  # will be delivered when it comes back (such as on service restart).
  # NOTE: if true, client_id MUST be set
  persistent_session = false
  # If empty, a random client ID will be generated.
  client_id = ""

  data_format = "influx"

System info:

Telegraf 1.4.0, Windows 10

Steps to reproduce:

Send an InfluxDB line protocol line whose fields include both string and float values. It works when every field value is wrapped in double quotes (and so treated as a string), but Telegraf crashes when the float field values are left unquoted.

Expected behavior:

No error: the line is processed and Telegraf keeps running.

Actual behavior:

Telegraf quits and produces this error message. No additional errors are added to the log.

danielnelson commented 7 years ago

Thanks for opening the bug report. Would you be able to send me an example of the line protocol that is causing the crash?

kevinpreynolds commented 7 years ago

This works: IMW-4,event=Tracker\ Error count="1",text="Tracker error: (18825629) - Code: 13",lineNumber="229",procedure="DoOnError" 1506550891007000000

This doesn't: IMW-4,event=Tracker\ Error count=1,text="Tracker error: (18825629) - Code: 13",lineNumber=229,procedure="DoOnError" 1506550891007000000
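
The two lines differ only in field types. In line protocol a double-quoted value is a string, a bare number is a float, a number with a trailing `i` is an integer, and `true`/`false` variants are booleans. A minimal Python sketch of that classification follows; `classify_field_value` is a hypothetical helper for illustration, not part of Telegraf, and its regular expressions only approximate the real grammar.

```python
import re

# Approximate patterns for numeric field values (illustrative, not the full grammar).
_INTEGER = re.compile(r"^-?\d+i$")
_FLOAT = re.compile(r"^-?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")
_BOOLEANS = {"t", "T", "true", "True", "TRUE", "f", "F", "false", "False", "FALSE"}


def classify_field_value(token: str) -> str:
    """Return the line protocol type of a raw field value token (hypothetical helper)."""
    # A string field is wrapped in double quotes.
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return "string"
    # An integer carries a trailing "i"; without it a number is a float.
    if _INTEGER.match(token):
        return "integer"
    if token in _BOOLEANS:
        return "boolean"
    if _FLOAT.match(token):
        return "float"
    return "invalid"


if __name__ == "__main__":
    # Field values taken from the two lines above.
    for token in ['"1"', "1", '"229"', "229", '"DoOnError"']:
        print(token, "->", classify_field_value(token))
```

So `count="1"` in the first line is a string field, while `count=1` in the second is a float field.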

danielnelson commented 7 years ago

I am unable to reproduce this on Linux when publishing with:

mosquitto_pub -t telegraf -m 'IMW-4,event=Tracker\ Error count=1,text="Tracker error: (18825629) - Code: 13",lineNumber=229,procedure="DoOnError" 1506550891007000000'

Do you have just a single line in each message? I know there are some issues with DOS line endings.

Perhaps you can try publishing with this command?
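
To rule out DOS line endings on the publishing side, a payload can be normalized before it is sent. This is a minimal Python sketch, assuming the payload is UTF-8 text; `split_payload` is a hypothetical helper, not something Telegraf or Mosquitto provides.

```python
from typing import List


def split_payload(payload: bytes) -> List[str]:
    """Decode a payload and return its non-empty lines with CR/LF removed."""
    text = payload.decode("utf-8")
    # Convert DOS (\r\n) and bare \r line endings to \n before splitting.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return [line for line in normalized.split("\n") if line.strip()]


if __name__ == "__main__":
    payload = b"IMW-4,event=Tracker\\ Error count=1 1506550891007000000\r\n"
    lines = split_payload(payload)
    # A message that should carry a single metric yields exactly one line.
    print(len(lines), repr(lines[0]))
```

If the returned list has more than one entry, the message carries several lines, which is the case asked about above.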

kevinpreynolds commented 7 years ago

There's just a single line in each message.

Messages are being published out of ActiveMQ through the MQTT endpoint, also running on Windows.

I'll see if there are any DOS line endings in the lines that get published to telegraf and replace them.

kevinpreynolds commented 7 years ago

It fails using both ActiveMQ and Mosquitto running on Windows. Hidden characters have also been removed from the strings. I'll try running Mosquitto on Linux.

danielnelson commented 7 years ago

I could make you a build with an extra logging statement that you could run on Windows to capture the message as Telegraf sees it. I think that would give us the info we need.

kevinpreynolds commented 7 years ago

Thanks, that would be helpful. I installed Mosquitto on Ubuntu and used that as the broker. It still causes that same error.

danielnelson commented 7 years ago

This build should log one line per message: telegraf-1.5.0~e1468b8_windows_amd64.zip

kevinpreynolds commented 7 years ago

telegraf.log

Here's the log file using the new build.

danielnelson commented 7 years ago

Thank you. I can reproduce the crash with this line, so it should be fairly straightforward to fix:

IMW-4,event=Tracker\ Error count=1.0,text="Tracker error: (-1) - Failed to identify tracker type from "172.18.111.31".Please check network settings to verify Tracker @ "172.18.111.31" is reachable. If tracker is in WiFi mode, make sure you are connected to the tracker's SSID. Alternately, use the ",lineNumber=229.0,procedure="DoOnError" 1498077493081000000

The line should be rejected as invalid line protocol, because the field needs internal double quotes escaped (make sure to escape " and \ inside string fields):

-text="Tracker error: (-1) - Failed to identify tracker type from "172.18.111.31".Please check network settings to verify Tracker @ "172.18.111.31" is reachable. If tracker is in WiFi mode, make sure you are connected to the tracker's SSID. Alternately, use the "
+text="Tracker error: (-1) - Failed to identify tracker type from \"172.18.111.31\".Please check network settings to verify Tracker @ \"172.18.111.31\" is reachable. If tracker is in WiFi mode, make sure you are connected to the tracker's SSID. Alternately, use the "
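
On the publishing side, the escaping shown in the diff can be applied before the line is built. The Python sketch below is illustrative only: `escape_string_field` and `string_field_end` are hypothetical helpers, and the scanner is a simplified model of how a reader might locate the end of a string field, not Telegraf's actual parser.

```python
def escape_string_field(value: str) -> str:
    """Escape backslashes and double quotes for use inside a string field."""
    # Backslashes go first so the ones added for quotes are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def string_field_end(text: str, start: int) -> int:
    """Return the index of the quote closing the string opened at text[start].

    Returns -1 if the string is never closed.
    """
    if text[start] != '"':
        raise ValueError("start must point at an opening double quote")
    i = start + 1
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] == '"':
            return i
        i += 1
    return -1


if __name__ == "__main__":
    raw = 'type from "172.18.111.31". Please check'
    unescaped = 'text="' + raw + '"'
    escaped = 'text="' + escape_string_field(raw) + '"'
    opening = len("text=")
    # Unescaped: the string ends early, at the quote before the IP address.
    print(string_field_end(unescaped, opening), len(unescaped) - 1)
    # Escaped: the string ends at the final quote, as intended.
    print(string_field_end(escaped, opening), len(escaped) - 1)
```

With the unescaped value the scanner stops at the first internal quote, so the rest of the message is read as malformed field data. With the escaped value it reaches the final quote.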

kevinpreynolds commented 7 years ago

Thank you. Internal double quotes need to be escaped as well.

danielnelson commented 7 years ago

Let's leave this open until the panic is fixed.