influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Telegraf not processing mqtt data #13530

Closed: marioverhaeg closed this issue 1 year ago

marioverhaeg commented 1 year ago

Relevant telegraf.conf

[agent]
  collection_jitter = "0s"
  precision = ""
  debug = true
  logfile = "/var/log/telegraf/telegraf.log"
  hostname = ""
  omit_hostname = false
[[outputs.file]]
  files = ["stdout", "/var/log/telegraf/output.log"]
  data_format = "json"
[[outputs.influxdb_v2]]
  urls = ["http://192.168.x.x:8086"]
  token = "xx"
  organization = "Mario"
  bucket = "Bosch"
[[inputs.mqtt_consumer]]
  servers = ["tcp://192.168.x.x:1883"]
  username = "x"
  password = "x"
  topics = [
    "Bosch/CAM137/onvif-ej/RuleEngine/CountAggregation/Counter/&1/Molenhofweg"
  ]
  data_format = "json"

Logs from Telegraf

2023-06-30T07:43:02Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:12Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:12Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:22Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:22Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:32Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:32Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:42Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:42Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:52Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-06-30T07:43:52Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics

System info

Debian Bullseye, InfluxDB2, Telegraf 1.27

Docker

No response

Steps to reproduce

  1. Configure MQTT input
  2. Configure InfluxDB_v2 output
  3. Configure Mosquitto
  4. Validate Telegraf subscription in Mosquitto
  5. Validate Telegraf InfluxDB_v2 output configuration

Expected behavior

Telegraf processes MQTT messages. Debug logging contains reason why messages are not processed.

Actual behavior

Telegraf does not process MQTT messages. No error messages in Debug logging.

Additional info

Mosquitto logs:

8110927: Sending PUBLISH to Telegraf-Consumer-M0uQg (d0, q0, r0, m0, 'Bosch/CAM137/onvif-ej/RuleEngine/CountAggregation/Counter/&1/xx', ... (116 bytes))
1688110972: Received PINGREQ from Telegraf-Consumer-M0uQg
1688110972: Sending PINGRESP to Telegraf-Consumer-M0uQg
1688111032: Received PINGREQ from Telegraf-Consumer-M0uQg
1688111032: Sending PINGRESP to Telegraf-Consumer-M0uQg

srebhan commented 1 year ago

@marioverhaeg your config is not valid TOML so I wonder if this is your whole config!? If so, the data_format defaults to "influx"... Are you sure your messages are in Influx line-protocol!?!? You should see a parser warning if that's not the case...

marioverhaeg commented 1 year ago

Hi @srebhan, thank you for your comment. I've updated my initial post with the complete configuration file. I had already run into the data_format setting and configured it to JSON. You are right: if this were the issue I would expect to see a parser warning, but I'm not seeing one. That's the strange thing: nothing shows up in the debug logging, even though I've proven that the MQTT configuration is correct (Mosquitto publishes updates to the Telegraf-Consumer) and the InfluxDB configuration is correct (Telegraf CPU metrics are written).

powersj commented 1 year ago

> I've proven that the MQTT configuration is correct (Mosquitto publishes updates to the Telegraf-Consumer)

Your list of topics, an empty list, is still invalid ~TOML~ config.

Your logs are from the mosquitto server, and they demonstrate that the telegraf mqtt client is connected and responding to pings. The first message, about sending a publish, also indicates that the communication is working.

My suggestion is to grab the artifacts from https://github.com/influxdata/telegraf/pull/13478 which will have the debug messages from the MQTT client.

I would also ask that you provide an example of your data: the JSON parser might parse the message without error and still be unable to produce a meaningful metric, which would explain why nothing comes out.

This seems like a much better set of questions for the forums or Slack, as it is not clear that this is an actual issue with Telegraf.

marioverhaeg commented 1 year ago

My configuration is valid TOML, my copy/paste skills need some work ;-). I've updated the configuration in the original post.

JSON of the topic, observed in MQTT Explorer:

{
  "UtcTime": "2023-07-06T05:16:29.773Z",
  "Source": { "VideoSource": "1", "Rule": "Molenhofweg" },
  "Data": { "Count": "1146" }
}

If the JSON parser were not able to extract a metric, I would expect an error in the log. That is also what I understand to be the normal behavior.

As another test I've moved the MQTT broker to another system and captured network traffic between the broker and the telegraf system. The telegraf system receives the publish from the MQTT broker:

1 0.000000000 192.168.20.104 → 192.168.20.68 MQTT 291 Publish Message [Bosch/CAM137/onvif-ej/RuleEngine/CountAggregation/Counter/&1/Molenhofweg]
2 0.000034864 192.168.20.68 → 192.168.20.104 TCP 54 50018 → 1883 [ACK] Seq=1 Ack=238 Win=501 Len=0
3 4.878121241 192.168.20.68 → 192.168.20.104 MQTT 56 Ping Request
4 4.878361712 192.168.20.104 → 192.168.20.68 MQTT 56 Ping Response
5 4.878375498 192.168.20.68 → 192.168.20.104 TCP 54 50018 → 1883 [ACK] Seq=3 Ack=240 Win=501 Len=0
6 8.468639912 192.168.20.104 → 192.168.20.68 MQTT 248 Publish Message [Bosch/CAM137/onvif-ej/RuleEngine/CountAggregation/Counter/&1/Molenhofweg]
7 8.468705744 192.168.20.68 → 192.168.20.104 TCP 54 50018 → 1883 [ACK] Seq=3 Ack=434 Win=501 Len=0

I still don't see anything in the telegraf logging at that time (I'm running the #13478 artifacts).

2023-07-06T06:33:30Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:33:40Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:33:40Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:33:50Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:33:50Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:00Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:00Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:10Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:10Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:20Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:20Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:30Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:30Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:40Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-07-06T06:34:40Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics

powersj commented 1 year ago

> If the JSON parser would not be able to extract a metric I would expect an error in the log. This is what I also understand should be normal behavior.

There will only be a log if it fails to parse, not if it fails to create any metric.
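A likely explanation for the silent drop, given the payload posted earlier in the thread: Telegraf's default JSON parser ignores string values unless they are listed in tag_keys or json_string_fields, and every value in that payload is a quoted string. The Python sketch below is a simplified emulation of that behavior, not Telegraf's actual code. The function name flatten_fields, its string_fields parameter, and the handling of numbers and booleans are assumptions made for illustration.

```python
import json

# Payload copied from the thread (as observed in MQTT Explorer).
PAYLOAD = (
    '{"UtcTime": "2023-07-06T05:16:29.773Z", '
    '"Source": {"VideoSource": "1", "Rule": "Molenhofweg"}, '
    '"Data": {"Count": "1146"}}'
)


def flatten_fields(obj, string_fields=(), prefix=""):
    """Hypothetical emulation: flatten nested objects with "_" and keep only
    numeric/boolean values, plus strings explicitly listed in string_fields."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            fields.update(flatten_fields(value, string_fields, name))
        elif isinstance(value, bool):  # check bool before int: bool is an int subclass
            fields[name] = value
        elif isinstance(value, (int, float)):
            fields[name] = float(value)
        elif isinstance(value, str) and name in string_fields:
            fields[name] = value
        # Anything else (including unlisted strings) is skipped without an error.
    return fields


doc = json.loads(PAYLOAD)
default_fields = flatten_fields(doc)
opted_in_fields = flatten_fields(doc, string_fields=("Data_Count",))

print(default_fields)   # {} -> parsing succeeded, but there is nothing to emit
print(opted_in_fields)  # {'Data_Count': '1146'}
```

In this emulation the payload parses cleanly and still yields zero fields, which matches the symptom in the thread: no parse error, and no metric.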

> I still don't see anything in the telegraf logging at that time (I'm running the https://github.com/influxdata/telegraf/pull/13478 artifacts).

Can you remove your logging to a file and see what is reported to stdout? There should be hundreds of lines of debug messages from the mqtt client that would confirm what it is getting.

marioverhaeg commented 1 year ago

> There will only be a log if it fails to parse, not if it fails to create any metric.

Then my bug would actually be an enhancement request. If Telegraf is connected to an MQTT broker and is receiving messages on a topic, I would expect some indication that it cannot extract information from that message or that it has dropped a message because it couldn't handle it.

I have amended my configuration with the json_v2 parser and I'm seeing data now:

[[inputs.mqtt_consumer.json_v2]]
  measurement_name = "Counter"
  [[inputs.mqtt_consumer.json_v2.field]]
    path = "Data.Count"
    type = "int"
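To illustrate what the field definition above asks for, here is a rough Python sketch that follows a dotted path into the payload and converts the value to an integer. It is only an approximation: json_v2 uses GJSON path syntax, which supports far more than plain dotted keys, and the helper name extract_int is hypothetical.

```python
import json

# Payload copied from the thread (as observed in MQTT Explorer).
PAYLOAD = (
    '{"UtcTime": "2023-07-06T05:16:29.773Z", '
    '"Source": {"VideoSource": "1", "Rule": "Molenhofweg"}, '
    '"Data": {"Count": "1146"}}'
)


def extract_int(payload: str, path: str) -> int:
    """Walk a dotted path such as "Data.Count" and convert the value to int.

    Hypothetical helper: handles plain object keys only, unlike real GJSON paths.
    """
    value = json.loads(payload)
    for key in path.split("."):
        value = value[key]  # raises KeyError if the path does not exist
    return int(value)       # raises ValueError if the value is not an integer


count = extract_int(PAYLOAD, "Data.Count")
print({"measurement": "Counter", "fields": {"Count": count}})
```

The explicit type = "int" is what makes the difference here: the quoted "1146" becomes a numeric field instead of an ignored string.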

My problem is solved, but the request to add some logging on this still stands. I leave it up to you to evaluate if you think this is useful. Thank you for the help!

powersj commented 1 year ago

@marioverhaeg - I do think that this scenario has come up from time to time. I have debated if a message should be added but I also worry about lots of additional messages for some users. As a result, what do you think about a debug message?

@srebhan thoughts on this as well?

srebhan commented 1 year ago

@powersj I think this should be at most a debug message (maybe logged only once). There are some use-cases where you want to selectively parse only certain messages but ignore others, so an error would break those guys....
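The "debug message, maybe only once" idea can be sketched as follows. This is a hypothetical Python illustration of the logging pattern only; it is not the change made in the Telegraf pull request, and the class name EmptyResultNotifier and the message text are invented for the example.

```python
import logging

logger = logging.getLogger("mqtt_consumer_sketch")


class EmptyResultNotifier:
    """Log a single debug message the first time parsing yields no metrics."""

    def __init__(self, log: logging.Logger):
        self._log = log
        self._notified = False

    def observe(self, metric_count: int) -> bool:
        """Record one parse result; return True only if a message was logged."""
        if metric_count > 0 or self._notified:
            return False
        self._log.debug(
            "message parsed without error but produced no metrics; "
            "check the parser configuration"
        )
        self._notified = True
        return True


notifier = EmptyResultNotifier(logger)
results = [notifier.observe(n) for n in (2, 0, 0, 5, 0)]
print(results)  # [False, True, False, False, False]
```

Logging at debug level and only once keeps the log quiet for setups that intentionally ignore some messages, while still giving a hint to anyone who turns on debug output.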

powersj commented 1 year ago

@srebhan can you take a look at https://github.com/influxdata/telegraf/pull/13574?

marioverhaeg commented 1 year ago

With a debug message I would have figured out what was going on on my own. The first thing I did was set the log-level to debug.