influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License
14.6k stars 5.57k forks source link

[inputs.mqtt_consumer] Error in plugin: connection lost: EOF #12046

Closed psxde closed 2 years ago

psxde commented 2 years ago

Relevant telegraf.conf

[[inputs.mqtt_consumer]]
  servers = ["tcp://mqtt.domain.com:1883"]
  topics = [
    "ecu/r35/out"
  ]
  topic_tag = "topic"
  connection_timeout = "30s"
  max_undelivered_messages = 1500
  persistent_session = true
  client_id = "telegraf_mqtt"
  username = "user"
  password = "pass"
  name_override = "mqtt"
  data_format = "json"

[[inputs.mqtt_consumer]]
  servers = ["tcp://mqtt.domain.com:1883"]
  username = "user"
  password = "pass"
  client_id = "telegraf_tasmota"
  topics = ["tele/+/SENSOR"]
  data_format = "json"
  name_override = "tasmota"

[[inputs.mqtt_consumer]]
  servers = ["tcp://mqtt.domain.com:1883"]
  username = "user"
  password = "pass"
  client_id = "telegraf_test"
  topics = ["test/sensor/#"]
  data_format = "value"
  data_type = "float"
  topic_tag = ""

[[inputs.mqtt_consumer.topic_parsing]]
  topic = "test/+/+/+"
  measurement = "measurement/_/_/_"
  tags = "_/_/sensor/_"

Logs from Telegraf

2022-10-18T17:15:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:15:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:15:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:16:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:16:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:16:07Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:17:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:17:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF
2022-10-18T17:17:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: EOF

System info

Telegraf 1.23.0, FreeBSD 13.1-RELEASE-p1

Docker

No response

Steps to reproduce

  1. start mosquitto
  2. start telegraf
  3. send metadata to mosquitto
  4. look into influx db

Expected behavior

Metadata from mqtt is written in influxdb

Actual behavior

Metadata of client_id telegraf_test is written in database, but either telegraf_mqtt or telegraf_tasmota is written into database. Its always the same mosquitto server. Currently telegraf_mqtt is not working since yesterday

Additional info

mosquitto log:

1666113690: New connection from x.x.x.x:3566 on port 1883. 1666113690: New connection from x.x.x.x:18125 on port 1883. 1666113690: New client connected from x.x.x.x:3566 as telegraf_tasmota (p2, c1, k60, u'mqtt'). 1666113690: New client connected from x.x.x.x:18125 as telegraf_test (p2, c1, k60, u'mqtt'). 1666113691: New connection from x.x.x.x:14938 on port 1883. 1666113720: Client telegraf_mqtt closed its connection. 1666113720: Client telegraf_tasmota closed its connection. 1666113722: Client telegraf_test closed its connection. 1666113750: New connection from x.x.x.x:46840 on port 1883. 1666113750: New connection from x.x.x.x:49499 on port 1883. 1666113750: New client connected from x.x.x.x:46840 as telegraf_tasmota (p2, c1, k60, u'mqtt'). 1666113750: New client connected from x.x.x.x:49499 as telegraf_test (p2, c1, k60, u'mqtt'). 1666113750: New connection from x.x.x.x:55590 on port 1883. 1666113750: New client connected from x.x.x.x:55590 as telegraf_mqtt (p2, c1, k60, u'mqtt'). 1666113780: Client telegraf_tasmota closed its connection. 1666113780: Client telegraf_mqtt closed its connection.

powersj commented 2 years ago

Currently telegraf_mqtt is not working since yesterday

This was previously working and suddenly stopped working? Did you update telegraf to a new version? If not, you need to determine what changed.

Based on that error message, it sounds like you have something new in your networking or data that has changed.

psxde commented 2 years ago

Hi, the problem is that nothing has changed in the configuration. The last time it changed between telegraf_mqtt and telegraf_tasmota after restarting telegraf, but now it changes also over night without restart and from one moment to the other there are no data written anymore to influx. Also there is no helpful log message to determine what the problem is :(

powersj commented 2 years ago

nothing has changed in the configuration

If you haven't changed your config and you haven't changed telegraf versions, then this looks and sounds like something with the mqtt server or your network and not telegraf, right?

With the logs from mqtt:

1666113720: Client telegraf_mqtt closed its connection. 1666113720: Client telegraf_tasmota closed its connection. 1666113722: Client telegraf_test closed its connection.

I would see if you can run mosquitto with more verbose logs to see the full connection details to see what the server sees in more detail.

Searching around and there are some similar reports on the mosquitto repo (e.g #2381 - networking?, #2201 - mosquitto version)

Based on what you have shown I don't see how this is an issue with Telegraf. If you want to get additional help, you could post this on our forums or slack to see if other users have come across this.

psxde commented 2 years ago

Hmmm, there is no special verbose message in mosquitto. As soon as the client (telegraf) closes the connection, the above debug message appears in the telegraf log.

1666129560: New client connected from x.x.x.x:50028 as telegraf_mqtt (p2, c0, k60, u'mqtt'). 1666129560: No will message specified. 1666129560: Sending CONNACK to telegraf_mqtt (1, 0) 1666129560: New client connected from x.x.x.x:7681 as telegraf_tasmota (p2, c1, k60, u'mqtt'). 1666129560: No will message specified. 1666129560: Sending CONNACK to telegraf_tasmota (0, 0) 1666129560: Received SUBSCRIBE from telegraf_test 1666129560: test/sensor/# (QoS 0) 1666129560: telegraf_test 0 test/sensor/# 1666129560: Sending SUBACK to telegraf_test 1666129560: Received SUBSCRIBE from telegraf_tasmota 1666129560: tele/+/SENSOR (QoS 0) 1666129560: telegraf_tasmota 0 tele/+/SENSOR 1666129560: Sending SUBACK to telegraf_tasmota 1666129571: Received PUBLISH from test-aabbccddeeff (d0, q0, r1, m0, 'test/sensor/temperature/state', ... (4 bytes)) 1666129571: Sending PUBLISH to telegraf_test (d0, q0, r0, m0, 'test/sensor/temperature/state', ... (4 bytes)) 1666129590: Client telegraf_mqtt closed its connection. 1666129590: Client telegraf_tasmota closed its connection.

Currently all three subscriptions are working, but the telegraf EOF error persists. It not seems to say anything about if it is working or not. I will try it again tomorrow.

psxde commented 2 years ago

Thank you I think I have found the problem: There was a reverse proxy in between which has a timeout after 30 seconds. If the sensor sends values every minute and telegraf connects every minute in the wrong 30 seconds, they miss each other. I have now fixed the connection timeout and it works.

powersj commented 2 years ago

Nice, thanks for following up!