influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

strconv.ParseFloat: parsing "???": invalid syntax #4827

Closed Teumaat closed 5 years ago

Teumaat commented 5 years ago

Telegraf 1.8.1 (git: HEAD ae9efb2f)
Ubuntu 16.04.5 LTS

telegraf.conf:

  [[inputs.mqtt_consumer]]
    ...
    data_format = "value"
    data_type = "float"

syslog error:

  2018-10-08T07:59:30Z E! Error in plugin [inputs.mqtt_consumer]: E! MQTT Parse Error message: ??? error: strconv.ParseFloat: parsing "???": invalid syntax

MQTT value:

  events/cv/otmonitor/modulation ???

I am running OpenTherm monitor (http://otgw.tclcode.com/) to publish stats about my HVAC to MQTT. One of the values contains ??? which Telegraf does not seem to like. This is not a problem for me but Telegraf seems to hang on this parse error and does nothing further.
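For reference, the error text in the log is what Go's `strconv.ParseFloat` returns for a non-numeric string. The sketch below reproduces that message and shows one way a consumer could skip a bad payload and carry on. `parseFloatPayload` is a hypothetical helper written for this illustration, not Telegraf's actual parser code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFloatPayload is a hypothetical helper (not taken from Telegraf).
// It trims surrounding whitespace and parses the payload as a 64-bit float.
func parseFloatPayload(payload string) (float64, error) {
	return strconv.ParseFloat(strings.TrimSpace(payload), 64)
}

func main() {
	// "???" is the payload reported on events/cv/otmonitor/modulation.
	for _, payload := range []string{"42.5", "???"} {
		v, err := parseFloatPayload(payload)
		if err != nil {
			// Report the bad payload and move on to the next message.
			fmt.Printf("skipping payload %q: %v\n", payload, err)
			continue
		}
		fmt.Printf("parsed %v\n", v)
	}
}
```

The second iteration prints the same `strconv.ParseFloat: parsing "???": invalid syntax` text seen in the syslog line, and the loop keeps running afterwards.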

glinton commented 5 years ago

Do you have a simple way to reproduce this (config example/test)? thanks

danielnelson commented 5 years ago

Does this exact topic contain good data as well? If not, perhaps we can just exclude this topic?

Teumaat commented 5 years ago

> Do you have a simple way to reproduce this (config example/test)? thanks

What do you need? I have included a snippet from my config file and the MQTT topic with value.

> Does this exact topic contain good data as well? If not, perhaps we can just exclude this topic?

No, this topic always contains ??? because my HVAC does not supply this information on the OpenTherm bus. How can I exclude a topic? Still, I think this needs to be fixed. I don't see why a single incorrect value can break the entire workings of Telegraf.

danielnelson commented 5 years ago

Using the topics option, can you avoid subscribing to this topic? I don't think there is a way to exclude topics in MQTT, but maybe you can select around this topic.

  ## Topics to subscribe to
  topics = [
    "telegraf/host01/cpu",
    "telegraf/+/mem",
    "sensors/#",
  ]
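To check which filters would still deliver the unwanted topic, the MQTT wildcard rules can be sketched as a small matcher: `+` matches exactly one topic level and `#` matches the parent level plus everything below it. `matchTopic` is a hypothetical helper for illustration, not part of Telegraf or any MQTT client library, and it ignores the special handling of topics that begin with `$`.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTopic reports whether an MQTT topic filter matches a topic name.
// Hypothetical helper for illustration only.
func matchTopic(filter, topic string) bool {
	f := strings.Split(filter, "/")
	t := strings.Split(topic, "/")
	for i, level := range f {
		if level == "#" {
			// "#" is only valid as the last level and matches the rest.
			return i == len(f)-1
		}
		if i >= len(t) {
			return false // filter has more levels than the topic
		}
		if level != "+" && level != t[i] {
			return false // literal level mismatch
		}
	}
	// No multi-level wildcard: the level counts must agree.
	return len(f) == len(t)
}

func main() {
	unwanted := "events/cv/otmonitor/modulation"
	// The filters below are examples; only the unwanted topic comes from the report.
	for _, filter := range []string{
		"events/cv/otmonitor/#",
		"events/cv/otmonitor/+",
		"events/cv/otmonitor/example_value",
	} {
		fmt.Printf("%-36s delivers unwanted topic: %v\n", filter, matchTopic(filter, unwanted))
	}
}
```

Both wildcard filters match the `modulation` topic, so selecting around it means listing the wanted topics explicitly, as the last (made-up) filter does.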

> I don't see why a single incorrect value can break the entire workings of Telegraf.

I don't see any reason in the current code that this would cause processing to stop, but we will try to replicate with a real MQTT broker. However, since you are subscribed to a topic whose messages contain data formatted in an unexpected way, I don't think we would remove the log message.

Teumaat commented 5 years ago

I think I might be having issue 4594 because Telegraf just stops reporting. When I try to kill Telegraf it also does not want to die. Only a kill -9 does the trick.

Yes seems that way:

  2018-10-10T14:05:10Z I! MQTT Client Connected
  2018-10-10T14:05:10Z I! MQTT Client Connected

danielnelson commented 5 years ago

I put together a PR for the connection/reconnection issues (#4846). Do you think you could test with one of the builds here: https://github.com/influxdata/telegraf/pull/4846#issuecomment-428762836

On the issue where you cannot shut down, I believe I have a fix for that too, but the PR is not quite ready. There are a few open issues that describe this or are closely related: #4610, #4457. I'm planning to have this fixed in 1.9, as it ended up requiring some rather large changes.

Teumaat commented 5 years ago

Nice :) I've installed the deb and everything seems to be working fine for now. I have restarted Telegraf a few times and I have not seen any double connects so that fix is working :)