influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

parse float error strconv in inputs.exec #11531

Closed kurtbeil01 closed 2 years ago

kurtbeil01 commented 2 years ago

Relevant telegraf.conf

# # Configuration for sending metrics to InfluxDB 2.0
 [[outputs.influxdb_v2]]
#   ## The URLs of the InfluxDB cluster nodes.
#   ##
#   ## Multiple URLs can be specified for a single cluster, only ONE of the
#   ## urls will be written to each interval.
#   ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
#   urls = ["http://127.0.0.1:8086"]
    urls = ["https://FQDN-NAME:8086"]
#
#   ## Token for authentication.
   token = "valid token"
#
#   ## Organization is the name of the organization you wish to write to.
   organization = "ORG-NAME"
#
#   ## Destination bucket to write into.
   bucket = "BUCKET-NAME"
#
#   ## The value of this tag will be used to determine the bucket.  If this
#   ## tag is not set the 'bucket' option is used as the default.
#   # bucket_tag = ""
#
#   ## If true, the bucket tag will not be added to the metric.
#   # exclude_bucket_tag = false
#
#   ## Timeout for HTTP messages.
#   # timeout = "5s"
#
#   ## Additional HTTP headers
#   # http_headers = {"X-Special-Header" = "Special-Value"}
#
#   ## HTTP Proxy override, if unset values the standard proxy environment
#   ## variables are consulted to determine which proxy, if any, should be used.
#   # http_proxy = "http://corporate.proxy:3128"
#
#   ## HTTP User-Agent
#   # user_agent = "telegraf"
#
#   ## Content-Encoding for write request body, can be set to "gzip" to
#   ## compress body or "identity" to apply no encoding.
#   # content_encoding = "gzip"
#
#   ## Enable or disable uint support for writing uints influxdb 2.0.
#   # influx_uint_support = false
#
#   ## Optional TLS Config for use on HTTP connections.
#   # tls_ca = "/etc/telegraf/ca.pem"
#   # tls_cert = "/etc/telegraf/cert.pem"
#   # tls_key = "/etc/telegraf/key.pem"
#   ## Use TLS but skip chain & host verification
#   # insecure_skip_verify = false
##################################################################################################
# TD performance with PowerShell
  [[inputs.exec]]
    commands = ["powershell.exe -NoProfile 'D:/programme/telegraf/td-perf.ps1'"]
    csv_column_types = ["string","float","int"]
    csv_tag_columns = ["SYSTEM"]
    interval = "5m"
    timeout = "290s"
    data_format = "csv"
    csv_skip_rows = 0
    csv_header_row_count = 1
    csv_delimiter = ";"
    name_suffix = "_MSTR_PERF"

Logs from Telegraf

2022-07-20T10:52:51Z I! Loaded inputs: cpu disk diskio exec (2x) mem processes swap system
2022-07-20T10:52:51Z I! Loaded aggregators: 
2022-07-20T10:52:51Z I! Loaded processors: 
2022-07-20T10:52:51Z I! Loaded outputs: influxdb_v2
2022-07-20T10:52:51Z I! Tags enabled: host=WIKI-PROD-FRA
2022-07-20T10:52:51Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"WIKI-PROD-FRA", Flush Interval:10s
2022-07-20T10:52:51Z W! [inputs.processes] Current platform is not supported
2022-07-20T11:00:08Z E! [inputs.exec] Error in plugin: column type: parse float error strconv.ParseFloat: parsing "RUNTIME": invalid syntax
2022-07-20T11:00:17Z E! [inputs.exec] Error in plugin: column type: parse float error strconv.ParseFloat: parsing "RUNTIME": invalid syntax

System info

Telegraf 1.21.0, Windows Server 2019

Docker

No response

Steps to reproduce

  1. inputs.exec executes a PowerShell script which produces output in CSV format:

"SYSTEM";"RUNTIME";"ROWS"
TD;23.6520979;107

  2. This output is written to the InfluxDB v2 database.
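To make the intended result concrete, here is a minimal sketch of how the sample output above is expected to map onto the configured column types. It is written in Python for brevity and is not Telegraf's parser; the function name `parse_with_header` and the `CONVERTERS` table are assumptions made up for this illustration.

```python
import csv
import io

# Sample output copied from the report: one header row, one data row.
SAMPLE = '"SYSTEM";"RUNTIME";"ROWS"\nTD;23.6520979;107\n'

# Hypothetical converters mirroring csv_column_types = ["string","float","int"].
CONVERTERS = {"string": str, "float": float, "int": int}


def parse_with_header(text, column_types, delimiter=";"):
    """Parse CSV text whose first row is a header; return one dict per data row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    records = []
    for row in data:
        record = {}
        # Convert each cell with the type configured for its column position.
        for name, type_name, cell in zip(header, column_types, row):
            record[name] = CONVERTERS[type_name](cell)
        records.append(record)
    return records


records = parse_with_header(SAMPLE, ["string", "float", "int"])
print(records)
```

The header row only supplies the column names here; type conversion is applied to the data row alone, which is the behavior the report expects on every invocation.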

Everything was working without issue with Telegraf v1.18.1. After upgrading to Telegraf 1.23.2 we started getting conversion errors:
2022-07-20T11:00:17Z E! [inputs.exec] Error in plugin: column type: parse float error strconv.ParseFloat: parsing "RUNTIME": invalid syntax

After further investigation we were able to narrow this error down. The last working Telegraf version is 1.20.4. With Telegraf 1.21.0 and every newer version we get this error. In the release notes of 1.21.0 we don't see any change that should affect inputs.exec, inputs.file or outputs.influxdb_v2.

One more interesting detail we found during the investigation: after a restart of the Telegraf 1.21.0 service it works exactly once, so one record is written to the InfluxDB v2 database. All further calls lead to "parse float error strconv" errors.

Expected behavior

Every call of the PowerShell script from inputs.exec with the latest Telegraf version writes a record to influxdb_v2 without any conversion errors.

Actual behavior

The first call of the PowerShell script from inputs.exec after a restart of the Telegraf 1.21.0 service writes a record to influxdb_v2 without a conversion error. All further calls end with a conversion error.
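The "works once, then fails" pattern is what one would see if a single parser instance were reused across invocations and only skipped the header on its first call. The sketch below models that idea in Python. It is a hypothetical illustration of the symptom, not Telegraf's actual code, and the class `StatefulCsvParser` and its attributes are invented for this example.

```python
import csv
import io

SAMPLE = '"SYSTEM";"RUNTIME";"ROWS"\nTD;23.6520979;107\n'
CONVERTERS = {"string": str, "float": float, "int": int}


class StatefulCsvParser:
    """Hypothetical parser that consumes header rows only on its first call."""

    def __init__(self, column_types, header_row_count=1, delimiter=";"):
        self.column_types = column_types
        self.delimiter = delimiter
        # Assumption for this sketch: the counter is never reset between calls.
        self.remaining_header_rows = header_row_count
        self.column_names = []

    def parse(self, text):
        records = []
        for row in csv.reader(io.StringIO(text), delimiter=self.delimiter):
            if self.remaining_header_rows > 0:
                self.column_names = row
                self.remaining_header_rows -= 1
                continue
            record = {}
            for name, type_name, cell in zip(self.column_names, self.column_types, row):
                record[name] = CONVERTERS[type_name](cell)
            records.append(record)
        return records


parser = StatefulCsvParser(["string", "float", "int"])
first = parser.parse(SAMPLE)  # header is skipped, data row is converted

try:
    parser.parse(SAMPLE)  # header row is now treated as data
    second_error = None
except ValueError as exc:
    second_error = str(exc)

print(first)
print(second_error)
```

On the second call the header cell "RUNTIME" reaches the float conversion, which is the same kind of failure as the strconv.ParseFloat message in the log. Whether this is the real cause in 1.21.0 is not confirmed by this report.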

Additional info

Telegraf 1.21.0 was started around 2022-07-20T10:52:00.000Z, and at 10:56 one record was written to influxdb_v2. After that no more records were written and we see only conversion errors in the log.

Influx Data Explorer output:
0 exec_MSTR_PERF RUNTIME 7.6900282 2022-07-20T09:18:12.400Z 2022-07-20T12:18:12.400Z 2022-07-20T10:56:00.000Z WIKI-PROD-FRA TD

Error in log:
2022-07-20T11:00:08Z E! [inputs.exec] Error in plugin: column type: parse float error strconv.ParseFloat: parsing "RUNTIME": invalid syntax

powersj commented 2 years ago

@srebhan

After restart of telegraf service 1.21.0 it is working exactly one time,

Is this the same as #10678?

kurtbeil01 commented 2 years ago

@powersj, thanks for this hint. Seems to be the same as #10678 or #10347

Got it working with the following workaround!

1) Removed the header line from the CSV output. Instead of

"SYSTEM";"RUNTIME";"ROWS"
TD;23.6520979;107

we now use

TD;23.6520979;107

2) Modified telegraf.conf: set csv_header_row_count to 0 and added csv_column_names.

Old:
csv_header_row_count = 1

New:
csv_header_row_count = 0
csv_column_names = ["SYSTEM","RUNTIME","ROWS"]

After those changes we no longer get the "parse float error strconv" error (tested with v 1.21.0 and 1.23.2)
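One way to see why the workaround is robust: with no header row and explicit column names, parsing needs no state that carries over from one invocation to the next. The following Python sketch illustrates this under that assumption; `parse_headerless` is a hypothetical helper, not a Telegraf function.

```python
import csv
import io

# Headerless output as produced by the modified script.
HEADERLESS_SAMPLE = "TD;23.6520979;107\n"

# Mirrors csv_column_names and csv_column_types from the workaround config.
COLUMN_NAMES = ["SYSTEM", "RUNTIME", "ROWS"]
COLUMN_TYPES = ["string", "float", "int"]
CONVERTERS = {"string": str, "float": float, "int": int}


def parse_headerless(text, column_names, column_types, delimiter=";"):
    """Parse CSV text that has no header row, using explicitly given names."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        record = {}
        for name, type_name, cell in zip(column_names, column_types, row):
            record[name] = CONVERTERS[type_name](cell)
        records.append(record)
    return records


# Repeated calls behave identically because every row is a data row.
results = [
    parse_headerless(HEADERLESS_SAMPLE, COLUMN_NAMES, COLUMN_TYPES)
    for _ in range(3)
]
print(results)
```

Because every row is treated as data, there is no header cell that could end up in a float conversion, regardless of how often the parser is invoked.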

So Telegraf has had a problem with the header line since version 1.21.0.