influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License
14.63k stars 5.58k forks source link

inputs.s7comm bit parsing #14043

Closed phagemann closed 11 months ago

phagemann commented 1 year ago

Relevant telegraf.conf

[[inputs.s7comm]]
  server = "10.0.0.1:102"
  rack = 0
  slot = 2

  ## Metric definition(s)
  [[inputs.s7comm.metric]]
    name = "Sinumerik"

    fields = [  
      {address= "DB1050.X0.1",    name= "RedGrindingActive"     },
      {address= "DB1050.X2.1",    name= "BlueGrindingActive"    },

      {address= "DB21.X35.5",       name= "NC running"           },
      {address= "DB21.X36.6",       name= "Channnel Alarm active"},

      {address= "PE32.X32.1",  name= "Controll On"      },
      {address= "PE34.X34.2",  name= "Emergency Stop"   },
      {address= "PE32.X32.7",  name= "Automatic active" },
      {address= "PE2.X2.7",   name= "KeyPos0-Auto"     },
      {address= "PE2.X2.6",   name= "KeyPos1"          },
      {address= "PE3.X3.6",   name= "KeyPos2"          },
      {address= "PE4.X4.4",   name= "KeyPos3-Setup"    },

    ]

    ## Tags assigned to the metric
    [inputs.s7comm.metric.tags]
      device = "Sinumerik"

[[outputs.discard]]

Logs from Telegraf

2023-10-03T09:28:28Z I! Loading config: telegraf.conf
2023-10-03T09:28:28Z I! Starting Telegraf 1.29.0-5eaabde9 brought to you by InfluxData the makers of InfluxDB
2023-10-03T09:28:28Z I! Available plugins: 15 inputs, 9 aggregators, 29 processors, 24 parsers, 4 outputs, 5 secret-stores
2023-10-03T09:28:28Z I! Loaded inputs: s7comm
2023-10-03T09:28:28Z I! Loaded aggregators: 
2023-10-03T09:28:28Z I! Loaded processors: 
2023-10-03T09:28:28Z I! Loaded secretstores: 
2023-10-03T09:28:28Z I! Loaded outputs: discard
2023-10-03T09:28:28Z I! Tags enabled: host=XXXXX
2023-10-03T09:28:28Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"XXXXX", Flush Interval:10s
2023-10-03T09:28:28Z D! [agent] Initializing plugins
2023-10-03T09:28:28Z D! [agent] Connecting outputs
2023-10-03T09:28:28Z D! [agent] Attempting connection to [outputs.discard]
2023-10-03T09:28:28Z D! [agent] Successfully connected to outputs.discard
2023-10-03T09:28:28Z D! [agent] Starting service inputs
2023-10-03T09:28:30Z D! [inputs.s7comm] Reading batch 1...
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [1] for field "RedGrindingActive" @ 0 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "BlueGrindingActive" @ 2 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [1] for field "NC running" @ 35 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [1] for field "Channnel Alarm active" @ 36 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "Controll On" @ 32 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "Emergency Stop" @ 34 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "Automatic active" @ 32 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "KeyPos0-Auto" @ 2 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "KeyPos1" @ 2 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [0] for field "KeyPos2" @ 3 --> false (bool)
2023-10-03T09:28:30Z D! [inputs.s7comm]   got [1] for field "KeyPos3-Setup" @ 4 --> false (bool)
^C2023-10-03T09:28:34Z D! [agent] Stopping service inputs
2023-10-03T09:28:34Z D! [agent] Input channel closed
2023-10-03T09:28:34Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-10-03T09:28:34Z D! [outputs.discard] Wrote batch of 1 metrics in 571ns
2023-10-03T09:28:34Z D! [outputs.discard] Buffer fullness: 0 / 10000 metrics
2023-10-03T09:28:34Z I! [agent] Stopping running outputs
2023-10-03T09:28:34Z D! [agent] Stopped Successfully

System info

Telegraf 1.29, Ubuntu 22.04 LTS, S7 CPU 317-3PN/DP

Docker

No response

Steps to reproduce

  1. Run Telegraf 1.29 with supplied config against S7 CPU 317-3PN/DP

Expected behavior

Values to be parsed correctly as indicated in Debug log. See Actual behavior

Actual behavior

got [1] for field "NC running" @ 35 --> false (bool)

Value being parsed to false independently of actual value indicate by byte [1]

Additional info

Change parsing function in 'telegraf/plugins/inputs/s7comm/type_conversion.go' Line 16 to

case "X":
    return func(buf []byte) interface{} {
        return buf[0] != 0
    }

Might be a local problem. @srebhan Could you please explain the original function (buf[0] & (1 << extra)) != 0 so I don't miss something?

phagemann commented 1 year ago

The issue seems to be caused by gos7 itself. When querying a single bool in a batch, gos7 builds a different address opposed to a byte. Provided the following S7DataItemStructs

dataItems := []gos7.S7DataItem{
    {s7areape, 0x01, 0, 32, 1, make([]byte, 2), ""},
    {s7areape, 0x02, 0, 32, 1, make([]byte, 2), ""},
}

I would expect gos7 to query address 32. But instead it queries the following (inspected by wireshark): image

The issue seems to be https://github.com/robinson/gos7/blob/ebce1a99aa32e62adad61e5dffc5a8e316cd5f18/multi.go#L165-L169 Where bit, counter and timer addresses are parsed different. For bits this seems to be wrong, counters and timers I can't test as of right now. Based on the screenshot above bits addresses should be parsed like the other types. As there is no way to encode the bit address in S7Comm struct I suggest to query bit addresses as bytes https://github.com/influxdata/telegraf/blob/f4c56e1597956cd95eb37336e0c08ae361338aba/plugins/inputs/s7comm/s7comm.go#L43 and decode it to bit in
https://github.com/influxdata/telegraf/blob/f4c56e1597956cd95eb37336e0c08ae361338aba/plugins/inputs/s7comm/type_conversions.go#L16

Although this introduces some inefficiencies (querying the same byte-address multiple times for different bits), from my point of view this seems to be the best solution, as otherwise significant changes to gos7 would be required.

Would be happy to provide a PR after some discussion =)

powersj commented 1 year ago

The issue seems to be caused by gos7 itself.

I would not want to take a temporary fix that would need to get reverted or changed, when the upstream has been responsive to PRs.

Can you please put in a fix to the upstream library and we would be happy to update to the latest version of it.

Thanks!