influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License
14.51k stars 5.55k forks source link

Nokia sensor broken after 1.15.4 #12504

Closed bofh16 closed 1 year ago

bofh16 commented 1 year ago

Relevant telegraf.conf

[[inputs.gnmi]]
        addresses = ["host:57400"]
        username = "user"
        password = "pass"
        encoding = "json"
        redial = "10s"

        [[inputs.gnmi.subscription]]
                name = "sros_chassis_stats"
                path = "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
                subscription_mode = "sample"
                sample_interval = "30s"

Logs from Telegraf

2023-01-13T18:28:35Z E! [telegraf] Error running agent: starting input inputs.gnmi: invalid string path /state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]: invalid path element chassis[chassis-class=*]chassis-number[chassis-number=*]: malformed List key-value pair string: chassis-number[chassis-number=*], in: [chassis-class=*]chassis-number[chassis-number=*]

System info

Telegraf 1.25.0 on various distributions.

Docker

No response

Steps to reproduce

  1. Configure the input plugin as above.
  2. Restart the service.

Expected behavior

Telegraf to collect the sensor data. It works on 1.15.4.

Actual behavior

It doesn't work in 1.2[45]. Can't say for sure, where exactly it went wrong in between.

Additional info

The sensor path has been also verified with gNMI client from cli.

powersj commented 1 year ago

Looks like the change happened between v1.22.2 and v1.22.3:

❯ ../telegraf-builds/telegraf-v1.22.2 --config config.toml 
2023-01-13T19:18:50Z I! Starting Telegraf 1.22.2
2023-01-13T19:18:50Z I! Loaded inputs: gnmi
2023-01-13T19:18:50Z I! Loaded aggregators: 
2023-01-13T19:18:50Z I! Loaded processors: 
2023-01-13T19:18:50Z I! Loaded outputs: file
2023-01-13T19:18:50Z I! Tags enabled: host=ryzen
2023-01-13T19:18:50Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ryzen", Flush Interval:10s
2023-01-13T19:18:54Z E! [inputs.gnmi] Error in plugin: failed to setup subscription: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup host: no such host"
^C2023-01-13T19:18:58Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-01-13T19:18:58Z I! [agent] Stopping running outputs
telegraf on  master [?] via 🐹 v1.19.5 took 7s 
❯ ../telegraf-builds/telegraf-v1.22.3 --config config.toml 
2023-01-13T19:18:59Z I! Starting Telegraf 1.22.3
2023-01-13T19:18:59Z I! Loaded inputs: gnmi
2023-01-13T19:18:59Z I! Loaded aggregators: 
2023-01-13T19:18:59Z I! Loaded processors: 
2023-01-13T19:18:59Z I! Loaded outputs: file
2023-01-13T19:18:59Z I! Tags enabled: host=ryzen
2023-01-13T19:18:59Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ryzen", Flush Interval:10s
2023-01-13T19:18:59Z E! [telegraf] Error running agent: starting input inputs.gnmi: invalid string path /state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]: invalid path element chassis[chassis-class=*]chassis-number[chassis-number=*]: malformed List key-value pair string: chassis-number[chassis-number=*], in: [chassis-class=*]chassis-number[chassis-number=*]

It was a small bug fix release and the commit seems to be https://github.com/influxdata/telegraf/commit/83fac37182c5c7ccfa70d49695b8782cbb696f1e.

@bewing or @srebhan does something in the path not look right?

srebhan commented 1 year ago

Shouldn't this

path = "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"

be rather

path = "/state/chassis[chassis-class=*]/chassis-number[chassis-number=*]/fan[fan-slot=*]"

Note the slash before chassis-number? Otherwise I think this is an invalid path!?!?

bofh16 commented 1 year ago

@srebhan, looks good to me - multi-leaf key. Part of nokia-state-chassis.yang model, can be found on Nokia Git.

image

To reconfirm - tested successfully in cli with the path, posted initially. If "/" is added, the device responses with an error.

bofh16 commented 1 year ago

From another vendor documentation.

image

powersj commented 1 year ago

I can reproduce this outside of telegraf, just using the library. Will you please file an upstream issue and let us know the URL to the issue once you have?

Here is a reproducer:

package main

import (
    "github.com/google/gnxi/utils/xpath"
)

func main() {
    path := "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
    _, err := xpath.ToGNMIPath(path)
    if err != nil {
        panic(err)
    }
}

Will produce the error as follows:

panic: invalid string path /state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]: invalid path element chassis[chassis-class=*]chassis-number[chassis-number=*]: malformed List key-value pair string: chassis-number[chassis-number=*], in: [chassis-class=*]chassis-number[chassis-number=*]

goroutine 1 [running]:
main.main()
    /tmp/test/main.go:11 +0x45
exit status 2
bofh16 commented 1 year ago

https://github.com/google/gnxi/issues/320

bewing commented 1 year ago

Based on these two comments, wouldn't the valid path(s) be /state/chassis[chassis-class=*][chassis-number=*]/chassis-class /state/chassis[chassis-class=*][chassis-number=*]/chassis-number /state/chassis[chassis-class=*][chassis-number=*]/fan[fan-slot=*] ?

package main

import (
        "fmt"
        "github.com/davecgh/go-spew/spew"
        "github.com/google/gnxi/utils/xpath"
        gnmiLib "github.com/openconfig/gnmi/proto/gnmi"
        "strings"
)

func main() {
        path := "/state/chassis[chassis-class=*][chassis-number=*]/fan[fan-slot=*]"
        xp, err := xpath.ToGNMIPath(path)
        if err != nil {
                panic(err)
        }
        spew.Dump(xp.Elem)
        path = "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
        xp, err = parsePath("origin", path, "target")
        if err != nil {
                panic(err)
        }
        spew.Dump(xp.Elem)

}

// ParsePath from XPath-like string to gNMI path structure
func parsePath(origin string, pathToParse string, target string) (*gnmiLib.Path, error) {
        var err error
        gnmiPath := gnmiLib.Path{Origin: origin, Target: target}

// remaining original function omitted
$ go run ./main.go
([]*gnmi.PathElem) (len=3 cap=4) {
 (*gnmi.PathElem)(0xc0001c2b80)(name:"state"),
 (*gnmi.PathElem)(0xc0001c2bc0)(name:"chassis"  key:{key:"chassis-class"  value:"*"}  key:{key:"chassis-number"  value:"*"}),
 (*gnmi.PathElem)(0xc0001c2c00)(name:"fan"  key:{key:"fan-slot"  value:"*"})
}
([]*gnmi.PathElem) (len=3 cap=4) {
 (*gnmi.PathElem)(0xc0001c3200)(name:"state"),
 (*gnmi.PathElem)(0xc0001c3240)(name:"chassis"  key:{key:"chassis-class"  value:"*"}  key:{key:"chassis-number"  value:"*"}),
 (*gnmi.PathElem)(0xc0001c3280)(name:"fan"  key:{key:"fan-slot"  value:"*"})
}
bofh16 commented 1 year ago

Both are valid XPaths, accepted by the device. We could shorten the entry and save some disks space :sunglasses: Thanks, for noticing.

$ ~/bin/pygnmi/gNMI_Subscribe.py --server host:57400 --username x --password y "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
23/01/20 09:11:52,657 Create insecure Channel
23/01/20 09:11:52,667 Sending SubscribeRequest
subscribe {
  subscription {
    path {
      elem {
        name: "state"
      }
      elem {
        name: "chassis"
        key {
          key: "chassis-class"
          value: "*"
        }
        key {
          key: "chassis-number"
          value: "*"
        }
      }
      elem {
        name: "fan"
        key {
          key: "fan-slot"
          value: "*"
        }
      }
    }
    mode: SAMPLE
    sample_interval: 10000000000
  }
}
$ ~/bin/pygnmi/gNMI_Subscribe.py --server host:57400 --username x --password y "/state/chassis[chassis-class=*][chassis-number=*]/fan[fan-slot=*]"
23/01/20 09:18:19,104 Create insecure Channel
23/01/20 09:18:19,117 Sending SubscribeRequest
subscribe {
  subscription {
    path {
      elem {
        name: "state"
      }
      elem {
        name: "chassis"
        key {
          key: "chassis-class"
          value: "*"
        }
        key {
          key: "chassis-number"
          value: "*"
        }
      }
      elem {
        name: "fan"
        key {
          key: "fan-slot"
          value: "*"
        }
      }
    }
    mode: SAMPLE
    sample_interval: 10000000000
  }
}
bewing commented 1 year ago

Both are valid XPaths, accepted by the device

I would argue that it is NOT a valid XPath (at least as far as GNMI is concerned), and that your Python script/library is converting an incorrectly formatted XPath string to a valid GNMI PathElement. The GNMI Path String Reference does not specifically state what to do with extra string characters after keys of a path element, but does appear to imply that correctly formatted string paths are elements, separated by slashes, and that each element is a name, followed by zero or more keys.

It also contains a link to a reference implementation string encoder. Parsing "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]" with it results in a path that probably even your device would reject:

package main

import (
        "github.com/davecgh/go-spew/spew"
        "github.com/openconfig/ygot/ygot"
)

func main() {
        path := "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
        xp, err := ygot.StringToStructuredPath(path)
        if err != nil {
                panic(err)
        }
        spew.Dump(xp)
}
$ go run ./main.go
(*gnmi.Path)(0xc00017de00)(elem:{name:"state"} elem:{name:"chassis" key:{key:"chassis-class" value:"*"} key:{key:"chassis-numberchassis-number" value:"*"}} elem:{name:"fan" key:{key:"fan-slot" value:"*"}})

Perhaps the issue is that there is ambiguity in how strings are parsed back into gnmi Paths -- should the reference docs be updated to say that the only valid string representation of a gnmi.Path is the one outputted by the reference library, or something about inversing for an identity check?

package main

import (
        "github.com/davecgh/go-spew/spew"
        "github.com/openconfig/ygot/ygot"
        "log"
)

func main() {
        path := "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]"
        xp, err := ygot.StringToStructuredPath(path)
        if err != nil {
                panic(err)
        }
        spew.Dump(xp)
        path2, err := ygot.PathToString(xp)
        if err != nil {
                panic(err)
        }
        spew.Dump(path2)
        if path != path2 {
                log.Fatalf("gnmi string paths not equal: %v %v", path, path2)
        }
}
$ go run ./main.go
(*gnmi.Path)(0xc00017de00)(elem:{name:"state"} elem:{name:"chassis" key:{key:"chassis-class" value:"*"} key:{key:"chassis-numberchassis-number" value:"*"}} elem:{name:"fan" key:{key:"fan-slot" value:"*"}})
(string) (len=79) "/state/chassis[chassis-class=*][chassis-numberchassis-number=*]/fan[fan-slot=*]"
2023/01/20 06:50:49 gnmi string paths not equal: /state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*] /state/chassis[chassis-class=*][chassis-numberchassis-number=*]/fan[fan-slot=*]
exit status 1

The argument against, of course, is "Be conservative in what you send, and liberal in what you accept", but this is in reference to an outside library now that my commit removed the internal path parser in favor of Google's.

bofh16 commented 1 year ago

To be frank, I'm confused of the fact, there are the multiple GO examples, which - on one side, include XPath without "/" successful validation, but on the other side, this path is considered invalid.

My understanding so far:

image

If the internal validation was replaced by Google one, the problem is with Google library. I've opened an upstream ticket, as requested. Please, advise what is next.

srebhan commented 1 year ago

@bofh16 this "/state/chassis[chassis-class=*][chassis-number=*]/fan[fan-slot=*]" is a valid gNMI path and it IS accepted by the gnmi plugin at least in current master. Your issue arises due to the fact that the previous custom-made parser (wrongly?) accepted "/state/chassis[chassis-class=*]chassis-number[chassis-number=*]/fan[fan-slot=*]" but we switched to an standard library which does not.

Does that solve your confusion?

srebhan commented 1 year ago

The spec you show does not say that the extra chassis-number after the chassis-class key is ok. The fact that the device accepts both doesn't say anything about the spec. ;-)

bofh16 commented 1 year ago

OK, now it is clear. Sorry, for the confusion, but the suggestion to add "/" between the keys, was the one that complicated the discussion for me. Thanks, for your help - appreciated it.

srebhan commented 1 year ago

Sorry for causing confusion @bofh16!