tynany / frr_exporter

Prometheus exporter for Free Range Routing
MIT License

Collector using `vtysh` for BFD metrics instead of `bfdd.sock` #100

Closed. SRv6d closed this issue 1 year ago

SRv6d commented 1 year ago

We are running frr_exporter using the tynany/frr_exporter:v1.1.4 Docker image with the following arguments:

--web.listen-address=<removed>:9342
--frr.socket.dir-path=/frr_sockets
--no-collector.ospf
--collector.bgp.peer-descriptions
--collector.bgp.peer-descriptions.plain-text

and a /var/run/frr:/frr_sockets:rw volume, exposing all daemons through /frr_sockets.

Running the container, we repeatedly get the following error:

ts=2023-04-17T10:41:00.837Z caller=collector.go:126 level=error msg="collector scrape failed" name=bfd duration_seconds=0.133992136 err="command /usr/bin/vtysh -c show bfd peers json failed: exit status 1: stderr: % Can't open configuration file /etc/frr/vtysh.conf due to 'No such file or directory'. Exiting: failed to connect to any daemons. : stdout: "

It seems the exporter is trying to gather BFD metrics by parsing the output of /usr/bin/vtysh instead of interfacing with bfdd.sock in /frr_sockets. Since vtysh cannot access /etc/frr/vtysh.conf within the container, it throws an error. According to the documentation, I would expect the exporter, with our configuration, to interface with the BFD daemon directly, as it does with BGP, rather than use vtysh.

dswarbrick commented 1 year ago

It appears to be a TODO.

func executeBFDCommand(cmd string) ([]byte, error) {
    // to do: work out how to interact with the bfdd.vty Unix socket:
    // % [BFD] Unknown command: show bfd peers json
    return execVtyshCommand(cmd)
}

SRv6d commented 1 year ago

Does bfdd require a different implementation than the other daemons? The TODO dates back ~16 months. If bfdd cannot be supported through a direct socket in the near future, it would be helpful to have this mentioned somewhere in the documentation.

dswarbrick commented 1 year ago

The confusion appears to stem from show bfd peers being a "read-write" command, i.e. one that is only available in enable mode. It works with the vtysh method because vtysh starts a privileged session by default, e.g.:

xps15# show bfd peers
BFD Peers:
        peer 192.0.2.20 vrf default
                ID: 701498370
                Remote ID: 0
                Active mode
                Status: down
...

However, if we deliberately drop out of enable mode, the command returns the "unknown command" that was annotated in the executeBFDCommand function:

xps15# disable
xps15> show bfd peers
% Unknown command: show bfd peers

Unlike vtysh, frr_exporter does not send an enable command to the socket before the desired command. I guess this explains why bfdd socket support was never implemented.

The large majority of status commands don't require enable mode, which is why frr_exporter can get away without it for most functionality.

tynany commented 1 year ago

As of release v1.2.0, FRR Exporter scrapes BFD metrics from the bfdd.vty socket by default, thanks to @dswarbrick.