perfsonar / nagios

Nagios checks for perfSONAR services
Apache License 2.0

traceroute check fails to identify lack of connectivity case #14

Open igarny opened 6 years ago

igarny commented 6 years ago

Together with Internet2, GEANT has a project on L2 connectivity performance monitoring.

In one case where the L2 circuit was broken, our traceroute monitoring still reported the path as operational. The problem stems from the fact that traceroute does return a response at the end (from the local host itself, flagged !H), which most likely misleads the traceroute check. Please see the diagnostic output below.

IMHO this is not urgent, since it is unlikely that anyone relies exclusively on traceroute monitoring and would otherwise fail to detect the loss of L2 connectivity. Still, with the development and support of virtual interfaces, this issue may become more and more common.

Here is some diagnostic output (in cooperation with John Hicks):

host10.9.2.5: ~]$ traceroute 10.9.2.2
traceroute to 10.9.2.2 (10.9.2.2), 30 hops max, 60 byte packets
 1  10.9.2.5 (10.9.2.5)  3000.613 ms !H  3000.608 ms !H  3000.602 ms !H

host10.9.2.5: ~]$ ping 10.9.2.2
PING 10.9.2.2 (10.9.2.2) 56(84) bytes of data.
From 10.9.2.5 icmp_seq=2 Destination Host Unreachable
From 10.9.2.5 icmp_seq=3 Destination Host Unreachable
From 10.9.2.5 icmp_seq=4 Destination Host Unreachable
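
To illustrate the kind of test that might catch this case, here is a minimal Python sketch. It is not the existing check's code (which I have not inspected) and the function and constant names are hypothetical. It assumes the Linux traceroute text format shown above, and treats a run as successful only when the final hop's address equals the destination and none of its probes carries an unreachable annotation such as !H.

```python
import re

# Annotations Linux traceroute appends to a probe when an ICMP
# "unreachable" message came back (host, network, protocol, prohibited).
UNREACHABLE_FLAGS = {"!H", "!N", "!P", "!X"}

# A hop line starts with the hop number; the header line does not.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
# Addresses are printed in parentheses after the host name (IPv4 only here).
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")


def destination_reached(output: str, destination: str) -> bool:
    """Return True only if the last hop is the destination and is not flagged.

    Hypothetical helper: 'output' is raw traceroute text, 'destination'
    is the target IPv4 address as a string.
    """
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append(match.group(2))
    if not hops:
        return False  # no hop lines at all, so nothing was reached

    final = hops[-1]
    addresses = ADDR_RE.findall(final)
    flagged = any(token in UNREACHABLE_FLAGS for token in final.split())
    return destination in addresses and not flagged


# The output from this report: the only reply comes from the local host, with !H.
BROKEN_CIRCUIT = """\
traceroute to 10.9.2.2 (10.9.2.2), 30 hops max, 60 byte packets
 1  10.9.2.5 (10.9.2.5)  3000.613 ms !H  3000.608 ms !H  3000.602 ms !H
"""

if __name__ == "__main__":
    print(destination_reached(BROKEN_CIRCUIT, "10.9.2.2"))  # prints False
```

With the output from this report, the last hop is 10.9.2.5 rather than 10.9.2.2 and every probe is marked !H, so the sketch reports failure, whereas a check that only looks for any final response would report success.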