tynany / frr_exporter

Prometheus exporter for Free Range Routing
MIT License

cannot process output of show bgp vrf all ipv4 unicast summary json: unexpected end of JSON input #79

Closed 33Fraise33 closed 2 years ago

33Fraise33 commented 2 years ago

Hello,

We upgraded towards version 1.1.2 but we are seeing the following issue below: msg="collector scrape failed" name=bgp duration_seconds=0.058906413 err="cannot process output of show bgp vrf all ipv4 unicast summary json: unexpected end of JSON input: command output: {\n\"default\":{\n \"routerId\":\"10.11.193.2\",\n \"as\":65237,\n \"vrfId\":0,\n \"vrfName\":\"default\",\n \"tableVersion\":2013,\n \"ribCount\":113,\n \"ribMemory\":22600,\n \"peerCount\":10,\n \"peerMemory\":233760,\n \"peerGroupCount\":4,\n \"peerGroupMemory\":256,\n \"peers\":{\n \"10.11.192.45\":{\n \"hostname\":\"borax\",\n \"remoteAs\":65224,\n \"version\":4,\n \"msgRcvd\":15495697,\n \"msgSent\":16552925,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"04:14:21\",\n \"peerUptimeMsec\":15261000,\n \"peerUptimeEstablishedEpoch\":1661762601,\n \"pfxRcd\":2,\n \"pfxSnt\":10,\n \"state\":\"Established\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.53\":{\n \"hostname\":\"bruno\",\n \"remoteAs\":65225,\n \"version\":4,\n \"msgRcvd\":146401702,\n \"msgSent\":128100078,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"14w6d15h\",\n \"peerUptimeMsec\":9043150000,\n \"peerUptimeEstablishedEpoch\":1652734712,\n \"pfxRcd\":5,\n \"pfxSnt\":10,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.61\":{\n \"hostname\":\"julius\",\n \"remoteAs\":65226,\n \"version\":4,\n \"msgRcvd\":168398736,\n \"msgSent\":128074402,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"14w0d00h\",\n \"peerUptimeMsec\":8468008000,\n \"peerUptimeEstablishedEpoch\":1653309854,\n \"pfxRcd\":5,\n \"pfxSnt\":10,\n \"state\":\"Established\",\n \"connectionsEstablished\":2,\n \"connectionsDropped\":1,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.69\":{\n \"remoteAs\":65242,\n \"version\":4,\n \"msgRcvd\":3444661,\n \"msgSent\":3015008,\n 
\"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"14w6d16h\",\n \"peerUptimeMsec\":9043336000,\n \"peerUptimeEstablishedEpoch\":1652734526,\n \"pfxRcd\":1,\n \"pfxSnt\":5,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.5\":{\n \"hostname\":\"mewtwo\",\n \"remoteAs\":65001,\n \"version\":4,\n \"msgRcvd\":3013708,\n \"msgSent\":3013717,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"05:29:23\",\n \"peerUptimeMsec\":19763000,\n \"peerUptimeEstablishedEpoch\":1661758099,\n \"pfxRcd\":5,\n \"pfxSnt\":7,\n \"state\":\"Established\",\n \"connectionsEstablished\":48,\n \"connectionsDropped\":47,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.21\":{\n \"remoteAs\":65506,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"swp48\":{\n \"hostname\":\"DC-IX-CORESW01\",\n \"remoteAs\":65234,\n \"version\":4,\n \"msgRcvd\":131145939,\n \"msgSent\":128100948,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"01w4d15h\",\n \"peerUptimeMsec\":1007473000,\n \"peerUptimeEstablishedEpoch\":1660770389,\n \"pfxRcd\":49,\n \"pfxSnt\":56,\n \"state\":\"Established\",\n \"connectionsEstablished\":2,\n \"connectionsDropped\":1,\n \"idType\":\"interface\"\n },\n \"swp53\":{\n \"hostname\":\"DC-LCL-CORESW01\",\n \"remoteAs\":65236,\n \"version\":4,\n \"msgRcvd\":155333619,\n \"msgSent\":128104743,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"04w0d03h\",\n \"peerUptimeMsec\":2432273000,\n \"peerUptimeEstablishedEpoch\":1659345589,\n \"pfxRcd\":43,\n \"pfxSnt\":56,\n \"state\":\"Established\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"idType\":\"interface\"\n },\n \"swp54\":{\n 
\"hostname\":\"DC-LCL-CORESW01\",\n \"remoteAs\":65236,\n \"version\":4,\n \"msgRcvd\":155333620,\n \"msgSent\":128105060,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"04w0d03h\",\n \"peerUptimeMsec\":2432274000,\n \"peerUptimeEstablishedEpoch\":1659345588,\n \"pfxRcd\":43,\n \"pfxSnt\":56,\n \"state\":\"Established\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"idType\":\"interface\"\n },\n \"swp56\":{\n \"hostname\":\"DC-IX-CORESW02\",\n \"remoteAs\":65235,\n \"version\":4,\n \"msgRcvd\":17468827,\n \"msgSent\":16416336,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"01w4d16h\",\n \"peerUptimeMsec\":1008055000,\n \"peerUptimeEstablishedEpoch\":1660769807,\n \"pfxRcd\":41,\n \"pfxSnt\":56,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"interface\"\n }\n },\n \"failedPeers\":1,\n \"totalPeers\":10,\n \"dynamicPeers\":0,\n \"bestPath\":{\n \"multiPathRelax\":\"false\"\n }\n}\n,\n\"nni\":{\n \"routerId\":\"10.11.194.2\",\n \"as\":65267,\n \"vrfId\":194,\n \"vrfName\":\"nni\",\n \"tableVersion\":143706,\n \"ribCount\":541,\n \"ribMemory\":108200,\n \"peerCount\":34,\n \"peerMemory\":794784,\n \"peerGroupCount\":4,\n \"peerGroupMemory\":256,\n \"peers\":{\n \"172.31.255.213\":{\n \"remoteAs\":65100,\n \"version\":4,\n \"msgRcvd\":83516,\n \"msgSent\":80581,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"03w6d23h\",\n \"peerUptimeMsec\":2416363000,\n \"peerUptimeEstablishedEpoch\":1659361499,\n \"pfxRcd\":2,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.13\":{\n \"hostname\":\"mewtwo\",\n \"remoteAs\":65002,\n \"version\":4,\n \"msgRcvd\":3029477,\n \"msgSent\":3053105,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"05:29:21\",\n \"peerUptimeMsec\":19761000,\n \"peerUptimeEstablishedEpoch\":1661758101,\n 
\"pfxRcd\":39,\n \"pfxSnt\":272,\n \"state\":\"Established\",\n \"connectionsEstablished\":254,\n \"connectionsDropped\":253,\n \"idType\":\"ipv4\"\n },\n \"10.11.192.29\":{\n \"remoteAs\":65506,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.3\":{\n \"remoteAs\":64831,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.11\":{\n \"remoteAs\":64832,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.18\":{\n \"remoteAs\":64835,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.26\":{\n \"remoteAs\":64836,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.34\":{\n \"remoteAs\":64833,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n 
\"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.42\":{\n \"remoteAs\":64834,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.50\":{\n \"remoteAs\":64801,\n \"version\":4,\n \"msgRcvd\":10257749,\n \"msgSent\":8978507,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"17:59:20\",\n \"peerUptimeMsec\":64760000,\n \"peerUptimeEstablishedEpoch\":1661713102,\n \"pfxRcd\":0,\n \"state\":\"Connect\",\n \"connectionsEstablished\":7,\n \"connectionsDropped\":7,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.58\":{\n \"remoteAs\":64803,\n \"version\":4,\n \"msgRcvd\":10330891,\n \"msgSent\":9043125,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"3d15h44m\",\n \"peerUptimeMsec\":315851000,\n \"peerUptimeEstablishedEpoch\":1661462011,\n \"pfxRcd\":9,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":18,\n \"connectionsDropped\":17,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.66\":{\n \"remoteAs\":64805,\n \"version\":4,\n \"msgRcvd\":10197271,\n \"msgSent\":8925883,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"1d21h33m\",\n \"peerUptimeMsec\":163995000,\n \"peerUptimeEstablishedEpoch\":1661613867,\n \"pfxRcd\":1,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":30,\n \"connectionsDropped\":29,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.74\":{\n \"remoteAs\":64807,\n \"version\":4,\n \"msgRcvd\":10332021,\n \"msgSent\":9043276,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"3d15h44m\",\n \"peerUptimeMsec\":315851000,\n \"peerUptimeEstablishedEpoch\":1661462011,\n \"pfxRcd\":7,\n \"pfxSnt\":1,\n 
\"state\":\"Established\",\n \"connectionsEstablished\":17,\n \"connectionsDropped\":16,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.82\":{\n \"remoteAs\":64809,\n \"version\":4,\n \"msgRcvd\":10329557,\n \"msgSent\":9041493,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"3d06h51m\",\n \"peerUptimeMsec\":283912000,\n \"peerUptimeEstablishedEpoch\":1661493950,\n \"pfxRcd\":4,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":9,\n \"connectionsDropped\":8,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.90\":{\n \"remoteAs\":64811,\n \"version\":4,\n \"msgRcvd\":10320810,\n \"msgSent\":9032954,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"1d21h33m\",\n \"peerUptimeMsec\":163991000,\n \"peerUptimeEstablishedEpoch\":1661613871,\n \"pfxRcd\":1,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":82,\n \"connectionsDropped\":81,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.98\":{\n \"remoteAs\":64812,\n \"version\":4,\n \"msgRcvd\":3441595,\n \"msgSent\":3012163,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"4d03h35m\",\n \"peerUptimeMsec\":358535000,\n \"peerUptimeEstablishedEpoch\":1661419327,\n \"pfxRcd\":3,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":12,\n \"connectionsDropped\":11,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.106\":{\n \"remoteAs\":64813,\n \"version\":4,\n \"msgRcvd\":3443567,\n \"msgSent\":3013639,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"06w4d06h\",\n \"peerUptimeMsec\":3997488000,\n \"peerUptimeEstablishedEpoch\":1657780374,\n \"pfxRcd\":3,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":4,\n \"connectionsDropped\":3,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.114\":{\n \"remoteAs\":64814,\n \"version\":4,\n \"msgRcvd\":10297446,\n \"msgSent\":9013735,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"4d15h28m\",\n \"peerUptimeMsec\":401290000,\n 
\"peerUptimeEstablishedEpoch\":1661376572,\n \"pfxRcd\":4,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":12,\n \"connectionsDropped\":11,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.122\":{\n \"remoteAs\":64815,\n \"version\":4,\n \"msgRcvd\":10331908,\n \"msgSent\":9043123,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"03w6d14h\",\n \"peerUptimeMsec\":2385921000,\n \"peerUptimeEstablishedEpoch\":1659391941,\n \"pfxRcd\":2,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":5,\n \"connectionsDropped\":4,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.130\":{\n \"remoteAs\":64816,\n \"version\":4,\n \"msgRcvd\":10332045,\n \"msgSent\":9042964,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"02w6d06h\",\n \"peerUptimeMsec\":1749628000,\n \"peerUptimeEstablishedEpoch\":1660028234,\n \"pfxRcd\":3,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":9,\n \"connectionsDropped\":8,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.138\":{\n \"remoteAs\":64817,\n \"version\":4,\n \"msgRcvd\":10330314,\n \"msgSent\":9042215,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"02w0d00h\",\n \"peerUptimeMsec\":1210740000,\n \"peerUptimeEstablishedEpoch\":1660567122,\n \"pfxRcd\":8,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":43,\n \"connectionsDropped\":42,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.146\":{\n \"remoteAs\":64818,\n \"version\":4,\n \"msgRcvd\":3247973,\n \"msgSent\":3014469,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"3d15h44m\",\n \"peerUptimeMsec\":315855000,\n \"peerUptimeEstablishedEpoch\":1661462007,\n \"pfxRcd\":14,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":5,\n \"connectionsDropped\":4,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.154\":{\n \"remoteAs\":64819,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n 
\"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.162\":{\n \"remoteAs\":64820,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.170\":{\n \"remoteAs\":64821,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.178\":{\n \"remoteAs\":64822,\n \"version\":4,\n \"msgRcvd\":3444111,\n \"msgSent\":3014256,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"02w0d00h\",\n \"peerUptimeMsec\":1210740000,\n \"peerUptimeEstablishedEpoch\":1660567122,\n \"pfxRcd\":2,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":39,\n \"connectionsDropped\":38,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.186\":{\n \"remoteAs\":64837,\n \"version\":4,\n \"msgRcvd\":10332039,\n \"msgSent\":9043230,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"02w4d22h\",\n \"peerUptimeMsec\":1636077000,\n \"peerUptimeEstablishedEpoch\":1660141785,\n \"pfxRcd\":2,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":29,\n \"connectionsDropped\":28,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.194\":{\n \"remoteAs\":64802,\n \"version\":4,\n \"msgRcvd\":0,\n \"msgSent\":0,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"never\",\n \"peerUptimeMsec\":0,\n \"pfxRcd\":0,\n \"state\":\"Idle (Admin)\",\n \"connectionsEstablished\":0,\n \"connectionsDropped\":0,\n 
\"idType\":\"ipv4\"\n },\n \"10.189.100.202\":{\n \"remoteAs\":64804,\n \"version\":4,\n \"msgRcvd\":3390967,\n \"msgSent\":3014463,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"5d00h53m\",\n \"peerUptimeMsec\":435197000,\n \"peerUptimeEstablishedEpoch\":1661342665,\n \"pfxRcd\":2,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":5,\n \"connectionsDropped\":4,\n \"idType\":\"ipv4\"\n },\n \"10.189.100.218\":{\n \"remoteAs\":65041,\n \"version\":4,\n \"msgRcvd\":1973897,\n \"msgSent\":1816990,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"09w0d02h\",\n \"peerUptimeMsec\":5450717000,\n \"peerUptimeEstablishedEpoch\":1656327145,\n \"pfxRcd\":113,\n \"pfxSnt\":1,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"ipv4\"\n },\n \"swp48.1999\":{\n \"hostname\":\"DC-IX-CORESW01\",\n \"remoteAs\":65264,\n \"version\":4,\n \"msgRcvd\":3038670,\n \"msgSent\":3051910,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"01w4d15h\",\n \"peerUptimeMsec\":1007042000,\n \"peerUptimeEstablishedEpoch\":1660770820,\n \"pfxRcd\":197,\n \"pfxSnt\":273,\n \"state\":\"Established\",\n \"connectionsEstablished\":2,\n \"connectionsDropped\":1,\n \"idType\":\"interface\"\n },\n \"swp53.1999\":{\n \"hostname\":\"DC-LCL-CORESW01\",\n \"remoteAs\":65266,\n \"version\":4,\n \"msgRcvd\":3054316,\n \"msgSent\":3052182,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"04w0d03h\",\n \"peerUptimeMsec\":2432273000,\n \"peerUptimeEstablishedEpoch\":1659345589,\n \"pfxRcd\":51,\n \"pfxSnt\":273,\n \"state\":\"Established\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"idType\":\"interface\"\n },\n \"swp54.1999\":{\n \"hostname\":\"DC-LCL-CORESW01\",\n \"remoteAs\":65266,\n \"version\":4,\n \"msgRcvd\":3054356,\n \"msgSent\":3052187,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"04w0d03h\",\n 
\"peerUptimeMsec\":2432274000,\n \"peerUptimeEstablishedEpoch\":1659345588,\n \"pfxRcd\":51,\n \"pfxSnt\":273,\n \"state\":\"Established\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"idType\":\"interface\"\n },\n \"swp56.1999\":{\n \"hostname\":\"DC-IX-CORESW02\",\n \"remoteAs\":65265,\n \"version\":4,\n \"msgRcvd\":338775,\n \"msgSent\":339646,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"01w4d16h\",\n \"peerUptimeMsec\":1008054000,\n \"peerUptimeEstablishedEpoch\":1660769808,\n \"pfxRcd\":219,\n \"pfxSnt\":273,\n \"state\":\"Established\",\n \"connectionsEstablished\":1,\n \"connectionsDropped\":0,\n \"idType\":\"interface\"\n }\n },\n \"failedPeers\":12,\n \"totalPeers\":34,\n \"dynamicPeers\":0,\n \"bestPath\":{\n \"multiPathRelax\":\"false\"\n }\n}\n}\n"

frr_exporter service file:

cat /etc/systemd/system/frr_exporter@.service 
[Unit]
Description=Prometheus FRR Exporter
After=network-online.target

[Service]
Type=simple
User=root
Group=root
#ExecStart=/usr/local/bin/frr_exporter
ExecStart=/usr/local/bin/frr_exporter --frr.socket.dir-path="/var/run/frr" --no-collector.ospf --no-collector.bfd --collector.bgp6 --collector.bgpl2vpn --collector.bgp.peer-descriptions --collector.bgp.peer-descriptions.plain-text
SyslogIdentifier=frr_exporter
Restart=always
RestartSec=1
StartLimitInterval=0

[Install]
WantedBy=multi-user.target

FRR Version (cumulus linux 4.4.4)

FRRouting 7.5+cl4.4.0u4 (DC-LCL-CORESW02).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--enable-exampledir=/usr/share/doc/frr/examples/' '--localstatedir=/var/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--disable-dev-build' '--enable-systemd=yes' '--enable-rpki' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-zeromq' '--enable-ospfapi' '--disable-bgp-vnc' '--enable-multipath=128' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' '--disable-address-sanitizer' '--enable-protobuf' '--enable-cumulus=yes' '--enable-csmgr=yes' '--enable-datacenter=yes' '--enable-bfdd=no' '--enable-sharpd=yes' '--enable-grpc=yes' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'
33Fraise33 commented 2 years ago

I have the same issue on a clean Debian host with FRR 8.3.

cannot process output of show bgp vrf all ipv4 unicast summary json: unexpected end of JSON input: command output: {\n\"default\":{}\n,\n\"remotesites\":{\n \"routerId\":\"10.10.33.1\",\n \"as\":211184,\n \"vrfId\":12,\n \"vrfName\":\"remotesites\",\n \"tableVersion\":42,\n \"ribCount\":23,\n \"ribMemory\":4416,\n \"peerCount\":5,\n \"peerMemory\":3704160,\n \"peerGroupCount\":1,\n \"peerGroupMemory\":64,\n \"peers\":{\n \"10.10.34.2\":{\n \"remoteAs\":65501,\n \"localAs\":211184,\n \"version\":4,\n \"msgRcvd\":86713,\n \"msgSent\":86768,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"2d09h19m\",\n \"peerUptimeMsec\":206370000,\n \"peerUptimeEstablishedEpoch\":1661659372,\n \"pfxRcd\":3,\n \"pfxSnt\":11,\n \"state\":\"Established\",\n \"peerState\":\"OK\",\n \"connectionsEstablished\":3,\n \"connectionsDropped\":2,\n \"desc\":\"ASGARD\",\n \"idType\":\"ipv4\"\n },\n \"10.10.34.6\":{\n \"remoteAs\":65511,\n \"localAs\":211184,\n \"version\":4,\n \"msgRcvd\":86713,\n \"msgSent\":86725,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"14:53:08\",\n \"peerUptimeMsec\":53588000,\n \"peerUptimeEstablishedEpoch\":1661812154,\n \"pfxRcd\":1,\n \"pfxSnt\":3,\n \"state\":\"Established\",\n \"peerState\":\"OK\",\n \"connectionsEstablished\":6,\n \"connectionsDropped\":5,\n \"desc\":\"MOMv4\",\n \"idType\":\"ipv4\"\n },\n \"10.10.34.10\":{\n \"remoteAs\":65502,\n \"localAs\":211184,\n \"version\":4,\n \"msgRcvd\":78961,\n \"msgSent\":78995,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"4d22h42m\",\n \"peerUptimeMsec\":427370000,\n \"peerUptimeEstablishedEpoch\":1661438372,\n \"pfxRcd\":0,\n \"pfxSnt\":3,\n \"state\":\"Established\",\n \"peerState\":\"OK\",\n \"connectionsEstablished\":6,\n \"connectionsDropped\":5,\n \"desc\":\"SAKAARv4\",\n \"idType\":\"ipv4\"\n },\n \"10.10.34.14\":{\n \"remoteAs\":65512,\n \"localAs\":211184,\n \"version\":4,\n \"msgRcvd\":86726,\n \"msgSent\":86738,\n \"tableVersion\":0,\n 
\"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"14:53:11\",\n \"peerUptimeMsec\":53591000,\n \"peerUptimeEstablishedEpoch\":1661812151,\n \"pfxRcd\":1,\n \"pfxSnt\":3,\n \"state\":\"Established\",\n \"peerState\":\"OK\",\n \"connectionsEstablished\":5,\n \"connectionsDropped\":4,\n \"desc\":\"DADv4\",\n \"idType\":\"ipv4\"\n },\n \"10.10.34.18\":{\n \"remoteAs\":65503,\n \"localAs\":211184,\n \"version\":4,\n \"msgRcvd\":72314,\n \"msgSent\":72346,\n \"tableVersion\":0,\n \"outq\":0,\n \"inq\":0,\n \"peerUptime\":\"1d19h52m\",\n \"peerUptimeMsec\":157948000,\n \"peerUptimeEstablishedEpoch\":1661707794,\n \"pfxRcd\":0,\n \"pfxSnt\":0,\n \"state\":\"Connect\",\n \"peerState\":\"OK\",\n \"connectionsEstablished\":7,\n \"connectionsDropped\":7,\n \"desc\":\"MIDGUARDv4\",\n \"idType\":\"ipv4\"\n }\n },\n \"failedPeers\":1,\n \"displayedPeers\":5,\n \"totalPeers\":5,\n \"dynamicPeers\":0,\n \"bestPath\":{\n \"multiPathRelax\":\"false\"\n }\n}\n,\n\"locix\":{}\n,\n\"kleyrex\":{}\n,\n\"fogixp\":{}\n}\n
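A note for readers triaging this: "unexpected end of JSON input" is the message Go's encoding/json returns when its input is empty or stops early, yet the command output pasted above looks complete. The sketch below is one way to check that claim. It undoes the log escaping and parses the payload, then lists the VRFs that came back as empty objects. The helper names and the shortened `sample` string are illustrative assumptions, not part of frr_exporter, and the un-escaping is deliberately naive.

```python
import json


def decode_logged_output(escaped: str):
    """Undo the log escaping (backslash-n and backslash-quote) and parse as JSON.

    Naive on purpose: it would mangle values that contain real backslashes.
    """
    raw = escaped.replace('\\n', '\n').replace('\\"', '"')
    return json.loads(raw)


def empty_vrfs(summary: dict) -> list:
    """Return the names of VRFs whose summary object is empty ({})."""
    return [name for name, body in summary.items() if not body]


# Shortened stand-in for the logged output above (peer entries omitted).
sample = (
    r'{\n\"default\":{}\n,\n\"remotesites\":{\n \"routerId\":\"10.10.33.1\",'
    r'\n \"as\":211184,\n \"peers\":{}\n}\n,\n\"locix\":{}\n,\n\"kleyrex\":{}'
    r'\n,\n\"fogixp\":{}\n}\n'
)

summary = decode_logged_output(sample)
print("VRFs:", list(summary))
print("Empty VRFs:", empty_vrfs(summary))
```

If the full pasted payload parses this way, the logged output itself is not truncated, which would suggest the decoder was handed something else (for example an empty string from a related command). That is a hypothesis to test, not a diagnosis.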

tynany commented 2 years ago

Thanks for the detailed repro steps. Can you please paste the linted config that exhibits the issue?

33Fraise33 commented 2 years ago
!
frr version 8.3.1
frr defaults traditional
hostname kingpin
service integrated-vtysh-config
!
ipv6 route ::/0 2a0c:9a40:1::1 10
!
router bgp 211184
 bgp router-id 193.148.249.106
 no bgp default ipv4-unicast
 no bgp client-to-client reflection
 bgp graceful-restart
 no bgp network import-check
 neighbor default peer-group
 neighbor default timers 20 60
 neighbor 2a0c:9a40:1::1 remote-as 34927
 neighbor 2a0c:9a40:1::1 peer-group default
 neighbor 2a0c:9a40:1::1 description IFOG PEERING
 !
 address-family ipv4 unicast
  redistribute connected
  redistribute static
  neighbor default soft-reconfiguration inbound
  neighbor default route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor default route-map rm-AS211184-exact out
  import vrf remotesites
 exit-address-family
 !
 address-family ipv6 unicast
  aggregate-address 2a12:4946:4000::/44
  redistribute connected
  redistribute static
  neighbor default activate
  neighbor default soft-reconfiguration inbound
  neighbor default route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor default route-map rm-AS211184-exact out
  import vrf locix
  import vrf kleyrex
  import vrf remotesites
 exit-address-family
exit
!
router bgp 211184 vrf remotesites
 bgp router-id 10.10.33.1
 no bgp default ipv4-unicast
 no bgp client-to-client reflection
 no bgp network import-check
 neighbor remotesites peer-group
 neighbor remotesites timers 20 60
 neighbor 10.10.34.2 remote-as 65501
 neighbor 10.10.34.2 peer-group remotesites
 neighbor 10.10.34.2 description ASGARD
 neighbor 10.10.34.6 remote-as 65511
 neighbor 10.10.34.6 peer-group remotesites
 neighbor 10.10.34.6 description MOMv4
 neighbor 10.10.34.10 remote-as 65502
 neighbor 10.10.34.10 peer-group remotesites
 neighbor 10.10.34.10 description SAKAARv4
 neighbor 10.10.34.14 remote-as 65512
 neighbor 10.10.34.14 peer-group remotesites
 neighbor 10.10.34.14 description DADv4
 neighbor 10.10.34.18 remote-as 65503
 neighbor 10.10.34.18 peer-group remotesites
 neighbor 10.10.34.18 description MIDGUARDv4
 neighbor 2a12:4946:4001:34::1:2 remote-as 65501
 neighbor 2a12:4946:4001:34::1:2 peer-group remotesites
 neighbor 2a12:4946:4001:34::1:2 description ASGARDv6
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor remotesites soft-reconfiguration inbound
  neighbor remotesites route-map rm-REMOTE_SITES-in in
  neighbor remotesites route-map rm-REMOTE_SITES-out out
  neighbor 10.10.34.2 activate
  neighbor 10.10.34.2 route-map rm-ASGARD-out out
  neighbor 10.10.34.6 activate
  neighbor 10.10.34.10 activate
  neighbor 10.10.34.14 activate
  neighbor 10.10.34.18 activate
  neighbor 2a12:4946:4001:34::1:2 route-map rm-ASGARD-out out
  import vrf default
 exit-address-family
 !
 address-family ipv6 unicast
  aggregate-address 2a12:4946:4000::/44
  redistribute connected
  neighbor remotesites soft-reconfiguration inbound
  neighbor remotesites route-map rm-REMOTE_SITES-in in
  neighbor remotesites route-map rm-REMOTE_SITES-out out
  neighbor 10.10.34.2 route-map rm-ASGARD-out out
  neighbor 2a12:4946:4001:34::1:2 activate
  neighbor 2a12:4946:4001:34::1:2 route-map rm-ASGARD-out out
  import vrf route-map rm-DEFUALT_TO_REMOTE
  import vrf default
 exit-address-family
exit
!
router bgp 211184 vrf locix
 bgp router-id 185.1.167.89
 no bgp default ipv4-unicast
 no bgp client-to-client reflection
 bgp graceful-restart
 no bgp network import-check
 neighbor locix peer-group
 neighbor locix remote-as 202409
 neighbor locix timers 20 60
 neighbor 2001:7f8:f2:e1::be5a peer-group locix
 neighbor 2001:7f8:f2:e1::be5a description Locix RS3
 neighbor 2001:7f8:f2:e1::babe:1 peer-group locix
 neighbor 2001:7f8:f2:e1::babe:1 description Locix RS1
 neighbor 2001:7f8:f2:e1::dead:1 peer-group locix
 neighbor 2001:7f8:f2:e1::dead:1 description Locix RS2
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor locix soft-reconfiguration inbound
  neighbor locix route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor locix route-map rm-AS211184-exact out
 exit-address-family
 !
 address-family ipv6 unicast
  network 2a12:4946:4000::/44
  redistribute connected
  neighbor locix activate
  neighbor locix soft-reconfiguration inbound
  neighbor locix route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor locix route-map rm-AS211184-exact out
  import vrf route-map rm-AS211184
  import vrf default
 exit-address-family
exit
!
router bgp 211184 vrf kleyrex
 bgp router-id 193.189.83.6
 no bgp default ipv4-unicast
 no bgp client-to-client reflection
 bgp graceful-restart
 no bgp network import-check
 neighbor kleyrex peer-group
 neighbor kleyrex remote-as 31142
 neighbor kleyrex timers 20 60
 neighbor 2001:7f8:33::a103:1142:1 peer-group kleyrex
 neighbor 2001:7f8:33::a103:1142:1 description Kleyrex RS1
 neighbor 2001:7f8:33::a103:1142:2 peer-group kleyrex
 neighbor 2001:7f8:33::a103:1142:2 description Kleyrex RS2
 neighbor 2001:7f8:33::a103:1142:3 peer-group kleyrex
 neighbor 2001:7f8:33::a103:1142:3 description Kleyrex RS3
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor kleyrex soft-reconfiguration inbound
  neighbor kleyrex route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor kleyrex route-map rm-AS211184-exact out
 exit-address-family
 !
 address-family ipv6 unicast
  network 2a12:4946:4000::/44
  redistribute connected
  neighbor kleyrex activate
  neighbor kleyrex soft-reconfiguration inbound
  neighbor kleyrex route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor kleyrex route-map rm-AS211184-exact out
  import vrf route-map rm-AS211184
  import vrf default
 exit-address-family
exit
!
router bgp 211184 vrf fogixp
 bgp router-id 185.1.147.45
 no bgp default ipv4-unicast
 no bgp client-to-client reflection
 bgp graceful-restart
 no bgp network import-check
 neighbor fogixp peer-group
 neighbor fogixp timers 20 60
 neighbor 2001:7f8:ca:1::77 remote-as 210387
 neighbor 2001:7f8:ca:1::77 peer-group fogixp
 neighbor 2001:7f8:ca:1::77 description Bubacarr Sowe
 neighbor 2001:7f8:ca:1::111 remote-as 47498
 neighbor 2001:7f8:ca:1::111 peer-group fogixp
 neighbor 2001:7f8:ca:1::111 description FogIXP RS1
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor fogixp soft-reconfiguration inbound
  neighbor fogixp route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor fogixp route-map rm-AS211184-exact out
 exit-address-family
 !
 address-family ipv6 unicast
  network 2a12:4946:4000::/44
  redistribute connected
  neighbor fogixp activate
  neighbor fogixp soft-reconfiguration inbound
  neighbor fogixp route-map rm-DEFAULT_ONLY_PUBLIC in
  neighbor fogixp route-map rm-AS211184-exact out
  import vrf route-map rm-AS211184
  import vrf default
 exit-address-family
exit
!
ip prefix-list pl-default-public seq 5 permit 0.0.0.0/0 ge 0 le 24
ip prefix-list pl-default seq 5 permit 0.0.0.0/0
ip prefix-list pl-RFC1918 seq 5 permit 10.0.0.0/8 ge 8
ip prefix-list pl-RFC1918 seq 10 permit 172.16.0.0/12 ge 12
ip prefix-list pl-RFC1918 seq 15 permit 192.168.0.0/16 ge 16
ip prefix-list pl-LocalIpv4s seq 5 permit 10.10.33.1/32
ip prefix-list pl-LocalIpv4s seq 10 permit 193.148.249.0/24
ip prefix-list pl-LocalIpv4s seq 15 permit 185.1.166.0/23
ip prefix-list pl-MGMT seq 5 permit 172.16.33.0/24
ip prefix-list pl-MGMT seq 10 permit 10.33.0.0/24
ip prefix-list pl-MGMT seq 15 permit 10.10.33.1/32
!
ipv6 prefix-list pl-default-public seq 5 permit ::/0 ge 0 le 48
ipv6 prefix-list pl-default seq 5 permit ::/0
ipv6 prefix-list pl-AS211184-exact seq 5 permit 2a12:4946:4000::/44
ipv6 prefix-list pl-AS211184 seq 5 permit 2a12:4946:4000::/44 ge 44
!
route-map rm-DEFAULT_ONLY_PUBLIC permit 10
 match ip address prefix-list pl-default-public
exit
!
route-map rm-DEFAULT_ONLY_PUBLIC permit 20
 match ipv6 address prefix-list pl-default-public
exit
!
route-map rm-DEFUALT_TO_REMOTE permit 10
 match ipv6 address prefix-list pl-AS211184
exit
!
route-map rm-DEFUALT_TO_REMOTE permit 20
 match ipv6 address prefix-list pl-default
exit
!
route-map rm-AS211184-exact permit 10
 match ipv6 address prefix-list pl-AS211184-exact
exit
!
route-map rm-AS211184 permit 10
 match ipv6 address prefix-list pl-AS211184
exit
!
route-map rm-REMOTE_SITES-in permit 10
 match ipv6 address prefix-list pl-AS211184
exit
!
route-map rm-REMOTE_SITES-in permit 20
 match ip address prefix-list pl-RFC1918
exit
!
route-map rm-REMOTE_SITES-out permit 10
 match ipv6 address prefix-list pl-AS211184
exit
!
route-map rm-REMOTE_SITES-out permit 20
 match ip address prefix-list pl-MGMT
exit
!
route-map rm-ASGARD-out permit 10
 match ipv6 address prefix-list pl-AS211184
exit
!
route-map rm-ASGARD-out permit 20
 match ip address prefix-list pl-RFC1918
exit
!
route-map rm-ASGARD-out permit 30
 match ipv6 address prefix-list pl-default
exit
!
route-map rm-LocalIpv4s permit 10
 match ip address prefix-list pl-LocalIpv4s
exit
!
route-map rm-default permit 10
 match ipv6 address prefix-list pl-default
exit
!
route-map rm-default permit 20
 match ip address prefix-list pl-default
exit
!
end

This is the config of the host running 8.3

33Fraise33 commented 2 years ago

Hello,

Do you have any update on this? Our graphs currently look like this because of it :/ (image)

tynany commented 2 years ago

Apologies for the delay. Time has not been on my side recently. I am planning on looking at it this weekend, so should have a fix within a week. Cheers!

33Fraise33 commented 2 years ago

No worries, have a nice weekend!

tynany commented 2 years ago

I am still unable to replicate the issue you are seeing. Here is what I did:

  1. Fresh version of Debian 11.5.0.
  2. Install FRR 8.3.1.
  3. Copy the config you provided to /etc/frr/frr.conf.
  4. Enable bgpd in /etc/frr/daemons.
  5. Install FRR Exporter 1.1.2 to /usr/local/bin/frr_exporter.
  6. Copy the Systemd file you provided to /etc/systemd/system/frr_exporter.service.

I am also unable to replicate the issue using Cumulus VX.

Can you please run https://github.com/tynany/frr_exporter/releases/tag/v1.1.3-rc1 and copy the output here? I have set up better logging.

From the dashboard you provided, it looks like the scrape is occasionally successful. Is that correct?

Thanks

dswarbrick commented 2 years ago

I can see nothing wrong with the outputted JSON, and it even unmarshals totally fine.

I'm presuming that the error is being caught and logged by this code in collectBGP:

    if err := processBGPSummary(ch, jsonBGPSum, AFI, SAFI, logger, desc); err != nil {
        return cmdOutputProcessError(cmd, string(jsonBGPSum), err)
    }

However, there are multiple json.Unmarshal() invocations within processBGPSummary. The first is merely:

    var jsonMap map[string]bgpProcess
    if err := json.Unmarshal(jsonBGPSum, &jsonMap); err != nil {
        return err
    }

I think we can rule that out, since testing that with the reported command output is fine.

But a few lines later, getBGPPeerDesc is called, which runs a completely separate FRR command (show bgp vrf all neighbors json).

    if *bgpPeerTypes || *bgpPeerDescs {
        peerDescJSON, peerDescText, err = getBGPPeerDesc(logger)
        if err != nil {
            return err
        }
    }

That function calls processBGPPeerDesc, which contains multiple json.Unmarshal calls. I suspect one of those is failing and its error is bubbling all the way up the call stack. The raw command output passed to the unmarshal in processBGPPeerDesc is not logged, so I can't say for certain why it would fail.

tynany commented 2 years ago

@dswarbrick Thanks for looking into it!

I agree with you and have set up better logging in https://github.com/tynany/frr_exporter/pull/80 (https://github.com/tynany/frr_exporter/releases/tag/v1.1.3-rc1) to capture the input of the failing unmarshal.

I will take this opportunity to redo logging as the error returned in the PR is misleading. I will likely ask for your review.

dswarbrick commented 2 years ago

@tynany It probably wouldn't hurt to sprinkle some debug logging of such command outputs, which can be left in there permanently and optionally viewed by end users, depending on their configured log level. This can really help when trying to troubleshoot issues such as this, where we don't entirely trust the output of some external source that we need to parse.
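As a rough illustration of that kind of permanent, level-gated debug logging, here is a minimal sketch. It assumes the standard library's log/slog package (Go 1.21+) rather than whatever logging package frr_exporter actually uses, and logCmdOutput and render are hypothetical names made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// logCmdOutput records the raw output of an external command at debug level.
// Hypothetical helper for illustration; it is not part of frr_exporter.
func logCmdOutput(logger *slog.Logger, cmd string, output []byte) {
	logger.Debug("command output", "cmd", cmd, "bytes", len(output), "output", string(output))
}

// render builds a logger at the named level ("debug", "info", ...), logs one
// sample command output through it, and returns whatever was written.
func render(levelName string) (string, error) {
	var level slog.Level
	if err := level.UnmarshalText([]byte(levelName)); err != nil {
		return "", fmt.Errorf("invalid log level %q: %w", levelName, err)
	}
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: level}))
	logCmdOutput(logger, "show bgp vrf all neighbors json", []byte(`{"default":{}}`))
	return buf.String(), nil
}

func main() {
	for _, name := range []string{"info", "debug"} {
		out, err := render(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("level=%s wrote %d bytes of log output\n", name, len(out))
	}
}
```

At the info level the call costs almost nothing and writes nothing; at the debug level the full command output ends up in the log, which is what would have been needed to spot the problem here.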

I would also recommend wrapping errors with some context, rather than just returning them raw. For example, this kind of thing would help identify where errors have sprung up from, especially if they're buried multiple functions deep, e.g. in processBGPPeerDesc:

    var neighborData bgpBGPNeighbor
    if err := json.Unmarshal(neighborValues, &neighborData); err != nil {
        return nil, nil, fmt.Errorf("unable to parse BGP neighbor data: %w", err)
    }

This will preserve (wrap) the original error thrown by json.Unmarshal, but provides some user-friendly context as well.
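To show what the wrapping buys the caller, here is a small self-contained sketch. parseNeighbor, collect and isSyntaxError are hypothetical stand-ins, not functions from frr_exporter; the point is only that each layer adds context with %w while errors.As can still find the original json.SyntaxError.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseNeighbor is a hypothetical inner function that wraps the raw
// json.Unmarshal error with context about what was being parsed.
func parseNeighbor(raw []byte) (map[string]any, error) {
	var data map[string]any
	if err := json.Unmarshal(raw, &data); err != nil {
		return nil, fmt.Errorf("unable to parse BGP neighbor data: %w", err)
	}
	return data, nil
}

// collect is a hypothetical outer function that adds its own layer of context.
func collect(raw []byte) error {
	if _, err := parseNeighbor(raw); err != nil {
		return fmt.Errorf("collector scrape failed: %w", err)
	}
	return nil
}

// isSyntaxError reports whether a JSON syntax error is anywhere in err's chain.
func isSyntaxError(err error) bool {
	var syntaxErr *json.SyntaxError
	return errors.As(err, &syntaxErr)
}

func main() {
	// Deliberately truncated input, similar in spirit to the failing output.
	err := collect([]byte(`{"remoteAs":65224`))
	fmt.Println(err)
	fmt.Println("JSON syntax error in chain:", isSyntaxError(err))
}
```

The printed message names every layer the error passed through, so a reader of the log can tell which parse step failed without stepping through the code.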

33Fraise33 commented 2 years ago

Hello,

Here is the new log running 1.1.3-rc1

frr_exporter.log

dswarbrick commented 2 years ago

So, as suspected, the JSON output of show bgp vrf all neighbors json has been truncated, ending abruptly with:

    "neighborCapabilities":{
      "4byteAs":"advertisedAndReceived",
      "extendedMessage":"advertised",
      "addPath":{
        "ipv6Unicast":{
          "rxAdvertised":true

@33Fraise33 Are you able to run that command manually from vtysh, to verify whether FRR is truncating it before frr_exporter even reads it?
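For anyone following along, input that stops mid-object like this is exactly what makes json.Unmarshal report "unexpected end of JSON input". The sketch below reproduces that with a shortened payload and shows one possible guard using json.Valid; validateOutput and tail are hypothetical helpers written for this example, not frr_exporter code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tail returns at most the last n bytes of b as a string.
func tail(b []byte, n int) string {
	if len(b) > n {
		b = b[len(b)-n:]
	}
	return string(b)
}

// validateOutput is a hypothetical guard that rejects invalid JSON and reports
// the size and the final bytes of the output, which helps spot truncation.
func validateOutput(cmd string, out []byte) error {
	if json.Valid(out) {
		return nil
	}
	return fmt.Errorf("output of %q is not valid JSON (%d bytes, ends with %q)",
		cmd, len(out), tail(out, 24))
}

func main() {
	const cmd = "show bgp vrf all neighbors json"
	// Shortened stand-in for the truncated output; the closing braces are missing.
	truncated := []byte(`{"addPath":{"ipv6Unicast":{"rxAdvertised":true`)

	var v map[string]any
	fmt.Println("unmarshal:", json.Unmarshal(truncated, &v))
	fmt.Println("truncated:", validateOutput(cmd, truncated))

	complete := append(append([]byte{}, truncated...), "}}}"...)
	fmt.Println("complete: ", validateOutput(cmd, complete))
}
```

Reporting the byte count and the tail of the output makes a truncated response recognisable from the log line alone.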

33Fraise33 commented 2 years ago

Hi,

When I run the command manually I do not notice this behaviour. I seemingly get the full output every time. neighbors.log

tynany commented 2 years ago

@33Fraise33 what happens if you run FRR Exporter with the --frr.vtysh flag? I want to see if the issue is only present when using the Unix socket

tynany commented 2 years ago

@33Fraise33 I have unsuccessfully tried reproducing this issue by configuring many established BGP peers on both Cumulus Linux 4.4.4 and Debian running FRR 8.3. The output of show bgp vrf all neighbors json is ~4,000 lines, which is greater than the ~2,000 from your log.

"I have the same issue on a clean debian host with FRR 8.3"

Can you please provide the exact steps, ideally without any established peers?

33Fraise33 commented 2 years ago

Hello,

I am willing to provide you with access to my personal server, where I have this issue. Can you contact me? There you can test what I encounter. We also have this in production at the company I work at, so it does not seem to be limited to my specific setup.

Contact detail: github+tynany @ frai.se

tynany commented 2 years ago

Thanks. I have just reached out.

tynany commented 2 years ago

Thanks for granting me access to troubleshoot, as it helped me determine the cause.

FRR Exporter was prematurely returning data read from the FRR socket due to this line. The point of return in the text output was different on each run, which is odd, but I imagine FRR is returning \x00 to split up large data.

Luckily, no code changes are needed as @dswarbrick refactored how FRR Exporter interfaces with the FRR socket (https://github.com/tynany/frr_exporter/pull/81), fixing the issue.

The issue is no longer present in v1.1.3.
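For readers curious about this class of bug, the sketch below shows in general terms how a reader that stops too early ends up with a truncated response. It is not the code from the linked PR: chunkedReader simulates a stream socket that delivers data in pieces, and the single trailing NUL terminator is an assumption made purely for illustration, not a statement about the FRR socket protocol.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// chunkedReader simulates a stream socket that hands back a response in
// several small pieces rather than all at once.
type chunkedReader struct {
	data  []byte
	chunk int
}

func (r *chunkedReader) Read(p []byte) (int, error) {
	if len(r.data) == 0 {
		return 0, io.EOF
	}
	n := r.chunk
	if n > len(r.data) {
		n = len(r.data)
	}
	if n > len(p) {
		n = len(p)
	}
	copy(p, r.data[:n])
	r.data = r.data[n:]
	return n, nil
}

// readOnce treats the result of a single Read as the whole response,
// which silently truncates anything delivered in more than one piece.
func readOnce(r io.Reader) (string, error) {
	buf := make([]byte, 4096)
	n, err := r.Read(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// readUntilNUL keeps reading until the assumed NUL terminator arrives and
// returns the response without it.
func readUntilNUL(r io.Reader) (string, error) {
	line, err := bufio.NewReader(r).ReadBytes(0)
	if err != nil {
		return "", fmt.Errorf("response ended before terminator: %w", err)
	}
	return string(line[:len(line)-1]), nil
}

func main() {
	response := `{"default":{"peers":{}}}` + "\x00"

	partial, _ := readOnce(&chunkedReader{data: []byte(response), chunk: 10})
	full, _ := readUntilNUL(&chunkedReader{data: []byte(response), chunk: 10})

	fmt.Printf("single read: %q\n", partial)
	fmt.Printf("until NUL:   %q\n", full)
}
```

Because the size of each piece depends on timing and buffering, the cut-off point of the early-stopping reader can differ between runs, which matches the behaviour seen in this issue.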