Closed thomas955 closed 1 year ago
The code did not handle an unknown TLV type and looped forever. On master it is now slightly better: it kills the session instead.
./sbin/exabgp decode -d "0000 00B7 900E 0095 4004 4704 0A0B 0D01 0000 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 2200 0000 0000 0000 0000 0000 0000 0100 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 0000 0000 0000 0000 0000 0000 0000 0140 0101 0040 0200 4005 0400 0000 6480 1D0D 0483 0004 0000 0000 0492 0001 20
"
parser parsing UPDATE ( 187) 0000 00B7 900E 0095 4004 4704 0A0B 0D01 0000 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 2200 0000 0000 0000 0000 0000 0000 0100 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 0000 0000 0000 0000 0000 0000 0000 0140 0101 0040 0200 4005 0400 0000 6480 1D0D 0483 0004 0000 0000 0492 0001 20
parser attribute mp-reach-nlri flag 0x90 type 0x0e len 0x95 payload 4004 4704 0A0B 0D01 0000 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 2200 0000 0000 0000 0000 0000 0000 0100 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 0000 0000 0000 0000 0000 0000 0000 01
parser NLRI bgp-ls bgp-ls without path-information payload 0004 0042 0200 0000 0000 0000 0001 0000 1A02 0000 0400 0000 0202 0100 0400 0000 0002 0300 0600 0000 0000 0101 0700 0200 0201 0900 1180 2022 0000 0000 0000 0000 0000 0000 0001 0004 0042 0200 0000 0000 0000 0001 0000 1A02 0000 0400 0000 0202 0100 0400 0000 0002 0300 0600 0000 0000 0101 0700 0200 0201 0900 1180 2000 0000 0000 0000 0000 0000 0000 0001
invalid payload
It now reports that part of the TLV could not be parsed instead.
❯ ./sbin/exabgp decode "0000 00B7 900E 0095 4004 4704 0A0B 0D01 0000 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 2200 0000 0000 0000 0000 0000 0000 0100 0400 4202 0000 0000 0000 0000 0100 001A 0200 0004 0000 0002 0201 0004 0000 0000 0203 0006 0000 0000 0001 0107 0002 0002 0109 0011 8020 0000 0000 0000 0000 0000 0000 0000 0140 0101 0040 0200 4005 0400 0000 6480 1D0D 0483 0004 0000 0000 0492 0001 20
"
unknown prefix v6 TLV 263
unknown prefix v6 TLV 263
{ "exabgp": "5.0.0", "time": 1668781676.4027362, "host" : "MacBook-Pro-2.local", "pid" : 72678, "ppid" : 65389, "counter": 1, "type": "update", "neighbor": { "address": { "local": "127.0.0.1", "peer": "127.0.0.1" }, "asn": { "local": 65533, "peer": 65533 } , "direction": "in", "message": { "update": { "attribute": { "origin": "igp", "local-preference": 100, "bgp-ls": { "prefix-metric": 0, "sr-prefix-attribute-flags": {"X": 0, "R": 0, "N": 1, "RSV": 0} } }, "announce": { "bgp-ls bgp-ls": { "10.11.13.1": [ { "ls-nlri-type": "bgpls-prefix-v6", "l3-routing-topology": 0, "protocol-id": 2, "node-descriptors": [ { "autonomous-system": 2 }, { "bgp-ls-identifier": "0" }, { "router-id": "000000000001" } ], "ip-reachability-tlv": "2022::1", "ip-reach-prefix": "2022::1/128", "nexthop": "10.11.13.1" }, { "ls-nlri-type": "bgpls-prefix-v6", "l3-routing-topology": 0, "protocol-id": 2, "node-descriptors": [ { "autonomous-system": 2 }, { "bgp-ls-identifier": "0" }, { "router-id": "000000000001" } ], "ip-reachability-tlv": "2000::1", "ip-reach-prefix": "2000::1/128", "nexthop": "10.11.13.1" } ] } } } } } }
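The change in behaviour above (an unknown TLV no longer stalls the parser, it is reported and skipped) can be illustrated with a minimal sketch. This is not ExaBGP's actual code: `walk_tlvs`, `parse_prefix_descriptors` and the `KNOWN` table are hypothetical names, and the input is the prefix-descriptor part of the first NLRI copied from the hex dump in this issue. The key point is that the offset always moves forward by the declared length, whether or not the type is recognised.

```python
import struct

def walk_tlvs(data):
    """Yield (type, value) for each TLV; the offset always advances."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("TLV %d claims %d bytes, fewer remain" % (tlv_type, length))
        yield tlv_type, data[offset:offset + length]
        # Advance unconditionally so an unrecognised type cannot stall the loop.
        offset += length

# Hypothetical table of handled types; 265 is the IP reachability TLV.
KNOWN = {265: "ip-reachability"}

def parse_prefix_descriptors(data):
    """Split TLVs into handled ones and a list of unknown type codes."""
    known, unknown = [], []
    for tlv_type, value in walk_tlvs(data):
        if tlv_type in KNOWN:
            known.append((KNOWN[tlv_type], value))
        else:
            unknown.append(tlv_type)  # report it and carry on
    return known, unknown

# Bytes taken from the UPDATE in this issue: TLV 263 (2 bytes), TLV 265 (17 bytes).
descriptors = bytes.fromhex(
    "0107 0002 0002"
    "0109 0011 80 2022 0000 0000 0000 0000 0000 0000 0001"
)

known, unknown = parse_prefix_descriptors(descriptors)
for code in unknown:
    print("unknown prefix v6 TLV", code)
```

Raising on a truncated header or an over-long length keeps malformed input from being silently misread; whether that should tear down the session or only drop the NLRI is a separate policy decision.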
As I did not author that code and have no time right now to look at the RFC to figure out what is happening, I think it is a good compromise.
A Multi-Topology Identifier is being passed to a class which did not expect it. If it is valid, the class should be extended to allow it.
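As a rough idea of what "extending the class" could look like, here is a sketch built around a type-to-decoder table. It is not ExaBGP's class layout: every name below is hypothetical, and the MT-ID layout (2 octets per entry, 4 reserved bits followed by a 12-bit identifier) is an assumption that should be checked against the RFC before relying on it. The sample bytes are again the prefix descriptors from the UPDATE in this issue.

```python
import ipaddress
import struct

def decode_mt_ids(value):
    # Assumption: each entry is 2 octets, 4 reserved bits then a 12-bit MT-ID.
    if len(value) % 2:
        raise ValueError("MT-ID TLV length must be even")
    return [raw & 0x0FFF for (raw,) in struct.iter_unpack("!H", value)]

def decode_ipv6_reach(value):
    # One octet of prefix length, then just enough octets to hold the prefix.
    if not value:
        raise ValueError("empty IP reachability TLV")
    plen = value[0]
    if plen > 128 or len(value) != 1 + (plen + 7) // 8:
        raise ValueError("bad IP reachability TLV length")
    packed = value[1:].ljust(16, b"\x00")
    return "%s/%d" % (ipaddress.IPv6Address(packed), plen)

def decode_descriptors(data, decoders):
    """Decode known TLVs by table lookup; collect unknown type codes."""
    result, unknown, offset = {}, [], 0
    while offset + 4 <= len(data):
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        value = data[offset + 4:offset + 4 + length]
        if len(value) != length:
            raise ValueError("truncated TLV %d" % tlv_type)
        if tlv_type in decoders:
            name, decode = decoders[tlv_type]
            result[name] = decode(value)
        else:
            unknown.append(tlv_type)
        offset += 4 + length
    return result, unknown

# Hypothetical tables: the base one only knows TLV 265, the extended one adds 263.
BASE = {265: ("ip-reach-prefix", decode_ipv6_reach)}
EXTENDED = dict(BASE)
EXTENDED[263] = ("multi-topology-ids", decode_mt_ids)

descriptors = bytes.fromhex(
    "0107 0002 0002"
    "0109 0011 80 2022 0000 0000 0000 0000 0000 0000 0001"
)

print(decode_descriptors(descriptors, BASE))
print(decode_descriptors(descriptors, EXTENDED))
```

With a table like this, accepting a new TLV is a one-line registration plus its decoder, and anything still unregistered is reported without affecting the TLVs that follow it.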
https://www.rfc-editor.org/rfc/rfc7752.html#section-3.3.3 does not list an MT-ID, and as I was not the primary author of the BGP-LS code, I cannot see why this should be valid. If this was changed by a later RFC, I would need some pointers please.
I am closing but feel free to re-open if you can point me in the right direction.
Bug Report
commit a33b3ce7a2e5082009a84e12e779e44096156ebf (HEAD -> 4.2, origin/4.2)
Describe the bug: ExaBGP hangs (stalls) when it receives IPv6 interface information distributed from IS-IS over the BGP-LS address family.
To reproduce it, I used a Cisco IOS XR router and added an IPv6 address such as 2000::1/128 on Loopback 0. Here are the most important parts of the Cisco config, shortened:
Now the straightforward ExaBGP config:
And finally the debug output from ExaBGP:
After this it hangs in some kind of endless loop: the CLI is no longer accessible and nothing more is logged.
Thank you in advance.