Exa-Networks / exabgp

The BGP swiss army knife of networking

Repeated announce or withdraw for the same prefix crashes the API #56

Closed pwbristow closed 11 years ago

pwbristow commented 11 years ago

You can cause the BGP process to go into the weeds with the following. If you need anything else let me know.

CentOS 6.4
python-2.6.6-37.el6_4.x86_64
bird-1.3.11-1.x86_64
exabgp-3.2.13

Peer Bird config

router id 172.16.112.21;
protocol device { scan time 10; }
protocol direct { interface "eth0"; }
protocol bgp {
        multihop 1;
        local as 56673;
        neighbor 10.0.0.1 as 56673;
        source address 10.0.0.0;
        rr client;
        import all;
        passive;
}

ExaBGP config

neighbor 10.0.0.0 {
        router-id 10.0.0.1;
        local-address 10.0.0.1;
        local-as 56673;
        peer-as 56673;
        hold-time 180;

        process service-dynamic {
                run /bin/cat /tmp/script;
        }
}
cat /tmp/apiscript
#!/bin/sh
sleep 10
echo "announce route 172.116.10.0/24 next-hop 10.0.0.1 community [123:321]"
sleep 2
echo "announce route 172.116.10.0/24 next-hop 10.0.0.1 community [123:321 123:322]"
sleep 2

#Either of the following lines break things
#echo "withdraw route 172.116.10.0/24 next-hop 10.0.0.1 community [123:321 123:322]"
#echo "withdraw route 172.116.10.0/24 next-hop 10.0.0.1" 
sleep 100
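
For anyone who would rather drive the same sequence from Python, here is a minimal sketch of an equivalent helper. It assumes, as the echo-based script above implies, that ExaBGP reads one command per line from the helper's standard output. The names `build_commands` and `emit` are hypothetical and not part of ExaBGP.

```python
import sys
import time

PREFIX = "172.116.10.0/24"
NEXT_HOP = "10.0.0.1"


def build_commands(withdraw_with_attributes=False):
    """Return (delay_seconds, command) pairs mirroring the shell script."""
    announce = "announce route %s next-hop %s" % (PREFIX, NEXT_HOP)
    withdraw = "withdraw route %s next-hop %s" % (PREFIX, NEXT_HOP)
    if withdraw_with_attributes:
        # First withdraw variant from the report: attributes repeated.
        withdraw += " community [123:321 123:322]"
    return [
        (10, announce + " community [123:321]"),
        (2, announce + " community [123:321 123:322]"),
        (2, withdraw),
    ]


def emit(steps, sleep=time.sleep, out=sys.stdout):
    """Wait for each delay, then write the command as one line."""
    for delay, command in steps:
        sleep(delay)
        out.write(command + "\n")
        # Flush so the line is not held back in a pipe buffer.
        out.flush()
```

To use it as the helper process, call `emit(build_commands())` at the end of the file and then keep the process alive, as the trailing `sleep 100` does in the shell version.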

Wed, 02 Oct 2013 07:45:15 | INFO | 64075 | reactor | New Peer neighbor 10.0.0.0 local-ip 10.0.0.1 local-as 56673 peer-as 56673 router-id 10.0.0.1 family-allowed in-open
Wed, 02 Oct 2013 07:45:15 | WARNING | 64075 | configuration | Loaded new configuration successfully
Wed, 02 Oct 2013 07:45:15 | INFO | 64075 | processes | Forked process service-dynamic
Wed, 02 Oct 2013 07:45:16 | INFO | 64075 | network | Connected to peer neighbor 10.0.0.0 local-ip 10.0.0.1 local-as 56673 peer-as 56673 router-id 10.0.0.1 family-allowed in-open (out)
Wed, 02 Oct 2013 07:45:25 | INFO | 64075 | processes | Command from process service-dynamic : announce route 172.116.10.0/24 next-hop 10.0.0.1 community [123:321]
Wed, 02 Oct 2013 07:45:25 | INFO | 64075 | reactor | Route added to neighbor 10.0.0.0 local-ip 10.0.0.1 local-as 56673 peer-as 56673 router-id 10.0.0.1 family-allowed in-open : 172.116.10.0/24 next-hop 10.0.0.1 community 123:321
Wed, 02 Oct 2013 07:45:26 | INFO | 64075 | reactor | Performing dynamic route update
Wed, 02 Oct 2013 07:45:26 | INFO | 64075 | reactor | Updated peers dynamic routes successfully
Wed, 02 Oct 2013 07:45:27 | INFO | 64075 | processes | Command from process service-dynamic : announce route 172.116.10.0/24 next-hop 10.0.0.1 community [123:321 123:322]
Wed, 02 Oct 2013 07:45:27 | INFO | 64075 | reactor | Route added to neighbor 10.0.0.0 local-ip 10.0.0.1 local-as 56673 peer-as 56673 router-id 10.0.0.1 family-allowed in-open : 172.116.10.0/24 next-hop 10.0.0.1 community [ 123:321 123:322 ]
Wed, 02 Oct 2013 07:45:28 | INFO | 64075 | reactor | Performing dynamic route update
Wed, 02 Oct 2013 07:45:28 | INFO | 64075 | reactor | Updated peers dynamic routes successfully
Wed, 02 Oct 2013 07:45:29 | INFO | 64075 | processes | Command from process service-dynamic : withdraw route 172.116.10.0/24 next-hop 10.0.0.1
Wed, 02 Oct 2013 07:45:29 | INFO | 64075 | reactor | Route found and removed : 172.116.10.0/24 next-hop 10.0.0.1
Wed, 02 Oct 2013 07:45:30 | INFO | 64075 | reactor | Performing dynamic route update
Wed, 02 Oct 2013 07:45:30 | INFO | 64075 | reactor | Updated peers dynamic routes successfully
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | reactor | peer 10.0.0.0 ASN 56673 UNHANDLED PROBLEM, please report
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | reactor | peer 10.0.0.0 ASN 56673 <type 'exceptions.KeyError'>
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | reactor | peer 10.0.0.0 ASN 56673 '\x00\x00\x00\x00\x18\xact\n'
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | Traceback (most recent call last):
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | File "/srv/exabgp-3.2.13/lib/exabgp/reactor/peer.py", line 504, in _run
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | for action in self._main(direction):
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | File "/srv/exabgp-3.2.13/lib/exabgp/reactor/peer.py", line 471, in _main
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | new_routes.next()
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | File "/srv/exabgp-3.2.13/lib/exabgp/reactor/protocol.py", line 258, in new_update
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | for update in self.neighbor.rib.outgoing.updates(self.neighbor.group_updates):
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | File "/srv/exabgp-3.2.13/lib/exabgp/rib/store.py", line 196, in updates
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | del dict_nlri[nlri_index]
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | | KeyError: '\x00\x00\x00\x00\x18\xact\n'
Wed, 02 Oct 2013 07:45:30 | ERROR | 64075 | |
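
The key in the KeyError looks opaque, but its bytes can be read by hand: 0x18 is 24, and `\xac`, `t`, `\n` are 172, 116, 10, which matches the withdrawn prefix 172.116.10.0/24. The sketch below decodes it and then reproduces the same exception class with a plain dict. The layout it assumes (four leading bytes, one mask byte, then the packed prefix) is inferred from the bytes in the log, not taken from ExaBGP's source. The `pending` dict is a toy stand-in, not the real code in store.py, and `decode_index` is a hypothetical helper. It is written for Python 3, whereas the report ran on Python 2.6.

```python
import socket


def decode_index(index):
    """Split the logged key into (leading bytes, 'a.b.c.d/len')."""
    leading = index[:4]
    mask = index[4]        # 0x18 -> 24
    packed = index[5:]     # only the significant prefix bytes
    # Pad to four bytes so inet_ntoa accepts the address.
    padded = packed + b"\x00" * (4 - len(packed))
    return leading, "%s/%d" % (socket.inet_ntoa(padded), mask)


key = b"\x00\x00\x00\x00\x18\xact\n"
leading, prefix = decode_index(key)

# Toy illustration only: removing the same index twice from a dict
# raises KeyError, the exception class seen in the traceback.
pending = {key: "withdraw"}
del pending[key]
try:
    del pending[key]
except KeyError as exc:
    repeated = exc.args[0]

# A tolerant removal returns a default instead of raising.
pending.pop(key, None)
```

This only shows that the failing key is the route from the script and that a second `del` on an already removed entry fails this way. It does not establish what the actual fix in ExaBGP was.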

thomas-mangin commented 11 years ago

Hi Pete, could you check whether the last commit, which fixed another issue, also fixed this one as a side effect? I cannot reproduce it.

tsuna commented 10 years ago

I can confirm that exabgp 3.3.0 fixed this issue. I experienced it too on 3.2.13 and the problem went away after upgrading.

thomas-mangin commented 10 years ago

Thank you very much for the feedback, much appreciated.