Exa-Networks / exabgp

The BGP swiss army knife of networking

Unable to announce routes using exabgpcli with multiple configuration files. #1211

Closed milhauzindahauz closed 3 weeks ago

milhauzindahauz commented 3 weeks ago

Describe the bug Running exabgp, I am not able to announce routes to certain neighbors. In this example it's neighbor 10.90.0.3

I had a look at issues

To Reproduce I am running exabgp as follows:

exabgp  conf/10.90.0.2.conf conf/10.90.0.3.conf

config diff

exabgp --de
# exabgp.api.ack=false

I am trying to announce routes to the neighbors using exabgpcli.

exabgpcli neighbor 10.90.0.2 announce route 12.0.0.0/24 next-hop self
exabgpcli neighbor 10.90.0.3 announce route 13.0.0.0/24 next-hop self

10.90.0.2.conf

process control_api {
    run ~/dev/agent_poc/venv312/bin/python ~/dev/agent_poc/speaker/notifier.py;
    encoder json;
}
template  {
    neighbor local {
        local-address 10.90.0.1;    # Our local update-source
        local-as 65000;        # Our local AS
        api  {
            processes [control_api];
            neighbor-changes;
            receive {
                packets;
                update;
                keepalive;
                notification;
                open;
            }
            send {
                packets;
                update;
                keepalive;
                notification;
                open;
            }
        }
    }
}
neighbor 10.90.0.2 {
    inherit local;

    peer-as 65001;
    router-id 10.90.0.2;
}

10.90.0.3.conf

process control_api {
    run ~/dev/agent_poc/venv312/bin/python ~/dev/agent_poc/speaker/notifier.py;
    encoder json;
}
template  {
    neighbor local {
        local-address 10.90.0.1;    # Our local update-source
        local-as 65000;        # Our local AS
        api  {
            processes [control_api];
            neighbor-changes;
            receive {
                packets;
                update;
                keepalive;
                notification;
                open;
            }
            send {
                packets;
                update;
                keepalive;
                notification;
                open;
            }
        }
    }
}
neighbor 10.90.0.3 {
    inherit local;

    peer-as 65002;
    router-id 10.90.0.3;
}

notifier.py

import socket
import sys

if __name__ == "__main__":
    try:
        while True:
            line = sys.stdin.readline().strip()

            if not any(line):
                sys.exit(0)
            try:
                with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                    s.connect(("", 1790))
                    s.sendall((line + "\n").encode())
            except Exception:
                exit(5)

    except KeyboardInterrupt:
        pass

Expected behavior Announcing a route to a neighbor with exabgpcli should succeed.

Environment (please complete the following information):

DEBUG OUTPUT

exabgp  conf/10.90.0.2.conf conf/10.90.0.3.conf -d
12:04:19 | 89047  | welcome       | Thank you for using ExaBGP
12:04:19 | 89047  | version       | 4.2.22
12:04:19 | 89047  | interpreter   | 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)]
12:04:19 | 89047  | welcome       | Thank you for using ExaBGP
12:04:19 | 89047  | version       | 4.2.22
12:04:19 | 89047  | interpreter   | 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)]
12:04:19 | 89047  | os            | Linux fedora-laptop 6.8.11-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Mon May 27 14:53:33 UTC 2024 x86_64
12:04:19 | 89047  | installation  | ~/dev/agent_poc/venv312
12:04:19 | 89047  | os            | Linux fedora-laptop 6.8.11-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Mon May 27 14:53:33 UTC 2024 x86_64
12:04:19 | 89047  | installation  | ~/dev/agent_poc/venv312
12:04:19 | 89047  | cli control   | named pipes for the cli are:
12:04:19 | 89047  | cli control   | to send commands  ~/dev/agent_poc/venv312/var/run/exabgp/exabgp.in
12:04:19 | 89047  | cli control   | to read responses ~/dev/agent_poc/venv312/var/run/exabgp/exabgp.out
12:04:19 | 89047  | cli control   | named pipes for the cli are:
12:04:19 | 89047  | cli control   | to send commands  ~/dev/agent_poc/venv312/var/run/exabgp/exabgp.in
12:04:19 | 89047  | cli control   | to read responses ~/dev/agent_poc/venv312/var/run/exabgp/exabgp.out
12:04:19 | 89047  | configuration | performing reload of exabgp 4.2.22
12:04:19 | 89047  | configuration | > process          | 'control_api'
12:04:19 | 89047  | configuration | . run              | '~/dev/agent_poc/venv312/bin/python' '~/dev/agent_poc/speaker/notifier.py'
12:04:19 | 89047  | configuration | . encoder          | 'json'
12:04:19 | 89047  | configuration | < process          |
12:04:19 | 89047  | configuration | > template         |
12:04:19 | 89047  | configuration | > neighbor         | 'local'
12:04:19 | 89047  | configuration | . local-address    | '10.90.0.1'
12:04:19 | 89047  | configuration | . local-as         | '65000'
12:04:19 | 89047  | configuration | > api              |
12:04:19 | 89047  | configuration | . processes        | '[' 'control_api' ']'
12:04:19 | 89047  | configuration | . neighbor-changes |
12:04:19 | 89047  | configuration | > receive          |
12:04:19 | 89047  | configuration | performing reload of exabgp 4.2.22
12:04:19 | 89047  | configuration | . packets          |
12:04:19 | 89047  | configuration | . update           |
12:04:19 | 89047  | configuration | . keepalive        |
12:04:19 | 89047  | configuration | . notification     |
12:04:19 | 89047  | configuration | . open             |
12:04:19 | 89047  | configuration | < receive          |
12:04:19 | 89047  | configuration | > send             |
12:04:19 | 89047  | configuration | . packets          |
12:04:19 | 89047  | configuration | . update           |
12:04:19 | 89047  | configuration | > process          | 'control_api'
12:04:19 | 89047  | configuration | . keepalive        |
12:04:19 | 89047  | configuration | . notification     |
12:04:19 | 89047  | configuration | . run              | '~/dev/agent_poc/venv312/bin/python' '~/dev/agent_poc/speaker/notifier.py'
12:04:19 | 89047  | configuration | . open             |
12:04:19 | 89047  | configuration | < send             |
12:04:19 | 89047  | configuration | < api              |
12:04:19 | 89047  | configuration | . encoder          | 'json'
12:04:19 | 89047  | configuration | < neighbor         |
12:04:19 | 89047  | configuration | < template         |
12:04:19 | 89047  | configuration | < process          |
12:04:19 | 89047  | configuration | > neighbor         | '10.90.0.2'
12:04:19 | 89047  | configuration | > template         |
12:04:19 | 89047  | configuration | > neighbor         | 'local'
12:04:19 | 89047  | configuration | . inherit          | 'local'
12:04:19 | 89047  | configuration | . local-address    | '10.90.0.1'
12:04:19 | 89047  | configuration | . peer-as          | '65001'
12:04:19 | 89047  | configuration | . router-id        | '10.90.0.2'
12:04:19 | 89047  | configuration | . local-as         | '65000'
12:04:19 | 89047  | configuration | > api              |
12:04:19 | 89047  | configuration | . processes        | '[' 'control_api' ']'
12:04:19 | 89047  | configuration | . neighbor-changes |
12:04:19 | 89047  | configuration | > receive          |
12:04:19 | 89047  | configuration | . packets          |
12:04:19 | 89047  | configuration | . update           |
12:04:19 | 89047  | configuration | . keepalive        |
12:04:19 | 89047  | configuration | . notification     |
12:04:19 | 89047  | configuration | . open             |
12:04:19 | 89047  | configuration | < receive          |
12:04:19 | 89047  | configuration | > send             |
12:04:19 | 89047  | configuration | . packets          |
12:04:19 | 89047  | configuration | < neighbor         |
12:04:19 | 89047  | configuration | . update           |
12:04:19 | 89047  | configuration | . keepalive        |
12:04:19 | 89047  | configuration | . notification     |
12:04:19 | 89047  | configuration | . open             |
12:04:19 | 89047  | configuration | < send             |
12:04:19 | 89047  | configuration | < api              |
12:04:19 | 89047  | configuration | < neighbor         |
12:04:19 | 89047  | configuration | < template         |
12:04:19 | 89047  | configuration | > neighbor         | '10.90.0.3'
12:04:19 | 89047  | reactor       | new peer: neighbor 10.90.0.2 local-ip 10.90.0.1 local-as 65000 peer-as 65001 router-id 10.90.0.2 family-allowed in-open
12:04:19 | 89047  | configuration | . inherit          | 'local'
12:04:19 | 89047  | configuration | . peer-as          | '65002'
12:04:19 | 89047  | reactor       | loaded new configuration successfully
12:04:19 | 89047  | configuration | . router-id        | '10.90.0.3'
12:04:19 | 89047  | configuration | < neighbor         |
12:04:19 | 89047  | reactor       | new peer: neighbor 10.90.0.3 local-ip 10.90.0.1 local-as 65000 peer-as 65002 router-id 10.90.0.3 family-allowed in-open
12:04:19 | 89047  | reactor       | loaded new configuration successfully
12:04:19 | 89047  | process       | forked process control_api
12:04:19 | 89047  | process       | forked process control_api
12:04:19 | 89047  | process       | forked process api-internal-cli-777d4bf6
12:04:19 | 89047  | process       | forked process api-internal-cli-777d3382
12:04:19 | 89047  | reactor       | initialising connection to peer-1
12:04:19 | 89047  | reactor       | initialising connection to peer-1
12:04:19 | 89047  | outgoing-1    | attempting connection to 10.90.0.3:179
12:04:19 | 89047  | outgoing-1    | attempting connection to 10.90.0.2:179
12:04:19 | 89047  | outgoing-1    | sending TCP payload ( 177) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 00B1 0104 FDE8 00B4 0A5A 0002 9402 0601 0400 0100 0102 0601 0400 0100 0202 0601 0400 0100 0402 0601 0400 0100 8002 0601 0400 0100 8402 0601 0400 0100 8502 0601 0400 0100 8602 0601 0400 0200 0102 0601 0400 0200 0202 0601 0400 0200 0402 0601 0400 0200 8002 0601 0400 0200 8502 0601 0400 0200 8602 0601 0400 1900 4102 0601 0400 1900 4602 0601 0440 0400 4702 0601 0440 0400 4802 0641 0400 00FD E802 0206 00
12:04:19 | 89047  | outgoing-1    | sending TCP payload ( 177) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 00B1 0104 FDE8 00B4 0A5A 0003 9402 0601 0400 0100 0102 0601 0400 0100 0202 0601 0400 0100 0402 0601 0400 0100 8002 0601 0400 0100 8402 0601 0400 0100 8502 0601 0400 0100 8602 0601 0400 0200 0102 0601 0400 0200 0202 0601 0400 0200 0402 0601 0400 0200 8002 0601 0400 0200 8502 0601 0400 0200 8602 0601 0400 1900 4102 0601 0400 1900 4602 0601 0440 0400 4702 0601 0440 0400 4802 0641 0400 00FD E802 0206 00
12:04:19 | 89047  | outgoing-1    | >> OPEN version=4 asn=65000 hold_time=180 router_id=10.90.0.2 capabilities=[Multiprotocol(ipv4 unicast,ipv4 multicast,ipv4 nlri-mpls,ipv4 mpls-vpn,ipv4 rtc,ipv4 flow,ipv4 flow-vpn,ipv6 unicast,ipv6 multicast,ipv6 nlri-mpls,ipv6 mpls-vpn,ipv6 flow,ipv6 flow-vpn,l2vpn vpls,l2vpn evpn,bgp-ls bgp-ls,bgp-ls bgp-ls-vpn), Extended Message(65535), ASN4(65000)]
12:04:19 | 89047  | outgoing-1    | >> OPEN version=4 asn=65000 hold_time=180 router_id=10.90.0.3 capabilities=[Multiprotocol(ipv4 unicast,ipv4 multicast,ipv4 nlri-mpls,ipv4 mpls-vpn,ipv4 rtc,ipv4 flow,ipv4 flow-vpn,ipv6 unicast,ipv6 multicast,ipv6 nlri-mpls,ipv6 mpls-vpn,ipv6 flow,ipv6 flow-vpn,l2vpn vpls,l2vpn evpn,bgp-ls bgp-ls,bgp-ls bgp-ls-vpn), Extended Message(65535), ASN4(65000)]
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0068 01
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0068 01
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  85) 04FD E900 780A 5A00 024B 0206 0104 0001 0001 0202 8000 0202 0200 0202 4600 0206 4104 0000 FDE9 0202 0600 0206 4504 0001 0101 0210 490E 0C35 6639 3264 3363 3533 3731 3300 0204 4002 4078 0209 4707 0001 0180 0000 00
12:04:19 | 89047  | outgoing-1    | << message of type OPEN
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  85) 04FD EA00 780A 5A00 034B 0206 0104 0001 0001 0202 8000 0202 0200 0202 4600 0206 4104 0000 FDEA 0202 0600 0206 4504 0001 0101 0210 490E 0C32 6231 3961 6261 6539 3162 3600 0204 4002 4078 0209 4707 0001 0180 0000 00
12:04:19 | 89047  | outgoing-1    | << message of type OPEN
12:04:19 | 89047  | outgoing-1    | << OPEN version=4 asn=65001 hold_time=120 router_id=10.90.0.2 capabilities=[Multiprotocol(ipv4 unicast), Route Refresh, Extended Message(65535), Graceful Restart Flags 0x4 Time 120 , ASN4(65001), AddPath(receive ipv4 unicast), Enhanced Route Refresh, Unassigned 71, Unassigned 73, Route Refresh]
12:04:19 | 89047  | outgoing-1    | << OPEN version=4 asn=65002 hold_time=120 router_id=10.90.0.3 capabilities=[Multiprotocol(ipv4 unicast), Route Refresh, Extended Message(65535), Graceful Restart Flags 0x4 Time 120 , ASN4(65002), AddPath(receive ipv4 unicast), Enhanced Route Refresh, Unassigned 71, Unassigned 73, Route Refresh]
12:04:19 | 89047  | ka-outgoing-1 | receive-timer 60 second(s) left
12:04:19 | 89047  | ka-outgoing-1 | receive-timer 60 second(s) left
12:04:19 | 89047  | outgoing-1    | --------------------------------------------------------------------
12:04:19 | 89047  | outgoing-1    | --------------------------------------------------------------------
12:04:19 | 89047  | outgoing-1    | the connection can not carry the following family/families
12:04:19 | 89047  | outgoing-1    | the connection can not carry the following family/families
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/flow
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/flow
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/nlri-mpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/multicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/nlri-mpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/unicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/multicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for l2vpn/evpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/unicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/flow
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for l2vpn/evpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/nlri-mpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/flow
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/mpls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/nlri-mpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for bgp-ls/bgp-ls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/mpls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/mpls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for bgp-ls/bgp-ls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/rtc
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/mpls-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/multicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/rtc
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/flow-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/multicast
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for l2vpn/vpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv6/flow-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for bgp-ls/bgp-ls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for l2vpn/vpls
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/flow-vpn
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for bgp-ls/bgp-ls
12:04:19 | 89047  | outgoing-1    | therefore no routes of this kind can be announced on the connection
12:04:19 | 89047  | outgoing-1    |  - peer is not configured for ipv4/flow-vpn
12:04:19 | 89047  | outgoing-1    | --------------------------------------------------------------------
12:04:19 | 89047  | outgoing-1    | therefore no routes of this kind can be announced on the connection
12:04:19 | 89047  | outgoing-1    | --------------------------------------------------------------------
12:04:19 | 89047  | outgoing-1    | sending TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
12:04:19 | 89047  | outgoing-1    | sending TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
12:04:19 | 89047  | outgoing-1    | >> KEEPALIVE (OPENCONFIRM)
12:04:19 | 89047  | outgoing-1    | >> KEEPALIVE (OPENCONFIRM)
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
12:04:19 | 89047  | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
12:04:19 | 89047  | outgoing-1    | << message of type KEEPALIVE
12:04:19 | 89047  | outgoing-1    | << message of type KEEPALIVE
12:04:19 | 89047  | ka-outgoing-1 | receive-timer 120 second(s) left
12:04:19 | 89047  | ka-outgoing-1 | receive-timer 120 second(s) left
12:04:19 | 89047  | reactor       | connected to peer-1 with outgoing-1 10.90.0.1-10.90.0.3
12:04:19 | 89047  | reactor       | connected to peer-1 with outgoing-1 10.90.0.1-10.90.0.2
12:04:19 | 89047  | outgoing-1    | sending TCP payload (  23) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 0200 0000 00
12:04:19 | 89047  | outgoing-1    | >> EOR ipv4 unicast
12:04:19 | 89047  | outgoing-1    | sending TCP payload (  23) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 0200 0000 00
12:04:19 | 89047  | peer-1        | >> EOR(s)
12:04:19 | 89047  | outgoing-1    | >> EOR ipv4 unicast
12:04:19 | 89047  | peer-1        | >> EOR(s)
12:04:19 | 89047  | process       | process control_api ended, restarting it
12:04:19 | 89047  | process       | terminating process control_api
12:04:19 | 89047  | process       | process control_api ended, restarting it
12:04:19 | 89047  | process       | terminating process control_api
12:04:19 | 89047  | process       | forked process control_api
12:04:19 | 89047  | process       | forked process control_api
12:04:20 | 89047  | ka-outgoing-1 | receive-timer 119 second(s) left
12:04:20 | 89047  | ka-outgoing-1 | receive-timer 119 second(s) left
12:04:20 | 89047  | ka-outgoing-1 | send-timer 39 second(s) left
12:04:20 | 89047  | ka-outgoing-1 | send-timer 39 second(s) left
12:04:21 | 89047  | ka-outgoing-1 | receive-timer 118 second(s) left
12:04:21 | 89047  | ka-outgoing-1 | send-timer 38 second(s) left
12:04:21 | 89047  | ka-outgoing-1 | receive-timer 118 second(s) left
12:04:21 | 89047  | ka-outgoing-1 | send-timer 38 second(s) left
12:04:22 | 89047  | ka-outgoing-1 | receive-timer 117 second(s) left
12:04:22 | 89047  | ka-outgoing-1 | receive-timer 117 second(s) left
12:04:22 | 89047  | ka-outgoing-1 | send-timer 37 second(s) left
12:04:22 | 89047  | ka-outgoing-1 | send-timer 37 second(s) left
12:04:23 | 89047  | ka-outgoing-1 | receive-timer 116 second(s) left
12:04:23 | 89047  | ka-outgoing-1 | receive-timer 116 second(s) left
12:04:23 | 89047  | ka-outgoing-1 | send-timer 36 second(s) left
12:04:23 | 89047  | ka-outgoing-1 | send-timer 36 second(s) left
12:04:24 | 89047  | ka-outgoing-1 | receive-timer 115 second(s) left
12:04:24 | 89047  | ka-outgoing-1 | receive-timer 115 second(s) left
12:04:24 | 89047  | ka-outgoing-1 | send-timer 35 second(s) left
12:04:24 | 89047  | ka-outgoing-1 | send-timer 35 second(s) left
12:04:24 | 89047  | process       | command from process api-internal-cli-777d3382 : neighbor 10.90.0.2 announce route 12.0.0.0/24 next-hop self
12:04:24 | 89047  | reactor       | async | api-internal-cli-777d3382 | neighbor 10.90.0.2 announce route 12.0.0.0/24 next-hop self
12:04:24 | 89047  | configuration | . route            | '12.0.0.0/24' 'next-hop' 'self'
12:04:24 | 89047  | api           | route added to neighbor 10.90.0.2 local-ip 10.90.0.1 local-as 65000 peer-as 65001 router-id 10.90.0.2 family-allowed in-open : 12.0.0.0/24 next-hop self
12:04:24 | 89047  | parser        | parsing UPDATE (  28) 0000 0014 4001 0100 4002 0602 0100 00FD E840 0304 0A5A 0001 180C 0000
12:04:24 | 89047  | routes        | withdrawn NLRI none
12:04:24 | 89047  | parser        | attribute origin             flag 0x40 type 0x01 len 0x01 payload 00
12:04:24 | 89047  | parser        | attribute as-path            flag 0x40 type 0x02 len 0x06 payload 0201 0000 FDE8
12:04:24 | 89047  | parser        | attribute next-hop           flag 0x40 type 0x03 len 0x04 payload 0A5A 0001
12:04:24 | 89047  | parser        | NLRI      ipv4 unicast       without path-information     payload 180C 0000
12:04:24 | 89047  | routes        | announced NLRI 12.0.0.0/24 next-hop 10.90.0.1
12:04:24 | 89047  | parser        | decoded UPDATE (   0) json { "exabgp": "4.0.1", "time": 1718359464.0426085, "host" : "fedora-laptop", "pid" : 89049, "ppid" : 89047, "counter": 8, "type": "update", "neighbor": { "address": { "local": "10.90.0.1", "peer": "10.90.0.2" }, "asn": { "local": 65000, "peer": 65001 } , "direction": "in", "message": { "update": { "attribute": { "origin": "igp", "as-path": [ 65000 ], "confederation-path": [] }, "announce": { "ipv4 unicast": { "10.90.0.1": [ { "nlri": "12.0.0.0/24" } ] } } } } } }
12:04:24 | 89047  | outgoing-1    | sending TCP payload (  47) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002F 0200 0000 1440 0101 0040 0206 0201 0000 FDE8 4003 040A 5A00 0118 0C00 00
12:04:24 | 89047  | outgoing-1    | >> 1 UPDATE(s)
12:04:24 | 89047  | process       | process control_api ended, restarting it
12:04:24 | 89047  | process       | terminating process control_api
12:04:24 | 89047  | process       | forked process control_api
12:04:25 | 89047  | ka-outgoing-1 | receive-timer 114 second(s) left
12:04:25 | 89047  | ka-outgoing-1 | receive-timer 114 second(s) left
12:04:25 | 89047  | ka-outgoing-1 | send-timer 34 second(s) left
12:04:25 | 89047  | ka-outgoing-1 | send-timer 34 second(s) left
12:04:26 | 89047  | ka-outgoing-1 | receive-timer 113 second(s) left
12:04:26 | 89047  | ka-outgoing-1 | receive-timer 113 second(s) left
12:04:26 | 89047  | ka-outgoing-1 | send-timer 33 second(s) left
12:04:26 | 89047  | ka-outgoing-1 | send-timer 33 second(s) left
12:04:27 | 89047  | ka-outgoing-1 | receive-timer 112 second(s) left
12:04:27 | 89047  | ka-outgoing-1 | send-timer 32 second(s) left
12:04:27 | 89047  | ka-outgoing-1 | receive-timer 112 second(s) left
12:04:27 | 89047  | ka-outgoing-1 | send-timer 32 second(s) left
12:04:28 | 89047  | ka-outgoing-1 | receive-timer 111 second(s) left
12:04:28 | 89047  | ka-outgoing-1 | send-timer 31 second(s) left
12:04:28 | 89047  | ka-outgoing-1 | receive-timer 111 second(s) left
12:04:28 | 89047  | ka-outgoing-1 | send-timer 31 second(s) left
12:04:29 | 89047  | ka-outgoing-1 | receive-timer 110 second(s) left
12:04:29 | 89047  | ka-outgoing-1 | send-timer 30 second(s) left
12:04:29 | 89047  | ka-outgoing-1 | receive-timer 110 second(s) left
12:04:29 | 89047  | ka-outgoing-1 | send-timer 30 second(s) left
12:04:30 | 89047  | ka-outgoing-1 | receive-timer 109 second(s) left
12:04:30 | 89047  | ka-outgoing-1 | send-timer 29 second(s) left
12:04:30 | 89047  | ka-outgoing-1 | receive-timer 109 second(s) left
12:04:30 | 89047  | ka-outgoing-1 | send-timer 29 second(s) left
12:04:31 | 89047  | ka-outgoing-1 | receive-timer 108 second(s) left
12:04:31 | 89047  | ka-outgoing-1 | receive-timer 108 second(s) left
12:04:31 | 89047  | ka-outgoing-1 | send-timer 28 second(s) left
12:04:31 | 89047  | ka-outgoing-1 | send-timer 28 second(s) left
12:04:32 | 89047  | ka-outgoing-1 | receive-timer 107 second(s) left
12:04:32 | 89047  | ka-outgoing-1 | receive-timer 107 second(s) left
12:04:32 | 89047  | ka-outgoing-1 | send-timer 27 second(s) left
12:04:32 | 89047  | ka-outgoing-1 | send-timer 27 second(s) left
12:04:33 | 89047  | ka-outgoing-1 | receive-timer 106 second(s) left
12:04:33 | 89047  | ka-outgoing-1 | receive-timer 106 second(s) left
12:04:33 | 89047  | ka-outgoing-1 | send-timer 26 second(s) left
12:04:33 | 89047  | ka-outgoing-1 | send-timer 26 second(s) left
12:04:34 | 89047  | ka-outgoing-1 | receive-timer 105 second(s) left
12:04:34 | 89047  | ka-outgoing-1 | send-timer 25 second(s) left
12:04:34 | 89047  | ka-outgoing-1 | receive-timer 105 second(s) left
12:04:34 | 89047  | ka-outgoing-1 | send-timer 25 second(s) left
12:04:35 | 89047  | ka-outgoing-1 | receive-timer 104 second(s) left
12:04:35 | 89047  | ka-outgoing-1 | receive-timer 104 second(s) left
12:04:35 | 89047  | ka-outgoing-1 | send-timer 24 second(s) left
12:04:35 | 89047  | ka-outgoing-1 | send-timer 24 second(s) left
12:04:36 | 89047  | ka-outgoing-1 | receive-timer 103 second(s) left
12:04:36 | 89047  | ka-outgoing-1 | receive-timer 103 second(s) left
12:04:36 | 89047  | ka-outgoing-1 | send-timer 23 second(s) left
12:04:36 | 89047  | ka-outgoing-1 | send-timer 23 second(s) left
12:04:36 | 89047  | process       | command from process api-internal-cli-777d3382 : neighbor 10.90.0.3 announce route 13.0.0.0/24 next-hop self
12:04:36 | 89047  | reactor       | async | api-internal-cli-777d3382 | neighbor 10.90.0.3 announce route 13.0.0.0/24 next-hop self
12:04:36 | 89047  | api           | no neighbor matching the command : announce route 13.0.0.0/24 next-hop self
12:04:37 | 89047  | ka-outgoing-1 | receive-timer 102 second(s) left
12:04:37 | 89047  | ka-outgoing-1 | send-timer 22 second(s) left
12:04:37 | 89047  | ka-outgoing-1 | receive-timer 102 second(s) left
12:04:37 | 89047  | ka-outgoing-1 | send-timer 22 second(s) left
12:04:38 | 89047  | ka-outgoing-1 | receive-timer 101 second(s) left
12:04:38 | 89047  | ka-outgoing-1 | receive-timer 101 second(s) left
12:04:38 | 89047  | ka-outgoing-1 | send-timer 21 second(s) left
12:04:38 | 89047  | ka-outgoing-1 | send-timer 21 second(s) left
12:04:39 | 89047  | ka-outgoing-1 | receive-timer 100 second(s) left
12:04:39 | 89047  | ka-outgoing-1 | receive-timer 100 second(s) left
12:04:39 | 89047  | ka-outgoing-1 | send-timer 20 second(s) left
12:04:39 | 89047  | ka-outgoing-1 | send-timer 20 second(s) left
12:04:40 | 89047  | ka-outgoing-1 | receive-timer 99 second(s) left
12:04:40 | 89047  | ka-outgoing-1 | receive-timer 99 second(s) left
12:04:40 | 89047  | ka-outgoing-1 | send-timer 19 second(s) left
12:04:40 | 89047  | ka-outgoing-1 | send-timer 19 second(s) left
12:04:41 | 89047  | ka-outgoing-1 | receive-timer 98 second(s) left
12:04:41 | 89047  | ka-outgoing-1 | send-timer 18 second(s) left
12:04:41 | 89047  | ka-outgoing-1 | receive-timer 98 second(s) left
12:04:41 | 89047  | ka-outgoing-1 | send-timer 18 second(s) left
12:04:42 | 89047  | ka-outgoing-1 | receive-timer 97 second(s) left
12:04:42 | 89047  | ka-outgoing-1 | send-timer 17 second(s) left
12:04:42 | 89047  | ka-outgoing-1 | receive-timer 97 second(s) left
12:04:42 | 89047  | ka-outgoing-1 | send-timer 17 second(s) left
^C12:04:42 | 89047  | reactor       | ^C received
12:04:42 | 89047  | reactor       | ^C received
12:04:42 | 89047  | reactor       | performing shutdown
12:04:42 | 89047  | reactor       | performing shutdown
12:04:42 | 89047  | outgoing-1    | stop, message [shutting down]
12:04:42 | 89047  | outgoing-1    | stop, message [shutting down]
12:04:42 | 89047  | outgoing-1    | outgoing-1 10.90.0.1-10.90.0.2, closing connection
12:04:42 | 89047  | outgoing-1    | outgoing-1 10.90.0.1-10.90.0.3, closing connection
12:04:42 | 89047  | process       | terminating process api-internal-cli-777d4bf6
12:04:42 | 89047  | process       | terminating process api-internal-cli-777d3382
12:04:42 | 89047  | process       | terminating process control_api
12:04:42 | 89047  | process       | terminating process control_api
thomas-mangin commented 3 weeks ago

When exabgp is provided with multiple configuration files, it will fork an instance per file and run it. Each process is fully independent and can not access the resources of the other. You will also have a problem if you use the CLI as they will both be attempting to access the same underlying pipe.

While not documented, this limitation becomes logical once you understand what is in the background. However, it's understandable that this concept may not be immediately clear.

To do what you want, you can:

It's important to note that when using a single configuration file, exabgp will only utilise a single core, as opposed to the 2 cores used when forking.

To preempt a question I was already asked, I am not looking at adding configuration templating to exabgp. There are good external tools for it, and the parsing code is complex enough as it is—sorry.

For your script, what you want is more likely to be something like:

import socket
import sys

s = None
backlog = []

def main():
    # s and backlog are reassigned below, so they must be declared global;
    # otherwise Python treats them as locals and raises UnboundLocalError
    global s, backlog

    while True:
        line = sys.stdin.readline().strip()

        if not line:
            sys.exit(0)

        backlog.append(line + '\n')
        # if it grows too much perhaps better bomb out?

        if s is None:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect(("", 1790))

        try:
            while backlog:
                sending = backlog[0]
                s.sendall((sending).encode())
                backlog = backlog[1:]
        except Exception:
            s = None
            sys.exit(5)

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        pass
milhauzindahauz commented 3 weeks ago

> When exabgp is provided with multiple configuration files, it will fork an instance per file and run it. Each process is fully independent and can not access the resources of the other. You will also have a problem if you use the CLI as they will both be attempting to access the same underlying pipe.
>
> While not documented, this limitation becomes logical once you understand what is in the background. However, it's understandable that this concept may not be immediately clear.

I noticed this behavior, but I assumed that the CLI was aware of it and could handle it. Apologies for the misunderstanding.

My reason for splitting the peer configs into individual files is based on the following:

> Reloading large configuration using signal is not recommended as the configuration parsing code is currently blocking (as well as some part of the RIB required for the reload - for simplicity).

"Large configuration file" is a pretty vague term. I will have 100-150 peers in the production environment, and I don't know whether the config generated for that qualifies as a large one. So I prematurely optimized it.

> To preempt a question I was already asked, I am not looking at adding configuration templating to exabgp. There are good external tools for it, and the parsing code is complex enough as it is; sorry.

Nah buddy, keep your focus as it is. Generating the config with Jinja is quite simple and straightforward.
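For readers following along, here is a minimal sketch of what generating a single configuration file for several peers could look like. It uses the standard library's `string.Template` rather than Jinja so that it runs without extra packages. The function name `render_config` and the trimmed `api` section are assumptions made for illustration; the process, template, and neighbor lines are taken from the per-peer files posted above.

```python
from string import Template

# Shared header, copied from the per-peer files earlier in this thread.
# The api section is trimmed here for brevity (no receive/send blocks).
HEADER = """\
process control_api {
    run ~/dev/agent_poc/venv312/bin/python ~/dev/agent_poc/speaker/notifier.py;
    encoder json;
}
template {
    neighbor local {
        local-address 10.90.0.1;
        local-as 65000;
        api {
            processes [control_api];
            neighbor-changes;
        }
    }
}
"""

# One neighbor block per peer; router-id mirrors the peer address,
# as in the two files posted in the issue.
NEIGHBOR = Template("""\
neighbor $address {
    inherit local;

    peer-as $peer_as;
    router-id $address;
}
""")


def render_config(peers):
    """Return one configuration text holding every (address, peer_as) pair."""
    blocks = [NEIGHBOR.substitute(address=a, peer_as=asn) for a, asn in peers]
    return HEADER + "".join(blocks)


if __name__ == "__main__":
    print(render_config([("10.90.0.2", 65001), ("10.90.0.3", 65002)]))
```

Writing the result to one file and starting exabgp with only that file keeps every neighbor in the same process, so a single CLI pipe can reach all of them.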

...
        except Exception:
            if s is not None:
                s.close()
                s = None
            exit(5)

I assume closing IO is a good habit to have.

But can you clarify the reasoning for using the backlog buffer? If there is a line on the input, it is always a single line, which is then sent through the socket. What am I missing?

thomas-mangin commented 3 weeks ago

No apology was necessary; the use case for the fork feature was very specific to one user and was created before the CLI. I never considered how the two features could/would conflict. It is possible to fix it; it is "just" not implemented.

The signal code in ExaBGP is the source of the 'last' long-standing open issue, which has a significant impact on the system. It was designed before the API, and given the choice, I would remove the feature. Instead, I would add an 'add peer'/'remove peer' API call, which has not been done yet.

Having 150 peers for a single core can be fine if you are not handling a full routing table. However, the CPU usage is proportional to the number of peers multiplied by the load per peer, which can lead to one peer affecting another. While such large setups are not uncommon, it was not the scale I had in mind when designing ExaBGP; I would not have used Python for such a large setup.

Having multiple ExaBGP instances is fine and good, as it creates "failure zones". Not that I expect failures, but as an engineer, I plan for the worst and hope for the best, not the other way around :-)

The code I propose is only a proof of concept for you to consider. The code you posted currently establishes a connection per message, which is heavy on the host and does not ensure the delivery of all messages when a failure is encountered.

My example tries to resend messages, but it is not complete: it does not decide when to give up (for example, when the backlog grows too large), and it does not report the issue to any monitoring platform for action either.

And yes, you are right, I forgot the close (which should indeed never fail).
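
For readers following along, the persistent-connection-plus-backlog idea described above could look roughly like the sketch below. This is not the proof of concept from this thread. The class name `BacklogSender`, the `max_backlog` cap, and the policy of dropping the oldest message once the cap is reached are all assumptions made for illustration; a real service would also report dropped messages to monitoring.

```python
import socket
from collections import deque


class BacklogSender:
    """Hypothetical helper: one persistent connection plus a bounded backlog."""

    def __init__(self, address, max_backlog=1000, connect=socket.create_connection):
        self.address = address          # (host, port) of the socket server
        self.max_backlog = max_backlog  # assumed cap; tune for your environment
        self.connect = connect          # injectable so the class can be tested
        self.backlog = deque()
        self.sock = None
        self.dropped = 0

    def send(self, line):
        # Queue first, so a failed send never loses the message silently.
        if len(self.backlog) >= self.max_backlog:
            self.backlog.popleft()  # give up on the oldest message
            self.dropped += 1       # a real service would alert here
        self.backlog.append(line)
        self.flush()

    def flush(self):
        try:
            if self.sock is None:
                self.sock = self.connect(self.address)
            while self.backlog:
                self.sock.sendall(self.backlog[0].encode() + b"\n")
                self.backlog.popleft()  # remove only after a successful send
        except OSError:
            # Keep the backlog and reconnect on the next call.
            self.close()

    def close(self):
        if self.sock is not None:
            try:
                self.sock.close()
            finally:
                self.sock = None
```

In a notifier script, each line read from stdin would be passed to `send()`. The connection is reused between messages, and anything that could not be delivered stays queued until the server is reachable again.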

milhauzindahauz commented 3 weeks ago

Honestly, it's hard to go through the knowledge base (wiki, issues, etc.) to find the necessary piece of information. On the other hand, I must admit you are swift with your responses in general, which is amazing, and I would like to thank you for that.

Having 150 peers for a single core can be fine if you are not handling a full routing table. However, the CPU usage is proportional to the number of peers multiplied by the load per peer, which can lead to one peer affecting another. While such large setups are not uncommon, it was not the scale I had in mind when designing ExaBGP; I would not have used Python for such a large setup.

I want to use exabgp as a BGP speaker that announces and withdraws routes for RTBH purposes. I have no need for RIB knowledge, so I added adj-rib-in false; and adj-rib-out false; to each peer config.

The signal code in ExaBGP is the source of the 'last' long-standing open issue, which has a significant impact on the system. It was designed before the API, and given the choice, I would remove the feature. Instead, I would add an 'add peer'/'remove peer' API call, which has not been done yet.

Based on that... I am implementing an observer with a feedback loop (a socket server) that handles the messages from exabgp. It also listens for peer additions and deletions, so I want to restart exabgp when peers are changed in the config. If you can point me in a better direction based on your knowledge, feel free to do so.

The code I propose is only a proof of concept for you to consider. The code you posted currently establishes a connection per message, which is heavy on the host and does not ensure the delivery of all messages when a failure is encountered.

My example tries to resend messages, but it is not complete: it does not decide when to give up (for example, when the backlog grows too large), and it does not report the issue to any monitoring platform for action either.

I still don't see how the backlog can grow to a large number. I will have the socket server always running before the exabgp as it is part of the observer service.

thomas-mangin commented 3 weeks ago

Like many open-source projects, this is a one-man-band effort, and documentation is lacking.

For RTBH, disabling the RIB is good in your case, and it should be perfectly fine to have 100-200 peers. Please keep in mind the code is async.

The simplest way to deal with peer addition and removal is to configure graceful-restart on both sides of the BGP connection, add the peer to the configuration file, and restart ExaBGP.

You could have quite a long timeout (minutes to hours) for the routes; it is up to you. You can always remove them with a withdraw command.

If you have some code to re-announce the same routes, the Adj-RIB will be re-validated and the RIB/FIB will remain unchanged. The router load following the restart will be unnoticeable.
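
As a rough illustration of the re-announcement step, a helper process could rebuild the announce commands and write them to its stdout after a restart. This is a sketch under assumptions: the function names `announce_commands` and `reannounce` and the shape of the route list are hypothetical, the command text mirrors the exabgpcli commands quoted at the top of this issue, and the script is assumed to be launched by ExaBGP through a `process` block so that its stdout is read as API commands.

```python
import sys


def announce_commands(routes):
    """Build one announce command per (neighbor, prefix) pair."""
    return [
        f"neighbor {neighbor} announce route {prefix} next-hop self"
        for neighbor, prefix in routes
    ]


def reannounce(routes, out=sys.stdout):
    """Write the commands one per line and flush so they are sent promptly."""
    for command in announce_commands(routes):
        out.write(command + "\n")
    out.flush()
```

Calling `reannounce([("10.90.0.2", "12.0.0.0/24"), ("10.90.0.3", "13.0.0.0/24")])` at startup would replay the two routes from this issue; where the list of active routes is stored between restarts is left to the observer service.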

I will have the socket server always running before the exabgp as it is part of the observer service.

This very optimistic statement may not survive many years in production.

thomas-mangin commented 3 weeks ago

Hi @milhauzindahauz - Any more questions, or can we close this issue?

milhauzindahauz commented 3 weeks ago

Hi @thomas-mangin. I think the necessary part was cleared up. Thank you very much for the feedback and answers to my questions.