Exa-Networks / exabgp

The BGP swiss army knife of networking

Inability to advertise a route through API in 4.2.21 #1111

Closed gregory-mac closed 2 years ago

gregory-mac commented 2 years ago

Describe the bug

When trying to advertise a route via STDOUT, ExaBGP logs the error "no neighbor matching the command":

17:11:54 | 17     | process       | command from process api : announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } } 
17:11:54 | 17     | api           | no neighbor matching the command : announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }

To Reproduce

Here are my configs:

exabgp.conf

process api {
    run /usr/local/bin/python /exabgp/process.py;
    encoder json;
}

neighbor 172.20.2.1 {
    description "cvt-netlab-acc-rtr2";
    router-id 172.20.2.100;
    local-as 49505;
    peer-as 49505;
    family {
        ipv4 unicast;
        ipv4 flow;
    }
    local-address 172.20.2.100;

    static {
        route 101.0.0.0/8 next-hop 172.20.1.100;
    }
}

process.py

#! /usr/bin/python3
from time import sleep

sleep(1)

print(f"announce flow route {{ match {{ destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; }} then {{ discard; }} }}")

while True:
    pass

Expected behavior

A route is successfully advertised.

Environment (please complete the following information):

Additional context

I saw in a similar issue, #1108, that the problem should have been fixed. However, for me it is still reproducible on both CentOS 7.9 and Ubuntu 20.04.

Here's the debug log:

17:11:53 | 17     | welcome       | Thank you for using ExaBGP
17:11:53 | 17     | version       | 4.2.21  
17:11:53 | 17     | interpreter   | 3.9.13 (main, Jul 12 2022, 12:26:02)  [GCC 8.3.0]
17:11:53 | 17     | os            | Linux exabgp 3.10.0-1160.71.1.el7.x86_64 #1 SMP Tue Jun 28 15:37:28 UTC 2022 x86_64
17:11:53 | 17     | installation  | /usr/local
17:11:53 | 17     | configuration | performing reload of exabgp 4.2.21
17:11:53 | 17     | configuration | > process          | 'api'
17:11:53 | 17     | configuration | . run              | '/usr/local/bin/python' '/exabgp/process.py'
17:11:53 | 17     | configuration | . encoder          | 'json'
17:11:53 | 17     | configuration | < process          | 
17:11:53 | 17     | configuration | > neighbor         | '172.20.2.1'
17:11:53 | 17     | configuration | . description      | 'cvt-netlab-acc-rtr2'
17:11:53 | 17     | configuration | . router-id        | '172.20.2.100'
17:11:53 | 17     | configuration | . local-as         | '49505'
17:11:53 | 17     | configuration | . peer-as          | '49505'
17:11:53 | 17     | configuration | > family           | 
17:11:53 | 17     | configuration | . ipv4             | 'unicast'
17:11:53 | 17     | configuration | . ipv4             | 'flow'
17:11:53 | 17     | configuration | < family           | 
17:11:53 | 17     | configuration | . local-address    | '172.20.2.100'
17:11:53 | 17     | configuration | > static           | 
17:11:53 | 17     | configuration | . route            | '101.0.0.0/8' 'next-hop' '172.20.1.100'
17:11:53 | 17     | configuration | < static           | 
17:11:53 | 17     | configuration | < neighbor         | 
17:11:53 | 17     | reactor       | new peer: neighbor 172.20.2.1 local-ip 172.20.2.100 local-as 49505 peer-as 49505 router-id 172.20.2.100 family-allowed in-open
17:11:53 | 17     | reactor       | loaded new configuration successfully
17:11:53 | 17     | process       | forked process api
17:11:53 | 17     | reactor       | initialising connection to peer-1
17:11:53 | 17     | outgoing-1    | attempting connection to 172.20.2.1:179
17:11:53 | 17     | outgoing-1    | sending TCP payload (  57) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0039 0104 C161 00B4 AC14 0264 1C02 0601 0400 0100 0102 0601 0400 0100 8502 0641 0400 00C1 6102 0206 00
17:11:53 | 17     | outgoing-1    | >> OPEN version=4 asn=49505 hold_time=180 router_id=172.20.2.100 capabilities=[Multiprotocol(ipv4 unicast,ipv4 flow), Extended Message(65535), ASN4(49505)]
17:11:53 | 17     | ka-outgoing-1 | receive-timer 60 second(s) left
17:11:53 | 17     | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0047 01
17:11:53 | 17     | outgoing-1    | received complete TCP payload (  52) 04C1 6100 5AAC 19DC E32A 0206 0104 0001 0001 0206 0104 0001 0085 0202 8000 0202 0200 0204 4002 4078 0206 4104 0000 C161 0202 4700
17:11:53 | 17     | outgoing-1    | << message of type OPEN
17:11:53 | 17     | outgoing-1    | << OPEN version=4 asn=49505 hold_time=90 router_id=172.25.220.227 capabilities=[Multiprotocol(ipv4 unicast,ipv4 flow), Route Refresh, Graceful Restart Flags 0x4 Time 120 , ASN4(49505), Unassigned 71, Route Refresh]
17:11:53 | 17     | outgoing-1    | sending TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
17:11:53 | 17     | outgoing-1    | >> KEEPALIVE (OPENCONFIRM)
17:11:53 | 17     | ka-outgoing-1 | receive-timer 90 second(s) left
17:11:53 | 17     | ka-outgoing-1 | receive-timer 89 second(s) left
17:11:54 | 17     | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
17:11:54 | 17     | outgoing-1    | << message of type KEEPALIVE
17:11:54 | 17     | reactor       | connected to peer-1 with outgoing-1 172.20.2.100-172.20.2.1
17:11:54 | 17     | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 02
17:11:54 | 17     | outgoing-1    | received complete TCP payload (   4) 0000 0000
17:11:54 | 17     | outgoing-1    | << message of type UPDATE
17:11:54 | 17     | parser        | parsing UPDATE (   4) 0000 0000
17:11:54 | 17     | peer-1        | << UPDATE #1
17:11:54 | 17     | peer-1        |    UPDATE #1 nlri  (   4) eor 1/1 (ipv4 unicast)
17:11:54 | 17     | outgoing-1    | sending TCP payload (  46) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002E 0200 0000 1540 0101 0040 0200 4003 04AC 1401 6440 0504 0000 0064 0865
17:11:54 | 17     | outgoing-1    | >> 1 UPDATE(s)
17:11:54 | 17     | outgoing-1    | received complete TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 001E 02
17:11:54 | 17     | outgoing-1    | received complete TCP payload (  11) 0000 0007 900F 0003 0001 85
17:11:54 | 17     | outgoing-1    | << message of type UPDATE
17:11:54 | 17     | parser        | parsing UPDATE (  11) 0000 0007 900F 0003 0001 85
17:11:54 | 17     | peer-1        | << UPDATE #2
17:11:54 | 17     | peer-1        |    UPDATE #2 nlri  (  11) eor 1/133 (ipv4 flow)
17:11:54 | 17     | outgoing-1    | sending TCP payload (  23) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 0200 0000 00
17:11:54 | 17     | outgoing-1    | >> EOR ipv4 unicast
17:11:54 | 17     | outgoing-1    | sending TCP payload (  30) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 001E 0200 0000 0790 0F00 0300 0185
17:11:54 | 17     | outgoing-1    | >> EOR ipv4 flow
17:11:54 | 17     | peer-1        | >> EOR(s)
17:11:54 | 17     | process       | command from process api : announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } } 
17:11:54 | 17     | reactor       | async | api | announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }
17:11:54 | 17     | api           | no neighbor matching the command : announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }
17:11:55 | 17     | ka-outgoing-1 | receive-timer 89 second(s) left
17:11:55 | 17     | ka-outgoing-1 | send-timer 29 second(s) left
17:11:56 | 17     | ka-outgoing-1 | receive-timer 88 second(s) left
17:11:56 | 17     | ka-outgoing-1 | send-timer 28 second(s) left
17:11:57 | 17     | ka-outgoing-1 | receive-timer 87 second(s) left
17:11:57 | 17     | ka-outgoing-1 | send-timer 27 second(s) left
17:11:57 | 17     | ka-outgoing-1 | receive-timer 86 second(s) left

thomas-mangin commented 2 years ago

I believe this bug was introduced in 4.2.21 and fixed in 4.2.22.

thomas-mangin commented 2 years ago

Could you please confirm?

gregory-mac commented 2 years ago

I have tried the same configuration with the master branch and I am still getting the error:

root@exabgp-git:/exabgp# ./sbin/exabgp -d exabgp.conf 

14:27:34 57     welcome       Thank you for using ExaBGP
14:27:34 57     version         master-7b991093a59e786b02e4a3e12c8f223215fc2a69
14:27:34 57     location        /exabgp
14:27:34 57     python          3.9.13 (main, Jul 12 2022, 12:26:02)  [GCC 8.3.0]
14:27:34 57     platform        Linux exabgp-git 3.10.0-1160.71.1.el7.x86_64 #1 SMP Tue Jun 28 15:37:28 UTC 2022 x86_64
14:27:34 57     advice        environment file missing
14:27:34 57     advice        generate it using "exabgp env > /exabgp/etc/exabgp/exabgp.env"
14:27:34 57     cli           could not find the named pipes (exabgp.in and exabgp.out) required for the cli
14:27:34 57     cli           we scanned the following folders (the number is your PID):
14:27:34 57     cli control    - /run/exabgp/
14:27:34 57     cli control    - /run/0/
14:27:34 57     cli control    - /run/
14:27:34 57     cli control    - /var/run/exabgp/
14:27:34 57     cli control    - /var/run/0/
14:27:34 57     cli control    - /var/run/
14:27:34 57     cli control    - /exabgp/run/exabgp/
14:27:34 57     cli control    - /exabgp/run/0/
14:27:34 57     cli control    - /exabgp/run/
14:27:34 57     cli control    - /exabgp/var/run/exabgp/
14:27:34 57     cli control    - /exabgp/var/run/0/
14:27:34 57     cli control    - /exabgp/var/run/
14:27:34 57     cli control   please make them in one of the folder with the following commands:
14:27:34 57     cli control   > mkfifo /exabgp/run/exabgp.{in,out}
14:27:34 57     cli control   > chmod 600 /exabgp/run/exabgp.{in,out}
14:27:34 57     configuration performing reload of exabgp master-7b991093a59e786b02e4a3e12c8f223215fc2a69
14:27:34 57     configuration   > process          | 'api'
14:27:34 57     configuration   . run              | '/bin/bash' '/exabgp/process.sh'
14:27:34 57     configuration   . encoder          | 'json'
14:27:34 57     configuration   < process          | 
14:27:34 57     configuration   > neighbor         | '172.20.2.1'
14:27:34 57     configuration   . description      | 'cvt-netlab-acc-rtr2'
14:27:34 57     configuration   . router-id        | '172.20.2.100'
14:27:34 57     configuration   . local-as         | '49505'
14:27:34 57     configuration   . peer-as          | '49505'
14:27:34 57     configuration   > family           | 
14:27:34 57     configuration   . ipv4             | 'unicast'
14:27:34 57     configuration   . ipv4             | 'flow'
14:27:34 57     configuration   < family           | 
14:27:34 57     configuration   . local-address    | '172.20.2.100'
14:27:34 57     configuration   > static           | 
14:27:34 57     configuration   . route            | '101.0.0.0/8' 'next-hop' '172.20.1.100'
14:27:34 57     configuration   < static           | 
14:27:34 57     rib             insert 101.0.0.0/8 next-hop 172.20.1.100
14:27:34 57     configuration   < neighbor         | 
14:27:34 57     reactor         new peer: neighbor 172.20.2.1 local-ip 172.20.2.100 local-as 49505 peer-as 49505 router-id 172.20.2.100 family-allowed in-open
14:27:34 57     reactor       loaded new configuration successfully
14:27:34 57     process         forked process api
14:27:34 57     reactor         initialising connection to peer-1
14:27:34 57     outgoing-1      attempting connection to 172.20.2.1:179
14:27:34 57     outgoing-1      sending TCP payload (  57) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0039 0104 C161 00B4 AC14 0264 1C02 0601 0400 0100 0102 0601 0400 0100 8502 0641 0400 00C1 6102 0206 00
14:27:34 57     outgoing-1      >> OPEN version=4 asn=49505 hold_time=180 router_id=172.20.2.100 capabilities=[Multiprotocol(ipv4 unicast,ipv4 flow), Extended Message(65535), ASN4(49505)]
14:27:34 57     ka-outgoing-1   receive-timer 60 second(s) left
14:27:34 57     outgoing-1      received TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0047 01
14:27:34 57     outgoing-1      received TCP payload (  52) 04C1 6100 5AAC 19DC E32A 0206 0104 0001 0001 0206 0104 0001 0085 0202 8000 0202 0200 0204 4002 4078 0206 4104 0000 C161 0202 4700
14:27:34 57     outgoing-1      << message of type OPEN
14:27:34 57     outgoing-1      << OPEN version=4 asn=49505 hold_time=90 router_id=172.25.220.227 capabilities=[Multiprotocol(ipv4 unicast,ipv4 flow), Route Refresh, Graceful Restart Flags 0x4 Time 120 , ASN4(49505), Unassigned 71, Route Refresh]
14:27:34 57     outgoing-1      sending TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
14:27:34 57     outgoing-1      >> KEEPALIVE (OPENCONFIRM)
14:27:34 57     ka-outgoing-1   receive-timer 90 second(s) left
14:27:34 57     outgoing-1      received TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0013 04
14:27:34 57     outgoing-1      << message of type KEEPALIVE
14:27:34 57     reactor       connected to peer-1 with outgoing-1 172.20.2.100-172.20.2.1
14:27:34 57     rib             insert 101.0.0.0/8 next-hop 172.20.1.100
14:27:34 57     outgoing-1      received TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 02
14:27:34 57     outgoing-1      received TCP payload (   4) 0000 0000
14:27:34 57     outgoing-1      << message of type UPDATE
14:27:34 57     peer-1          << UPDATE #1
14:27:34 57     peer-1             UPDATE #1 nlri  (   4) eor 1/1 (ipv4 unicast)
14:27:34 57     outgoing-1      sending TCP payload (  46) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002E 0200 0000 1540 0101 0040 0200 4003 04AC 1401 6440 0504 0000 0064 0865
14:27:34 57     outgoing-1      >> 1 UPDATE(s)
14:27:34 57     outgoing-1      received TCP payload (  19) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 001E 02
14:27:34 57     outgoing-1      received TCP payload (  11) 0000 0007 900F 0003 0001 85
14:27:34 57     outgoing-1      << message of type UPDATE
14:27:34 57     peer-1          << UPDATE #2
14:27:34 57     peer-1             UPDATE #2 nlri  (  11) eor 1/133 (ipv4 flow)
14:27:34 57     outgoing-1      sending TCP payload (  23) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0017 0200 0000 00
14:27:34 57     outgoing-1      >> EOR ipv4 unicast
14:27:34 57     outgoing-1      sending TCP payload (  30) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 001E 0200 0000 0790 0F00 0300 0185
14:27:34 57     outgoing-1      >> EOR ipv4 flow
14:27:34 57     peer-1          >> all EOR(s) sent
14:27:35 57     ka-outgoing-1   receive-timer 89 second(s) left
14:27:35 57     ka-outgoing-1   send-timer 29 second(s) left
14:27:35 57     process         command from process api : announce flow route { match { source 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } } 
14:27:35 57     reactor         async | api | announce flow route { match { source 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }
14:27:35 57     processes       no neighbor matching the command : announce flow route { match { source 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }
14:27:35 57     process         responding to api : error
14:27:36 57     ka-outgoing-1   receive-timer 88 second(s) left
14:27:36 57     ka-outgoing-1   send-timer 28 second(s) left
14:27:37 57     ka-outgoing-1   receive-timer 87 second(s) left
14:27:37 57     ka-outgoing-1   send-timer 27 second(s) left
14:27:38 57     ka-outgoing-1   receive-timer 86 second(s) left

thomas-mangin commented 2 years ago

Sorry, I am back from holidays and failed to see the update - will look into it.

thomas-mangin commented 2 years ago

Going to look into the issue now. Until then, just a passing comment:

while True:
    pass

will eat a full core of your CPU doing nothing but looping. Calling time.sleep inside the loop instead may save on your energy bill 😉

thomas-mangin commented 2 years ago

To reproduce:

#! /usr/bin/python3
import sys
from time import sleep

sleep(1)

print(f"announce flow route {{ match {{ destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; }} then {{ discard; }} }}")
sys.stdout.flush()

while True:
    pass

thomas-mangin commented 2 years ago

Valid configuration:

process api {
    run /opt/homebrew/bin/python3 /Users/thomas/Coding/exabgp/master/exa.run;
    encoder json;
}

neighbor 127.0.0.1 {
    description "cvt-netlab-acc-rtr2";
    router-id 127.0.0.2;
    local-as 49505;
    peer-as 49505;
    family {
        ipv4 unicast;
        ipv4 flow;
    }
    local-address 127.0.0.1;

    static {
        route 101.0.0.0/8 next-hop 127.0.0.2;
    }
    api {
        processes [ api ];
    }
}

thomas-mangin commented 2 years ago

Using the name api for the process is confusing ... but it works.

thomas-mangin commented 2 years ago

And finally, fix the API script so that it runs:

#!/usr/bin/env python3

import sys
from time import sleep

sleep(1)

command = "announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }"

sys.stdout.write(command + '\n')
sys.stdout.flush()

while True:
    sleep(1)

thomas-mangin commented 2 years ago

For clarity, the problem occurred because the process section was not connected to the peer via the api keyword.

The script was also not flushing stdout, which means that for the test the data never made it to exabgp.

So no bugs in the code, works as intended.
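
The two script-side points above (flush stdout after every command, and sleep instead of spinning) can be wrapped in a small helper. This is only a sketch: `send_command` and `idle_forever` are hypothetical names, not part of ExaBGP, and the only assumption about ExaBGP is what this thread shows, namely that it reads one command per line from the helper's stdout.

```python
import sys
import time


def send_command(command, stream=None):
    """Write one API command as a single line and flush it immediately."""
    # Resolve stdout at call time so a redirected sys.stdout is honoured.
    out = sys.stdout if stream is None else stream
    # Normalise to exactly one trailing newline: one command per line.
    out.write(command.rstrip("\n") + "\n")
    # Without this flush, the line can sit in Python's buffer when stdout is a pipe.
    out.flush()


def idle_forever(interval=1.0):
    """Keep the helper process alive without burning a CPU core."""
    while True:
        time.sleep(interval)
```

A script would call `send_command("announce flow route { match { destination 101.0.0.0/24; destination-port >2000&<64000; protocol tcp; } then { discard; } }")` and then `idle_forever()`. Using `print(command, flush=True)` has the same effect as the explicit write and flush.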

gregory-mac commented 2 years ago

I see, big thanks for the help and advice!

borjam commented 1 year ago

I know this is closed, but I'm adding a note for completeness.

In older versions the API section seemed to be "connected" by default to all of the peers.

I was updating from 4.2.13 to 4.2.22 and I noticed my advertisements didn't work. But I had never included an api {} section inside the peer statements.

I was doing it wrong, I see, but it was working until it didn't! ;)
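
For anyone checking a configuration before an upgrade like this, a rough pre-flight check could look like the sketch below. It is not an ExaBGP feature: `unattached_processes` is a hypothetical helper, and it assumes the configuration only uses the `process NAME { ... }` and `processes [ NAME ... ];` forms shown in this thread. The regular expressions do not handle comments or any other syntax.

```python
import re


def unattached_processes(config_text):
    """Return process names that no `processes [ ... ]` list refers to."""
    # Names declared at the start of a line as: process NAME {
    declared = re.findall(
        r"^\s*process\s+([\w-]+)\s*\{", config_text, flags=re.MULTILINE
    )
    # Names listed inside any: processes [ NAME NAME ... ]
    referenced = set()
    for match in re.finditer(r"\bprocesses\s*\[([^\]]*)\]", config_text):
        referenced.update(match.group(1).split())
    # Keep declaration order so the report is easy to compare with the file.
    return [name for name in declared if name not in referenced]
```

Run against the original exabgp.conf from this issue, the function would report `api` as unattached. Run against the valid configuration posted above, it would return an empty list.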