Exa-Networks / exabgp

The BGP swiss army knife of networking

No wiki or issue regarding multihop set up #1238

Open milhauzindahauz opened 6 days ago

milhauzindahauz commented 6 days ago

Describe the bug: I tried to set up multihop. ExaBGP reports it as an invalid keyword.

10:28:52 | 17032  | configuration   | syntax error in section neighbor
10:28:52 | 17032  | configuration   | line 7: multihop ;
10:28:52 | 17032  | configuration   |
10:28:52 | 17032  | configuration   | invalid keyword "multihop"
cat /etc/exabgp/exabgp.conf
neighbor 178.100.0.2 {
    peer-address 178.100.0.2;
    multihop;
    local-address 172.26.0.1;
    local-as 65000;
    api {
        processes [control_api_178_100_0_2];
        neighbor-changes;
        send {
            parsed;
            open;
            update;
            notification;
            keepalive;
        }
        receive {
            parsed;
            open;
            update;
            notification;
            keepalive;
        }
    }

    md5-password router1_password;
    peer-as 65004;
    router-id 178.100.0.2;
    adj-rib-in false;
    adj-rib-out false;
    capability {
        asn4 enable;
        graceful-restart 1200;
    }
}
cat /etc/frr/frr.conf
frr defaults traditional
log syslog informational
log file /tmp/frr.log

router bgp 65004
  bgp router-id 178.100.0.2
  bgp log-neighbor-changes
  neighbor 172.26.0.1 remote-as 65000
  neighbor 172.26.0.1 ebgp-multihop
  neighbor 172.26.0.1 password router1_password
exabgp --version
ExaBGP : 4.2.22
Python : 3.10.14 (main, Apr 24 2024, 14:35:38) [GCC 13.2.1 20240316 (Red Hat 13.2.1-7)]
Uname  : Linux  6.10.10-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Sep 12 18:26:09 UTC 2024 x86_64
Root   : /usr/local

Expected behavior

This router is able to establish peering with gobgp:

cat gobgp.toml
[global.config]
router-id = "172.26.0.1"
as = 65000

[[neighbors]]
[neighbors.config]
neighbor-address = "178.100.0.2"
peer-as = 65004
auth-password = "router1_password"
# vtysh -c 'show bgp sum'

IPv4 Unicast Summary:
BGP router identifier 178.100.0.2, local AS number 65004 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 1, using 17 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
172.26.0.1      4      65000         2         3        0    0    0 00:00:05     (Policy) (Policy) N/A

Total number of neighbors 1
thomas-mangin commented 6 days ago

Disabling multi-hop is a vendor feature. It limits the TCP packets' ability to pass through routers by setting the packet TTL to 1, so that they cannot be forwarded. Nothing is required for multihop to work with ExaBGP: we do not implement that restriction.
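
To illustrate the TTL mechanism described above, here is a minimal Python sketch. It is not ExaBGP code, and the function name `make_single_hop_socket` is hypothetical. It only shows the kind of socket option a vendor implementation could set to keep a BGP session from crossing a router, assuming an IPv4 TCP socket.

```python
import socket


def make_single_hop_socket(ttl: int = 1) -> socket.socket:
    """Create an IPv4 TCP socket whose outgoing packets carry the given TTL.

    With ttl=1 the first router on the path drops the packet instead of
    forwarding it, so only directly connected peers are reachable.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock


if __name__ == "__main__":
    sock = make_single_hop_socket()
    # Read the option back to confirm what the kernel will use.
    print("TTL:", sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    sock.close()
```

A socket that never sets this option uses the operating system's default TTL, which is normally large enough to cross several routers.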

milhauzindahauz commented 5 days ago

I missed the part that I had to set up multihop as part of its configuration to get it working; that is what confused me. The issue is that ExaBGP cannot establish peering in the same lab, and I cannot find out why. The only piece of information I got is:

{ "exabgp": "4.0.1", "time": 1727420906.968853, "host" : "laptop", "pid" : 1488, "ppid" : 1486, "counter": 30, "type": "state", "neighbor": { "address": { "local": "172.26.0.1", "peer": "178.100.0.2" }, "asn": { "local": 65000, "peer": 65004 } , "state": "down", "reason": "peer reset, message (closing connection) error(Broken TCP connection)" } }
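
For readers who consume these JSON messages from an API process, here is a small Python sketch that pulls the state and reason out of the message quoted above. The helper name `summarize_state` is hypothetical, and the field layout is taken only from this one message, so other message types may look different.

```python
import json

# The state message quoted above, copied from the thread.
RAW = (
    '{ "exabgp": "4.0.1", "time": 1727420906.968853, "host" : "laptop", '
    '"pid" : 1488, "ppid" : 1486, "counter": 30, "type": "state", '
    '"neighbor": { "address": { "local": "172.26.0.1", "peer": "178.100.0.2" }, '
    '"asn": { "local": 65000, "peer": 65004 } , "state": "down", '
    '"reason": "peer reset, message (closing connection) '
    'error(Broken TCP connection)" } }'
)


def summarize_state(line: str) -> str:
    """Return a one-line summary of a neighbor state message."""
    msg = json.loads(line)
    if msg.get("type") != "state":
        return "not a state message"
    neighbor = msg["neighbor"]
    address = neighbor["address"]
    reason = neighbor.get("reason", "no reason given")
    return f'{address["local"]} -> {address["peer"]}: {neighbor["state"]} ({reason})'


if __name__ == "__main__":
    print(summarize_state(RAW))
```

The reason string only says that the TCP connection broke. It does not say which side closed it or why, which is why the full debug log is more useful here.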
thomas-mangin commented 5 days ago

I would first try to disable md5 between the routers.

milhauzindahauz commented 5 days ago

I removed that.

09:44:36 | 39383  | outgoing-13     | outgoing-13 172.26.0.1-178.100.0.2, closing connection
09:44:36 | 39383  | outgoing-13     | outgoing-13 172.26.0.1-178.100.0.2 178.100.0.2 lost TCP session with peer
09:44:36 | 39383  | outgoing-13     | peer reset, message [closing connection] error[the TCP connection was closed by the remote end]
09:44:36 | 39383  | outgoing-13     | outgoing-13 172.26.0.1-178.100.0.2, closing connection

This is still confusing, because the FRR instance in the container can establish peering and exchange information with the other "router" (gobgp).

thomas-mangin commented 5 days ago

@milhauzindahauz I am not a fortune teller: You are cherry-picking what you are extracting from the logs. I need the full logs (as printed with the -d option) if you want me to have even a remote chance to perform precision guesswork on your problem's root cause.

For example: did you make sure the IP address you bind from exists on the server? ExaBGP does not touch the networking stack. It is not a router, just a BGP application.
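
One quick way to answer that question is to try binding a socket to the address. The following is a minimal Python sketch, assuming IPv4; the function name `address_is_local` is hypothetical and this is not part of ExaBGP.

```python
import socket


def address_is_local(addr: str) -> bool:
    """Return True if this host can bind a TCP socket to addr.

    Port 0 asks the kernel for any free port, so the check needs no
    special privileges and does not collide with a running daemon.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((addr, 0))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 172.26.0.1 is the local-address used in the configuration above.
    for candidate in ("127.0.0.1", "172.26.0.1"):
        print(candidate, address_is_local(candidate))
```

On Linux the result can be misleading if the `net.ipv4.ip_nonlocal_bind` sysctl is enabled, because the kernel then accepts binds to addresses that are not configured on any interface.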

milhauzindahauz commented 5 days ago

I found the root cause. ExaBGP wasn't able to use port 179 even though I had the permissions in place.

getcap /usr/local/sbin/exabgp
/usr/local/sbin/exabgp cap_net_bind_service=eip
[Unit]
Description=ExaBGP
Documentation=man:exabgp(1)
Documentation=man:exabgp.conf(5)
Documentation=https://github.com/Exa-Networks/exabgp/wiki
After=network.target
ConditionPathExists=/etc/exabgp/exabgp.conf
ConditionPathExists=/usr/local/sbin/exabgp
ConditionPathExists=/appl/bin/python-exabgp/bin/
ConditionPathExists=/usr/local/etc/exabgp/exabgp.env
StartLimitIntervalSec=15

[Service]
User=user
Group=user
Environment=PYTHONUNBUFFERED=1
Environment=PATH=/appl/bin/python-exabgp/bin/:/appl/bin/python-exabgp/bin
PermissionsStartOnly=true
ExecStartPre=-/bin/mkdir -p /data/log
ExecStartPre=-chown user: /data/log
ExecStartPre=-mkfifo /run/exabgp.in
ExecStartPre=-mkfifo /run/exabgp.out
ExecStartPre=chmod 600 /run/exabgp.in
ExecStartPre=chmod 600 /run/exabgp.out
ExecStartPre=chown user: /run/exabgp.in
ExecStartPre=chown user: /run/exabgp.out
StandardOutput=journal+console
StandardError=journal+console
StandardOutput=append:/data/log/exabgp.log
StandardError=append:/data/log/exabgp_process.log
ExecStart=/usr/local/sbin/exabgp /etc/exabgp/exabgp.conf -d
ExecReload=/bin/kill -USR1
Restart=always
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

So I changed the port to 1791 and it's working like a charm.

[exabgp.api]
ack = false
chunk = 1
cli = true
compact = false
encoder = json
pipename = 'exabgp'
respawn = true
terminate = false

[exabgp.bgp]
openwait = 60

[exabgp.cache]
attributes = true
nexthops = true

[exabgp.daemon]
daemonize = false
drop = true
pid = ''
umask = '0o137'
user = 'nobody'

[exabgp.log]
all = false
configuration = true
daemon = true
destination = 'stdout'
enable = true
level = INFO
message = false
network = true
packets = false
parser = false
processes = true
reactor = true
rib = false
routes = false
short = false
timers = false

[exabgp.pdb]
enable = false

[exabgp.profile]
enable = false
file = ''

[exabgp.reactor]
speed = 1.0

[exabgp.tcp]
acl = false
bind = '172.26.0.1'
delay = 0
once = false
port = 1791

Apologies, Thomas, for the partial logs. I just wanted to pick out the relevant ones.

thomas-mangin commented 4 days ago

I am confused, because when I try, I get the following. Do you know why you did not get a log entry about the binding issue? I am looking at how to help other users who may face the same issue.

at 16:40:14 ❯ env exabgp.tcp.bind=127.0.0.1 ./sbin/exabgp etc/exabgp/conf-ipself4.conf 
16:40:20 | 83053  | welcome       | Thank you for using ExaBGP
16:40:20 | 83053  | version       | 4.2.22  
16:40:20 | 83053  | interpreter   | 3.12.6 (main, Sep  6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.3.9.4)]
16:40:20 | 83053  | os            | Darwin MacBook-Pro-3.local 24.0.0 Darwin Kernel Version 24.0.0: Mon Aug 12 20:51:54 PDT 2024; root:xnu-11215.1.10~2/RELEASE_ARM64_T6000 arm64
16:40:20 | 83053  | installation  | /Users/thomas/Code/github.com/exa-networks/exabgp/main
16:40:20 | 83053  | advice        | environment file missing
16:40:20 | 83053  | advice        | generate it using "exabgp --fi > /Users/thomas/Code/github.com/exa-networks/exabgp/main/etc/exabgp/exabgp.env"
16:40:20 | 83053  | cli control   | named pipes for the cli are:
16:40:20 | 83053  | cli control   | to send commands  /Users/thomas/Code/github.com/exa-networks/exabgp/main/run/exabgp.in
16:40:20 | 83053  | cli control   | to read responses /Users/thomas/Code/github.com/exa-networks/exabgp/main/run/exabgp.out
16:40:20 | 83053  | network       | can not bind to 127.0.0.1:179, you may need to run ExaBGP as root
16:40:20 | 83053  | network       | unset exabgp.tcp.bind if you do not want listen for incoming connections
16:40:20 | 83053  | network       | and check that no other daemon is already binding to port 179
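
The log above names two possible causes for a failed bind: missing privileges and another daemon already holding the port. They can be told apart outside ExaBGP with a small probe run as the same user as the service. The following is a hedged Python sketch, assuming IPv4; `probe_bind` is a hypothetical helper, not an ExaBGP function.

```python
import errno
import socket


def probe_bind(host: str, port: int) -> str:
    """Try to bind host:port and classify the outcome as a short string."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            # Typical for ports below 1024 without root or CAP_NET_BIND_SERVICE.
            return "permission denied"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return "address in use"
            if exc.errno == errno.EADDRNOTAVAIL:
                return "address not available"
            return f"error: {exc.strerror}"
    return "ok"


if __name__ == "__main__":
    # 179 is the standard BGP port; 1791 is the workaround port from the thread.
    for port in (179, 1791):
        print(port, probe_bind("127.0.0.1", port))
```

If the probe reports "permission denied" for 179 but "ok" for 1791, the capability setup is not reaching the process that actually opens the socket.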