Open milhauzindahauz opened 1 month ago
Disabling multi-hop is a vendor feature. It limits the TCP packets' ability to pass through routers by setting the packet TTL to 1, not allowing it to be forwarded. Nothing is required for multihop to work with ExaBGP - we do not have this feature.
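To illustrate the TTL mechanism described above, here is a minimal Python sketch of what a "multihop disabled" session amounts to at the socket level. It is not ExaBGP code; the helper name `make_single_hop_socket` is made up for this example.

```python
import socket


def make_single_hop_socket() -> socket.socket:
    """Create a TCP socket whose outgoing packets carry an IP TTL of 1.

    Hypothetical helper for illustration only. A router that receives a
    packet with TTL 1 will not forward it, so the peer has to be on a
    directly connected network.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # IP_TTL sets the time-to-live written into every packet sent on this socket.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 1)
    return sock


if __name__ == "__main__":
    s = make_single_hop_socket()
    print("TTL on socket:", s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    s.close()
```

Without this option the socket uses the operating system's default TTL, which is what lets a session cross routers.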
I missed the part that I had to set up multihop as part of its configuration to get it working; that is what confused me. The issue is that exabgp cannot establish peering in the same lab, and I cannot find out why. The only piece of information I got is:
{ "exabgp": "4.0.1", "time": 1727420906.968853, "host" : "laptop", "pid" : 1488, "ppid" : 1486, "counter": 30, "type": "state", "neighbor": { "address": { "local": "172.26.0.1", "peer": "178.100.0.2" }, "asn": { "local": 65000, "peer": 65004 } , "state": "down", "reason": "peer reset, message (closing connection) error(Broken TCP connection)" } }
I would first try to disable md5 between the routers.
I removed that.
09:44:36 | 39383 | outgoing-13 | outgoing-13 172.26.0.1-178.100.0.2, closing connection
09:44:36 | 39383 | outgoing-13 | outgoing-13 172.26.0.1-178.100.0.2 178.100.0.2 lost TCP session with peer
09:44:36 | 39383 | outgoing-13 | peer reset, message [closing connection] error[the TCP connection was closed by the remote end]
09:44:36 | 39383 | outgoing-13 | outgoing-13 172.26.0.1-178.100.0.2, closing connection
Still confusing, because the frr in the container can establish peering and exchange information with the other "router" (gobgp).
@milhauzindahauz I am not a fortune teller: you are cherry-picking what you extract from the logs. I need the full logs (as printed with the -d option) if you want me to have even a remote chance of performing precision guesswork on your problem's root cause.
For example: did you make sure the IP address you bind from exists on the server? ExaBGP does not touch the networking stack; it is not a router, just a BGP application.
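One way to check this independently of ExaBGP is a small Python sketch like the one below. It tries to bind a TCP socket to an address and port and reports why the bind failed. The function name `check_bind` is hypothetical; the address and port in the usage part are the ones from this thread.

```python
import errno
import socket
from typing import Tuple


def check_bind(address: str, port: int) -> Tuple[bool, str]:
    """Try to bind a TCP socket to (address, port) and explain any failure.

    Hypothetical diagnostic helper, not part of ExaBGP.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
    except OSError as exc:
        if exc.errno == errno.EADDRNOTAVAIL:
            return False, f"{address} is not configured on this host"
        if exc.errno == errno.EACCES:
            return False, f"no permission to bind port {port}"
        if exc.errno == errno.EADDRINUSE:
            return False, f"{address}:{port} is already in use"
        return False, str(exc)
    finally:
        # Always release the socket so the check leaves nothing behind.
        sock.close()
    return True, "ok"


if __name__ == "__main__":
    # Address and port taken from the thread; adjust for your own setup.
    print(check_bind("172.26.0.1", 179))
```

Run it as the same user the daemon runs as, otherwise the permission result does not tell you much.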
Found the root cause. ExaBGP was not able to use port 179 even though I had the permissions in place.
getcap /usr/local/sbin/exabgp
/usr/local/sbin/exabgp cap_net_bind_service=eip
[Unit]
Description=ExaBGP
Documentation=man:exabgp(1)
Documentation=man:exabgp.conf(5)
Documentation=https://github.com/Exa-Networks/exabgp/wiki
After=network.target
ConditionPathExists=/etc/exabgp/exabgp.conf
ConditionPathExists=/usr/local/sbin/exabgp
ConditionPathExists=/appl/bin/python-exabgp/bin/
ConditionPathExists=/usr/local/etc/exabgp/exabgp.env
StartLimitIntervalSec=15
[Service]
User=user
Group=user
Environment=PYTHONUNBUFFERED=1
Environment=PATH=/appl/bin/python-exabgp/bin/:/appl/bin/python-exabgp/bin
PermissionsStartOnly=true
ExecStartPre=-/bin/mkdir -p /data/log
ExecStartPre=-chown user: /data/log
ExecStartPre=-mkfifo /run/exabgp.in
ExecStartPre=-mkfifo /run/exabgp.out
ExecStartPre=chmod 600 /run/exabgp.in
ExecStartPre=chmod 600 /run/exabgp.out
ExecStartPre=chown user: /run/exabgp.in
ExecStartPre=chown user: /run/exabgp.out
StandardOutput=journal+console
StandardError=journal+console
StandardOutput=append:/data/log/exabgp.log
StandardError=append:/data/log/exabgp_process.log
ExecStart=/usr/local/sbin/exabgp /etc/exabgp/exabgp.conf -d
ExecReload=/bin/kill -USR1
Restart=always
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE
[Install]
WantedBy=multi-user.target
So I changed the port to 1791 and it's working like a charm.
[exabgp.api]
ack = false
chunk = 1
cli = true
compact = false
encoder = json
pipename = 'exabgp'
respawn = true
terminate = false
[exabgp.bgp]
openwait = 60
[exabgp.cache]
attributes = true
nexthops = true
[exabgp.daemon]
daemonize = false
drop = true
pid = ''
umask = '0o137'
user = 'nobody'
[exabgp.log]
all = false
configuration = true
daemon = true
destination = 'stdout'
enable = true
level = INFO
message = false
network = true
packets = false
parser = false
processes = true
reactor = true
rib = false
routes = false
short = false
timers = false
[exabgp.pdb]
enable = false
[exabgp.profile]
enable = false
file = ''
[exabgp.reactor]
speed = 1.0
[exabgp.tcp]
acl = false
bind = '172.26.0.1'
delay = 0
once = false
port = 1791
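For anyone wanting to check their own environment file for the same situation, here is a rough Python sketch that reads the `[exabgp.tcp]` section and flags a port below 1024, which normally needs root or `CAP_NET_BIND_SERVICE` on Linux. It assumes the file parses as plain INI with `configparser`, which holds for the dump above but is not guaranteed for every ExaBGP version. The function name `listening_port` is made up for this example.

```python
import configparser
from typing import Tuple

# Ports below this value are privileged on Linux by default.
PRIVILEGED_PORT_LIMIT = 1024


def listening_port(env_text: str) -> Tuple[str, int, bool]:
    """Return (bind address, port, needs_privilege) from env file contents.

    Hypothetical helper. Falls back to port 179 when the key is absent,
    which matches the default seen in the ExaBGP log output in this thread.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(env_text)
    # Values in the env file are quoted, e.g. bind = '172.26.0.1'.
    bind = parser.get("exabgp.tcp", "bind", fallback="").strip("'\"")
    port = parser.getint("exabgp.tcp", "port", fallback=179)
    return bind, port, port < PRIVILEGED_PORT_LIMIT


if __name__ == "__main__":
    sample = "[exabgp.tcp]\nbind = '172.26.0.1'\nport = 1791\n"
    print(listening_port(sample))
```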
Apologies, Thomas, for the partial logs. I just wanted to pick out the necessary ones.
I am confused, as when I try I get the following. Do you know why you did not get a log entry about the binding issue? I am looking at how to help other users who may face the same issue.
at 16:40:14 ❯ env exabgp.tcp.bind=127.0.0.1 ./sbin/exabgp etc/exabgp/conf-ipself4.conf
16:40:20 | 83053 | welcome | Thank you for using ExaBGP
16:40:20 | 83053 | version | 4.2.22
16:40:20 | 83053 | interpreter | 3.12.6 (main, Sep 6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.3.9.4)]
16:40:20 | 83053 | os | Darwin MacBook-Pro-3.local 24.0.0 Darwin Kernel Version 24.0.0: Mon Aug 12 20:51:54 PDT 2024; root:xnu-11215.1.10~2/RELEASE_ARM64_T6000 arm64
16:40:20 | 83053 | installation | /Users/thomas/Code/github.com/exa-networks/exabgp/main
16:40:20 | 83053 | advice | environment file missing
16:40:20 | 83053 | advice | generate it using "exabgp --fi > /Users/thomas/Code/github.com/exa-networks/exabgp/main/etc/exabgp/exabgp.env"
16:40:20 | 83053 | cli control | named pipes for the cli are:
16:40:20 | 83053 | cli control | to send commands /Users/thomas/Code/github.com/exa-networks/exabgp/main/run/exabgp.in
16:40:20 | 83053 | cli control | to read responses /Users/thomas/Code/github.com/exa-networks/exabgp/main/run/exabgp.out
16:40:20 | 83053 | network | can not bind to 127.0.0.1:179, you may need to run ExaBGP as root
16:40:20 | 83053 | network | unset exabgp.tcp.bind if you do not want listen for incoming connections
16:40:20 | 83053 | network | and check that no other daemon is already binding to port 179
Describe the bug
I tried to set up multihop. ExaBGP reports it to me as an invalid keyword.
Expected behavior
This router should be able to establish peering with gobgp.