When exabgp is provided with multiple configuration files, it will fork an instance per file and run it. Each process is fully independent and can not access the resources of the other. You will also have a problem if you use the CLI as they will both be attempting to access the same underlying pipe.
While not documented, this limitation becomes logical once you understand what is in the background. However, it's understandable that this concept may not be immediately clear.
To do what you want, you can:
It's important to note that when using a single configuration file, exabgp will only utilise a single core, as opposed to the 2 cores used when forking.
To preempt a question I was already asked, I am not looking at adding configuration templating to exabgp. There are good external tools for it, and the parsing code is complex enough as it is—sorry.
For your script, what you want is more likely something like:
import socket
import sys


def main():
    s = None      # connection to the receiver, opened on first use
    backlog = []  # lines read from stdin but not yet sent
    while True:
        line = sys.stdin.readline().strip()
        if not line:
            # empty line or EOF: stop
            sys.exit(0)
        backlog.append(line + '\n')
        # if it grows too much perhaps better bomb out?
        if s is None:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect(("", 1790))
        try:
            while backlog:
                sending = backlog[0]
                s.sendall(sending.encode())
                backlog = backlog[1:]
        except Exception:
            s = None
            sys.exit(5)


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        pass
When exabgp is provided with multiple configuration files, it will fork an instance per file and run it. Each process is fully independent and can not access the resources of the other. You will also have a problem if you use the CLI as they will both be attempting to access the same underlying pipe.
While not documented, this limitation becomes logical once you understand what is in the background. However, it's understandable that this concept may not be immediately clear.
I noticed this behavior, but I assumed that the CLI was aware of it and could handle it. Apologies for the misunderstanding.
My reasoning for splitting the peer configs into individual files is based on the following:
Reloading large configuration using signal is not recommended as the configuration parsing code is currently blocking (as well as some part of the RIB required for the reload - for simplicity).
"Large configuration file" is a pretty vague term. I will have 100-150 peers in the production environment, and I don't know whether the config generated for that qualifies as a large one. So I prematurely optimized it.
To preempt a question I was already asked, I am not looking at adding configuration templating to exabgp. There are good external tools for it, and the parsing code is complex enough as it is, sorry.
Nah buddy, keep your focus as it is. Generating the config with Jinja is quite simple and straightforward.
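As an illustration of the single-file approach with external templating, here is a minimal sketch that renders one configuration for all peers. It uses the standard library's `string.Template` as a stand-in for Jinja so that it runs without extra packages. The helper name `render_config`, the addresses, and the AS numbers are placeholders, and the set of directives in the neighbor block is an assumption to check against your ExaBGP version.

```python
from string import Template

# One neighbor block per peer. The directives and values are illustrative
# assumptions, not a verified configuration.
NEIGHBOR = Template("""\
neighbor $peer_address {
    router-id $router_id;
    local-address $local_address;
    local-as $local_as;
    peer-as $peer_as;
    adj-rib-in false;
    adj-rib-out false;
}
""")


def render_config(peers, local):
    """Return a single configuration text covering every peer in `peers`."""
    blocks = [NEIGHBOR.substitute(local, **peer) for peer in peers]
    return "\n".join(blocks)


if __name__ == "__main__":
    local = {"router_id": "10.90.0.1", "local_address": "10.90.0.1", "local_as": 65000}
    peers = [
        {"peer_address": "10.90.0.2", "peer_as": 65001},
        {"peer_address": "10.90.0.3", "peer_as": 65002},
    ]
    print(render_config(peers, local))
```

Writing the result to one file and starting a single ExaBGP instance with it avoids the per-file fork, at the cost of running on a single core.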
...
        except Exception:
            if s is not None:
                s.close()
            s = None
            exit(5)
I assume closing the socket is a good habit to have.
But can you clarify the reasoning for using the backlog buffer? If there is a line on the input, it is always a single line, which is then sent through the socket. What am I missing?
No apology was necessary; the use case for the fork feature was very specific to one user and was created before the CLI. I never considered how the two features could/would conflict. It is possible to fix it; it is "just" not implemented.
The signal code in ExaBGP is the source of the 'last' long-lasting opened issue, which has a significant impact on the system. It was designed before the API, and given the choice, I would remove the feature. Instead, I would add an 'add peer'/'remove peer' API call, which has not been done yet.
Having 150 peers for a single core can be fine if you are not handling a full routing table. However, the CPU usage is proportional to the number of peers multiplied by the load per peer, which can lead to one peer affecting another. While such large setups are not uncommon, it was not the scale I had in mind when designing ExaBGP; I would not have used Python for such a large setup.
Having multiple ExaBGP instances is fine and good as it creates "failure zones". Not that I expect failures, but as an engineer, I plan for the worst and hope for the best, not the other way around :-)
The code I propose is only a proof of concept for you to consider. The code you posted currently establishes a connection per message, which is heavy on the host and does not ensure the delivery of all messages when a failure is encountered.
My example tries to resend messages, but it is not complete as it does not consider when to decide to give up (as the backlog is too large or to report the issue) and does not report the issue to any monitoring platform for action either.
And yes, you are right, I forgot the close (which should indeed never fail).
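To make the role of the backlog concrete, here is a minimal sketch of a sender that keeps one persistent connection, retains unsent lines across failures, and gives up once the backlog passes a limit. The class name `Forwarder` and the `max_backlog` threshold are hypothetical choices for illustration, not part of ExaBGP.

```python
import socket
from collections import deque


class Forwarder:
    """Forward lines over one persistent TCP connection, keeping unsent lines."""

    def __init__(self, host, port, max_backlog=1000):
        self.address = (host, port)
        self.max_backlog = max_backlog  # hypothetical give-up threshold
        self.backlog = deque()
        self.sock = None

    def _drop(self):
        # Close and forget the socket so the next send() reconnects.
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def send(self, line):
        """Queue `line` and try to flush; return False if delivery is pending."""
        self.backlog.append(line + "\n")
        if len(self.backlog) > self.max_backlog:
            # This is the place to alert monitoring before giving up.
            raise RuntimeError("backlog limit exceeded")
        try:
            if self.sock is None:
                self.sock = socket.create_connection(self.address, timeout=5)
            while self.backlog:
                self.sock.sendall(self.backlog[0].encode())
                self.backlog.popleft()  # remove only after a successful send
        except OSError:
            self._drop()  # the backlog is kept for the next attempt
            return False
        return True
```

A caller would create one `Forwarder` and call `send()` for each line read from stdin. Note that `sendall` returning only means the data was handed to the kernel, so this narrows the window for lost messages without closing it entirely.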
Honestly, it's hard to go through the knowledge base (wiki, issues, etc.) to find the necessary piece of information. On the other hand, I must admit you are swift with your responses in general, which is amazing, and I would like to thank you for that.
Having 150 peers for a single core can be fine if you are not handling a full routing table. However, the CPU usage is proportional to the number of peers multiplied by the load per peer, which can lead to one peer affecting another. While such large setups are not uncommon, it was not the scale I had in mind when designing ExaBGP; I would not have used Python for such a large setup.
I want to use exabgp as a BGP speaker which announces and withdraws routes for RTBH purposes. I have no need for RIB knowledge, so I added adj-rib-in false; and adj-rib-out false; to each peer config.
The signal code in ExaBGP is the source of the 'last' long-lasting opened issue, which has a significant impact on the system. It was designed before the API, and given the choice, I would remove the feature. Instead, I would add an 'add peer'/'remove peer' API call, which has not been done yet.
Based on that... I am implementing an observer which has a feedback loop (a socket server) that handles the messages from exabgp. It also listens for peer additions and deletions, so I want to restart exabgp when the peers in the config change. If you can point me in a better direction based on your knowledge, feel free to do so.
The code I propose is only a proof of concept for you to consider. The code you posted currently establishes a connection per message, which is heavy on the host and does not ensure the delivery of all messages when a failure is encountered.
My example tries to resend messages, but it is not complete as it does not consider when to decide to give up (as the backlog is too large or to report the issue) and does not report the issue to any monitoring platform for action either.
I am still missing the point of how the backlog can grow to a large number. I will have the socket server always running before exabgp starts, as it is part of the observer service.
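For reference, the receiving side of such an observer could look like the following minimal sketch, which reads newline-delimited messages from a long-lived connection. The names `ObserverServer`, `LineHandler`, and the `on_message` callback are hypothetical. Even when this server is started first, restarting it later closes the established connection, and that is the situation in which the sender has lines it cannot deliver yet.

```python
import socketserver


class LineHandler(socketserver.StreamRequestHandler):
    """Read newline-delimited messages from one long-lived connection."""

    def handle(self):
        for raw in self.rfile:  # iteration ends when the client disconnects
            message = raw.decode().rstrip("\n")
            if message:
                self.server.on_message(message)


class ObserverServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, on_message):
        super().__init__(address, LineHandler)
        # Hypothetical callback supplied by the observer service.
        self.on_message = on_message
```

An observer would construct it with something like `ObserverServer(("127.0.0.1", 1790), handler)` and call `serve_forever()`, where `handler` is whatever function processes a message.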
Like many open-source projects, this is a one-man-band effort, and documentation is lacking.
For RTBH, disabling the RIB is good in your case, and it should be perfectly fine to have 100-200 peers. Please keep in mind the code is async.
The simplest way to deal with peer addition and removal is to configure graceful-restart on both sides of the BGP connection, add the peer to the configuration file, and restart ExaBGP.
You could have quite a long (minutes to hours) timeout for the routes; it is up to you. You can always remove them with a withdraw command.
If you have some code to re-announce the same routes, the Adj-RIB will be re-validated and the RIB/FIB will remain unchanged. The router load following the restart will be unnoticeable.
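The re-announcement step could be sketched as a small helper process that writes API commands to its standard output, which is how ExaBGP receives commands from the processes it runs. The function names, the prefixes, and the next-hop below are placeholders, and an RTBH deployment would normally add whatever attributes (for example a community) the routers expect.

```python
import sys


def announce_commands(prefixes, next_hop):
    """Build one 'announce route' API line per prefix, in a stable order."""
    return [f"announce route {prefix} next-hop {next_hop}" for prefix in sorted(prefixes)]


def withdraw_commands(prefixes, next_hop):
    """Build the matching 'withdraw route' API lines."""
    return [f"withdraw route {prefix} next-hop {next_hop}" for prefix in sorted(prefixes)]


def emit(commands, stream=sys.stdout):
    """Write the commands and flush so they are not held in a buffer."""
    for command in commands:
        stream.write(command + "\n")
    stream.flush()


if __name__ == "__main__":
    # Placeholder data: in practice the set of active routes would come from
    # wherever the observer stores it.
    active = {"203.0.113.5/32", "203.0.113.9/32"}
    emit(announce_commands(active, "192.0.2.1"))
```

Running this at startup replays the stored routes after a restart, so the peers see the same announcements again within the graceful-restart window.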
I will have the socket server always running before the exabgp as it is part of the observer service.
This very optimistic statement may not survive many years in production.
Hi @milhauzindahauz - Any more questions, or can we close this issue?
Hi @thomas-mangin. I think the necessary part was cleared. Thank you very much for the feedback and answers to my questions.
Describe the bug Running exabgp, I am not able to announce routes to certain neighbors. In this example it's neighbor 10.90.0.3
I had a look at issues
To Reproduce I am running exabgp as following:
config diff
I am trying to announce routes to the neighbors using exabgpcli.
10.90.0.2.conf
10.90.0.3.conf
notifier.py
Expected behavior Announcing to the neighbor should be done successfully by exabgpcli.
Environment (please complete the following information):
DEBUG OUTPUT