ipbus / ipbus-software

Software that implements a reliable high-performance control link for particle physics electronics, based on the IPbus protocol
https://ipbus.web.cern.ch
GNU General Public License v3.0

1 packe<<TRUNCATED>> #290

Open Contemporaries opened 1 year ago

Contemporaries commented 1 year ago

Timeout (1000ms) occurred for receive (header) from ControlHub with URI 'xxx'. 1 packe<>

Hello, may I ask what causes this issue and how it can be resolved? The issue occurred when the number of connections reached 460 and the program had been running for about half an hour. Restarting the ControlHub resolved the issue, but the program did not run for such a long time afterwards.

tswilliams commented 1 year ago

The only circumstance I might expect this to occur in is if the ControlHub is overloaded - or if the PC that it's running on is overloaded. How many uHAL clients do you have communicating with the ControlHub, how many boards are you controlling, and how large are the reads/writes that you're performing? Also, which version of the software are you using, and on what OS?

Contemporaries commented 1 year ago

232 boards => 462 uHAL clients (two hosts use the same ControlHub)
CPU usage: 90% (2 physical CPUs, 64 logical cores)
Reads/writes: about 7500/s on average, about 9000/s at peak
Version: 2.8.2
OS: CentOS 7

[Screenshot 2023-04-21 131900]

tswilliams commented 1 year ago

For that many boards, I'd suggest splitting the ControlHub load over a few machines - the largest number of boards that I've routinely controlled via a single ControlHub instance is about 60.
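One way to picture the suggested split is the minimal sketch below, which assigns boards to ControlHub hosts in contiguous chunks and builds a `chtcp-2.0` URI for each board. The host names, board IP addresses, and the `shard_boards` helper are hypothetical; the cap of 60 boards per instance is taken from the figure above as a rule of thumb, not a documented limit; and the ports are assumed to be the usual defaults (10203 for the ControlHub, 50001 for IPbus on the board).

```python
# Assumed defaults; adjust if your installation uses other ports.
CONTROLHUB_PORT = 10203  # TCP port the ControlHub listens on
BOARD_UDP_PORT = 50001   # UDP port of the IPbus target on each board


def shard_boards(board_ips, hub_hosts, max_per_hub=60):
    """Map each board IP to a uHAL URI that routes it via one ControlHub host.

    Boards are assigned in contiguous chunks of at most `max_per_hub`.
    """
    if len(board_ips) > max_per_hub * len(hub_hosts):
        raise ValueError("not enough ControlHub hosts for this many boards")
    uris = {}
    for i, ip in enumerate(board_ips):
        hub = hub_hosts[i // max_per_hub]  # chunk index selects the host
        uris[ip] = f"chtcp-2.0://{hub}:{CONTROLHUB_PORT}?target={ip}:{BOARD_UDP_PORT}"
    return uris


# Hypothetical inventory: 232 boards spread over 4 ControlHub machines.
boards = [f"192.168.{i // 200}.{i % 200 + 10}" for i in range(232)]
hubs = [f"hub{n}.example" for n in range(4)]
uris = shard_boards(boards, hubs)
print(uris["192.168.0.10"])
```

Each resulting URI could then be placed in the corresponding entry of the uHAL connection file, so that every client reaches a given board through the same ControlHub instance.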

Contemporaries commented 1 year ago

Thank you for your reply. Our system will ultimately need to run about 4000~5000 boards, so if there is any way to reduce the CPU usage of the ControlHub, it would be of great help to us. Thanks again.
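For a rough sense of scale, the arithmetic below estimates how many ControlHub instances such a system would need if each instance were limited to about 60 boards. That per-instance figure is only the largest setup mentioned earlier in this thread, treated here as an assumption rather than a measured limit, and `hubs_needed` is a hypothetical helper.

```python
import math


def hubs_needed(n_boards, boards_per_hub=60):
    """Smallest number of ControlHub instances with at most boards_per_hub boards each."""
    return math.ceil(n_boards / boards_per_hub)


for n in (232, 4000, 5000):
    print(f"{n} boards -> {hubs_needed(n)} ControlHub instances")
```

The real capacity of one instance would depend on transaction size, request rate, and the host machine, so these numbers are only a starting point for planning.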

Contemporaries commented 1 year ago

> For that many boards, I'd suggest splitting the ControlHub load over a few machines - the largest number of boards that I've routinely controlled via a single ControlHub instance is about 60.

So is the figure of 60 boards related to the number of connections? For example, with two hosts using the same ControlHub to connect to the same batch of boards, 60 boards would mean 120 active connections.

[Screenshot 2023-04-21 235853]