ertaboy356b closed 5 years ago
Is the publisher the one doing the binding? So the publisher is not recognizing that the connections are dead?
Yeah, the publisher does the binding and the subscribers connect. I did the DisableTimeWait trick, and that reduced file descriptor usage a lot.
It seems that new connections just keep spawning after each published message. Counting file descriptors with "lsof", I see the number rise to 40K, at which point I need to restart the server. With DisableTimeWait set to true, it hovers around 10K, so I think this is a good change.
I also had it publish heartbeats every minute.
Environment
Expected behaviour
There should be only one active socket connection per subscriber. Even 10 or more on a single subscriber would be fine.
Actual behaviour
A large number of individual connections are created for a single subscriber. After a while, this exhausts Linux's file descriptor limit.
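To see how close a process is to the descriptor limit, something like the following could be run on the CentOS box. This is only a minimal sketch: it is Linux-only, relies on `/proc`, and `fd_usage` is a hypothetical helper name, not part of any library mentioned in this thread. Reading another process's `/proc/<pid>/fd` usually requires running as the same user or as root.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit) for a Linux process.

    soft_limit is None when the limit is reported as "unlimited".
    """
    # Each entry in /proc/<pid>/fd is one open descriptor (sockets included).
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: name (three words), soft limit, hard limit, units.
                soft = line.split()[3]
                soft_limit = None if soft == "unlimited" else int(soft)
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"open descriptors: {used}, soft limit: {limit}")
```

Passing the publisher's PID instead of the default `"self"` would report that process's numbers; logging the pair periodically shows whether the count keeps climbing toward the limit.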
Steps to reproduce the behaviour
A standard pub-sub setup, with the server deployed on a CentOS machine and the clients (subscribers) deployed on Windows machines. Production has over 200 client connections. I have to restart the server every 10 hours.
I have uploaded lsof output for checking the connections. Ports 51002, 51005, and 51004 are publish ports. Ports 51000 and 50001 are req-rep ports.
for chcking.xlsx
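To check whether a single subscriber really holds many connections on a publish port, the lsof output can be grouped by local port and remote host. The sketch below assumes plain-text output from `lsof -i -n -P` (numeric hosts and ports), not the spreadsheet export; `connections_per_peer` is a hypothetical helper, and the sample lines, process name, and addresses are made up for illustration.

```python
from collections import Counter


def connections_per_peer(lsof_text):
    """Count TCP connections per (local_port, remote_host) pair."""
    counts = Counter()
    for line in lsof_text.splitlines():
        for token in line.split():
            # Connected sockets appear as local_addr:port->remote_addr:port.
            if "->" not in token:
                continue
            local, remote = token.split("->", 1)
            local_port = int(local.rsplit(":", 1)[1])
            remote_host = remote.rsplit(":", 1)[0]
            counts[(local_port, remote_host)] += 1
            break
    return counts


# Made-up sample in the shape of `lsof -i -n -P` output.
SAMPLE_LSOF = """\
COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF NODE NAME
server  1234  app 40u  IPv4 100001      0t0  TCP *:51002 (LISTEN)
server  1234  app 41u  IPv4 100002      0t0  TCP 10.0.0.5:51002->10.0.0.77:49812 (ESTABLISHED)
server  1234  app 42u  IPv4 100003      0t0  TCP 10.0.0.5:51002->10.0.0.77:49813 (ESTABLISHED)
server  1234  app 43u  IPv4 100004      0t0  TCP 10.0.0.5:51000->10.0.0.78:50100 (ESTABLISHED)
"""

if __name__ == "__main__":
    for (port, host), n in connections_per_peer(SAMPLE_LSOF).most_common():
        print(f"port {port} <- {host}: {n} connection(s)")
```

A pair whose count keeps growing between two snapshots would point at the subscriber and publish port where connections accumulate.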