Status: Open. Opened by jangellx 4 years ago.
It seems the issue is a bit different than I thought. The FIN_WAIT_1 timeouts showed up because I had shut down homebridge, and the OS was waiting for the sockets to be released. If I run homebridge again, after about a day I see this in the netstat output (this is only a small subset of the results):
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
...
tcp4 0 0 192.168.1.231.56896 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56876 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56856 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56841 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56823 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56775 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55772 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55758 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55745 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55732 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55716 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55702 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55656 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55641 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55625 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.55612 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54896 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54498 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54481 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54449 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54418 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54403 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54387 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54370 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54353 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54336 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54322 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54307 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54293 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54278 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54262 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54247 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54233 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54219 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54206 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54193 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54180 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54166 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54119 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54106 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.54092 192.168.1.220.http ESTABLISHED
The device is powered on and is still configured to connect to the old SSID, which I have running on an old router in the corner of a room. That may be resulting in an intermittent connection. However, I don't see why homebridge would need so many connections to that device. Shouldn't it close the old socket before trying to open a new one?
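For anyone who wants to check for the same pattern, here is a minimal sketch that tallies connections per foreign address and state from netstat text. It assumes the six-column layout shown above; `count_connections` and `sample` are names made up for this illustration, and the sample lines are copied from the output above.

```python
from collections import Counter

def count_connections(netstat_text):
    """Count TCP connections per (foreign address, state) in netstat-style text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state.
        # Header lines and the "..." marker do not match and are skipped.
        if len(fields) == 6 and fields[0].startswith("tcp"):
            counts[(fields[4], fields[5])] += 1
    return counts

# Three lines copied from the netstat output above.
sample = """\
tcp4 0 0 192.168.1.231.56896 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56876 192.168.1.220.http ESTABLISHED
tcp4 0 0 192.168.1.231.56856 192.168.1.220.http ESTABLISHED
"""

for (foreign, state), n in count_connections(sample).most_common():
    print(f"{n:5d}  {foreign}  {state}")
```

Feeding it the full netstat output instead of `sample` should make it obvious which device and which state account for most of the sockets.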
Thanks!
-- Joe
I hit this recently when I changed my wifi network name but hadn't yet gotten around to updating the 15 or so HTTP devices I have to the new SSID. All these devices are set to do "realtime" updates with the default interval. homebridge is on a Mac running macOS 10.15.5.
Here's an example of a device definition, which uses static IPs:
After about a day of not being able to connect to any of these devices, I found that nothing on my computer could make any HTTP connections anymore, including Safari, Chrome, Slack, the App Store, etc. I finally ran netstat and found that a ridiculous number of connections were open, all targeting the IP addresses of these devices. Every one of them was in the FIN_WAIT_1 state, meaning the local side had sent its close request (FIN) and was waiting for the other side to acknowledge it. But since the devices could not be reached, this attempt to gracefully close the socket would never get a reply, so each socket just sat there until the 10-minute timeout closed the connection.
The reason this is an issue is that "realtime" checks for status updates frequently, something like every 30 seconds. So these connections would fail, but each socket would stay open until the 10-minute FIN_WAIT_1 timeout was hit. However, it doesn't wait for the timeout before trying to connect again 30 seconds later. As a result, it allocates a socket every 30 seconds and only releases them every 10 minutes. This would eventually use up all 16k available sockets for that port range across the system (apparently after around 24 hours with about 15 devices), and from then on nobody would be able to open any new connections at all.
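As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. The 30-second interval is only the guess from the paragraph above, and the 16384-port figure assumes "16k" refers to the 49152 to 65535 ephemeral range; the function names are made up for this illustration.

```python
DEVICES = 15             # "about 15 devices" from the report
POLL_INTERVAL_S = 30     # assumed; the report only says "like every 30 seconds"
HOLD_TIME_S = 10 * 60    # the 10-minute timeout described above
EPHEMERAL_PORTS = 16384  # "16k"; assumed to be ports 49152 to 65535

def steady_state_sockets(devices, poll_interval_s, hold_time_s):
    """Sockets held at any one time if each is released after hold_time_s."""
    return devices * hold_time_s // poll_interval_s

def hours_to_exhaust(devices, poll_interval_s, ports):
    """Hours until the ports run out if sockets are never released."""
    sockets_per_hour = devices * 3600 / poll_interval_s
    return ports / sockets_per_hour

print(steady_state_sockets(DEVICES, POLL_INTERVAL_S, HOLD_TIME_S))             # 300
print(round(hours_to_exhaust(DEVICES, POLL_INTERVAL_S, EPHEMERAL_PORTS), 1))  # 9.1
```

Taken at face value, sockets that really are released after 10 minutes would level off at around 300, while sockets that are never released would use up the range in roughly 9 hours at this rate. Running out of ports after about a day therefore suggests that at least some sockets are being held much longer than 10 minutes, or that the real polling interval is longer than assumed here.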
Is it possible for the socket to be immediately hard closed in scenarios where the connection cannot be made to the target device? That would solve this problem nicely.
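To illustrate what I mean by a hard close, here is a minimal socket-level sketch in Python. It is not the plugin's code (homebridge runs on Node.js), and `probe` is a hypothetical helper. It assumes a POSIX system, where `struct linger` is two ints. Setting SO_LINGER with a zero timeout makes `close()` drop the connection with a reset instead of a FIN, so the socket is released right away rather than waiting in FIN_WAIT_1.

```python
import socket
import struct

def probe(host, port, timeout_s=2.0):
    """Try to connect; the socket is always released, using an abortive close."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # l_onoff=1, l_linger=0: close() sends a reset and frees the socket
        # immediately. Set before connect() so it applies on every exit path.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
        sock.settimeout(timeout_s)  # bound how long connect() may block
        sock.connect((host, port))
        return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    finally:
        sock.close()
```

Calling something like `probe("192.168.1.220", 80)` returns `True` or `False` and leaves no socket behind either way. An abortive close discards any unsent data, so it only makes sense on the give-up path, not for connections that are working normally.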
Thanks!