emacs-circe / circe

Circe, a Client for IRC in Emacs
GNU General Public License v3.0

Reconnect even if some server is not available #372

Closed FrostyX closed 3 years ago

FrostyX commented 4 years ago

This is quite difficult to debug, and I can't pinpoint the exact root of the issue, but my theory is as follows.

If an IRC server is reachable only via VPN (or is otherwise not available at all times), then after disconnecting from that VPN (I do that at the same time as switching to a different wireless network) you are no longer automatically reconnected to IRC. Running M-x circe-reconnect manually works around the issue, though.

The problem is that

1) circe-lagmon-server-check is never called from circe-lagmon-timer-tick because (irc-connection-state circe-server-process) returns 'connecting forever

2) irc-send-raw signals an error because the server is not available, and because of that error the other servers are never even tried. As a result, circe-lagmon-last-send-time is never set, so we never time out.
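
To illustrate point 2, here is a minimal, self-contained sketch of running a per-server check so that an error on one server doesn't stop the others from being checked. It is not the code from this PR and doesn't touch circe's API; my/check-all-servers, its arguments, and the example host names are hypothetical and exist only for the illustration.

```elisp
;; Hypothetical helper, not part of circe: call CHECK-FN on every server
;; and keep going when one of them signals an error.
(defun my/check-all-servers (servers check-fn)
  "Call CHECK-FN on each element of SERVERS, continuing past errors.
Return a list of (SERVER . ERROR-DATA) for the servers that failed."
  (let (failures)
    (dolist (server servers)
      (condition-case err
          (funcall check-fn server)
        ;; Record the failure instead of aborting the whole loop.
        (error (push (cons server err) failures))))
    (nreverse failures)))

;; Example call: the second (made-up) server always fails, but the first
;; one is still checked and only the failing one is reported.
(my/check-all-servers
 '("reachable.example" "vpn-only.example")
 (lambda (host)
   (when (string= host "vpn-only.example")
     (error "Process %s not running" host))))
```

The point of the sketch is only the condition-case around each individual check; whether the real fix belongs in the timer tick or closer to irc-send-raw is a separate question.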

This commit seems to fix the issue, and hopefully it doesn't break any existing use case.

FrostyX commented 4 years ago

I finished the implementation this afternoon, so I haven't had a chance to really test it in daily use. I wanted to share the code with you as soon as possible, but I guess it would be a good idea to keep it here for a week or so, so we can be sure it really works as it should.

FrostyX commented 4 years ago

This is not sufficient. Last night I got disconnected for some reason and Circe didn't reconnect even with this patch.

It didn't get past this condition

(when (eq (irc-connection-state circe-server-process) 'registered)

because the state of all servers was 'connecting. Changing the condition to

(when (memq (irc-connection-state circe-server-process) '(registered connecting))

helped, and Circe immediately reconnected. It is not ideal, though, because now my VPN-only server spams the following error

Error running timer ‘circe-lagmon-timer-tick’: (error "Process irc.on.secret.address.com<4> not running")

Any idea how to fix the issue while satisfying both use cases?

wasamasa commented 4 years ago

I'd add an extra condition to circe-lagmon-timer-tick that checks whether the process is still running. Maybe even go as far as removing the timer function if it's not. It smells like a bug though that it's possible to have multiple dead processes with active timers for them.
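
A rough sketch of that extra condition, assuming the check is factored into a small predicate. my/lagmon-check-wanted-p is a hypothetical name, not an existing circe function; process-live-p is the standard Emacs function, and irc-connection-state and circe-server-process are the ones from the conditions quoted above. The state is passed in as an argument so the predicate can be evaluated without circe loaded. This is untested against circe itself.

```elisp
;; Hypothetical predicate, not part of circe.
(defun my/lagmon-check-wanted-p (process state)
  "Return non-nil if a lag check should run for PROCESS in STATE.
PROCESS is the server process (or nil); STATE is its connection state."
  (and process
       (process-live-p process)               ; skip dead processes
       (memq state '(registered connecting))  ; states discussed in this thread
       t))

;; Inside the timer tick, the condition might then read (assumption,
;; not the actual circe source):
;;
;;   (when (my/lagmon-check-wanted-p
;;          circe-server-process
;;          (and circe-server-process
;;               (irc-connection-state circe-server-process)))
;;     (circe-lagmon-server-check))
```

With the liveness test in front, a dead process for the VPN-only server should be skipped instead of raising the "not running" error on every tick, while servers still in 'connecting keep being checked. It does not address the second point, removing the timer for dead processes.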

FrostyX commented 3 years ago

As previously discussed, the issue is not that fun to reproduce. At this point I am not abroad, so my internet connection is fine, and finishing this PR isn't a priority for me.

I am going to close this zombie PR.