ClassiCube / MCGalaxy

A Minecraft Classic / ClassiCube server software
GNU General Public License v3.0

Three second pause between Heartbeat retries #606

Open rdebath opened 3 years ago

rdebath commented 3 years ago

I'd pointed the HeartbeatURL at my own server and was occasionally getting three requests in quick succession (a millisecond or so apart). On the web server these connections are succeeding, but I'm getting a generic error on MCGalaxy.

(15:08:32) Failed to send heartbeat to localhost (Error getting response stream (ReadDoneAsync2): ReceiveFailure)

I can't tell what the real problem is. However, putting a 3 second delay between tries seems to let MCGalaxy avoid whatever the problem is on its retry. It's also good practice to back off a little before retrying.

commit 24cb617d243991b920e32692b1a171300ccb0c63
Author: Robert de Bath <rdebath@tvisiontech.co.uk>
Date:   Sun May 9 19:07:44 2021 +0100

    Three second pause between Heartbeat retries

diff --git a/MCGalaxy/Network/Heartbeat/Heartbeat.cs b/MCGalaxy/Network/Heartbeat/Heartbeat.cs
index c22119b9a..9bf1a3e04 100644
--- a/MCGalaxy/Network/Heartbeat/Heartbeat.cs
+++ b/MCGalaxy/Network/Heartbeat/Heartbeat.cs
@@ -107,6 +107,7 @@ namespace MCGalaxy.Network {
                     }

                     lastEx = ex;
+                    Thread.Sleep(3042);
                     continue;
                 }
             }

UnknownShadow200 commented 3 years ago

  1. What webserver software is used?
  2. What version of mono are you using?

MCGalaxy already sets a 10 second timeout on HTTP requests (Heartbeat.cs), so I'm not really a fan of adding a random Thread.Sleep.

rdebath commented 3 years ago

The web server is the mini-httpd package on Debian. Mono is currently Mono JIT compiler version 5.18.0.240 (Debian 5.18.0.240+dfsg-3 Wed Apr 17 16:37:36 UTC 2019), or "stable" on Debian.

BUT the ten second timeout has NO relevance to this delay: there are many errors that can cause an immediate failure of a network connection, as is happening in this case. If the ten second delay were guaranteed, that would be a different matter.

Meanwhile, I'll flip over to Mono 6.12 to see if it happens with that version too. Even if it doesn't, IMO you should be able to handle an immediate failure without "hammering" the web server.
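
To make the idea concrete, here is a minimal sketch of a retry loop that pauses after each failed attempt, so an immediate failure cannot turn into back-to-back requests. This is not MCGalaxy code: `RetrySketch`, `RunWithPause`, and the injectable `sleep` parameter are hypothetical names used only for illustration, and the real loop in Heartbeat.cs may be structured differently.

```csharp
using System;
using System.Threading;

public static class RetrySketch
{
    // Hypothetical helper (not part of MCGalaxy): runs 'action' up to maxTries
    // times and pauses for pauseMs after each failed attempt that will be retried.
    // Returns null on success, or the last exception once all tries are used up.
    public static Exception RunWithPause(Action action, int maxTries, int pauseMs,
                                         Action<int> sleep = null)
    {
        // Default to a real blocking sleep; callers may pass a stub instead.
        if (sleep == null) sleep = Thread.Sleep;
        Exception lastEx = null;

        for (int attempt = 1; attempt <= maxTries; attempt++)
        {
            try
            {
                action();
                return null; // success, no further attempts needed
            }
            catch (Exception ex)
            {
                lastEx = ex;
                // Pause only when another attempt will follow.
                if (attempt < maxTries) sleep(pauseMs);
            }
        }
        return lastEx;
    }
}
```

With a hypothetical `SendBeat` method, `RetrySketch.RunWithPause(SendBeat, 3, 3000)` would space three attempts at least three seconds apart, however quickly each one fails. The `sleep` parameter is there only so the pause can be stubbed out when testing.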

PS: I've had a dozen immediate retries with Mono 6.12 and two errors.

Mono JIT compiler version 6.12.0.122 (tarball Mon Feb 22 17:42:49 UTC 2021) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com