nutdotnet / WinNUT-Client

WinForms-based client for monitoring your Uninterruptible Power Supply (UPS) connected to a Network UPS Tools (NUT) server.
GNU General Public License v3.0
206 stars 19 forks source link

Connected pop up not appearing after reconnecting in some situations #182

Open gbakeman opened 1 month ago

gbakeman commented 1 month ago

On Win10 I got a single notification when the nut server stopped. "Reconnection In Progress\nReconnection..." To be perfect a second notification an hour later when the nut server restarted would be nice.

When I look at the recent messages drop down list in WinNUT Client there are three messages.

  1. Lost Connect to rpi:3493 (should it say "Lost Connection to rpi:3493"?)
  2. Reconnection In Progress (should the notification say "Lost Connection: Reconnection In Progress" as no "Connection lost" notification was given?)
  3. Connection to Nut Host rpi:3493 Established (should this GUI message also be sent as a Windows notification?)

Originally posted by @yoyoma2 in https://github.com/nutdotnet/WinNUT-Client/issues/65#issuecomment-2267559170

Here's what I see on Win10. Only one notification for the entire backup process starting at 7:42 AM ending an hour later. It was a little surprising to see "reconnecting" without any "Lost connection" text. Notifications

Points of possible interest:

  • I'm on Win10 not Win11

  • The RaspiBackup script stops the nut server service during the backup but the ethernet connection to the rpi is never broken (different code path than disabling network adapter?)

  • The WinNUT client GUI has the "Connection to NUT Host rpi:3493 Established" message precisely after the backup but no Windows notification at all

  • My "Create LogFile" setting was off (Next backup I'll enable the log file)

  • The Miscellaneous tab title has a typo (missing 'e')

So things are actually working perfectly but the re-connection notification is missing.

I backed up the rpi (which stops the NUT server for an hour) with Debug level log enabled in the WinNUT client. The whole time WinNUT client was closed to systray. Here's are the notifications: notifications2

Description of log:

  • Line 2156 (2:45:37 PM the backup starts and we get a reconnection in progress notification)

  • Line 4414 (3:47:25 PM the backup ends and there is no notification) This is bad!

  • Line 8710 (4:08:13 PM I manually turn off logging and Not Connected, Connected notifications appear)

Here is the log. WinNUT-Client-2024-08-14.zip

The connection being reestablished is somehow not being reported even though it is happening.

Log excerpt:

8/14/2024 3:47:18 PM [14084, UPS_Device]: Attempting reconnection...
8/14/2024 3:47:18 PM [14084, UPS_Device]: Beginning connection: [], Name: [][AutoReconnect]
8/14/2024 3:47:18 PM [14084, Nut_Socket]: Attempting TCP socket connection to []:3493...
8/14/2024 3:47:23 PM [14084, UPS_Device]: Attempting reconnection...
8/14/2024 3:47:23 PM [14084, UPS_Device]: Beginning connection: [], Name: [][AutoReconnect]
8/14/2024 3:47:23 PM [14084, Nut_Socket]: Attempting TCP socket connection to rpi:3493...
8/14/2024 3:47:25 PM [14084, Nut_Socket]: Connection established and streams ready.
8/14/2024 3:47:25 PM [14084, Nut_Socket]: Gathering basic info about the NUT server...
8/14/2024 3:47:25 PM [14084, Nut_Socket]: Server version: []
8/14/2024 3:47:25 PM [14084, Nut_Socket]: Protocol version: []
8/14/2024 3:47:25 PM [14084, Nut_Socket]: Completed gathering basic info about NUT server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: Retrieving basic UPS product information...
8/14/2024 3:47:25 PM [14084, UPS_Device]: Initializing other well-known UPS variables...
8/14/2024 3:47:25 PM [14084, UPS_Device]: output.current is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: output.voltage is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: output.realpower is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: Determining best method to calculate power usage...
8/14/2024 3:47:25 PM [14084, UPS_Device]: ups.realpower is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: Using RPNomLoadPct method.
8/14/2024 3:47:25 PM [14084, UPS_Device]: battery.capacity is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: Apply Fallback Value when retrieving battery.capacity
8/14/2024 3:47:25 PM [14084, UPS_Device]: output.frequency.nominal is not supported by server.
8/14/2024 3:47:25 PM [14084, UPS_Device]: Apply Fallback Value when retrieving output.frequency.nominal
8/14/2024 3:47:25 PM [14084, UPS_Device]: Completed retrieval of basic UPS product information.
8/14/2024 3:47:25 PM [14084, WinNUT]: []has indicated it's ready to start sending data.
8/14/2024 3:47:25 PM [14084, WinNUT]: Update Icon
8/14/2024 3:47:25 PM [14084, WinNUT]: NotifyIcon Text => 
WinNUT Client - 2.3
Connected
8/14/2024 3:47:25 PM [14084, WinNUT]: Connection to Nut Host Established
8/14/2024 3:47:25 PM [14084, String]: New Log to CB_Current Log : Connection to Nut Host [] Established
8/14/2024 3:47:25 PM [14084, UPS_Device]: Nut Host Reconnected
8/14/2024 3:47:30 PM [14084, UPS_Device]: Enter Retrieve_UPS_Datas

Connection code seems to be running OK, with WinNUT processing events. Unclear since there's no logging code if the toast trigger is running.

yoyoma2 commented 5 days ago

If I stop nut-server on the rpi and start it a minute later I see this in the Win-NUT logs after the connection is reestablished. The last line saying "UPS statuses have not changed since last update, skipping" is suspicious.

9/24/2024 12:31:01 PM [3344, WinNUT]: NotifyIcon Text => 
WinNUT Client - 2.3
Connected
9/24/2024 12:31:01 PM [3344, WinNUT]: Connection to Nut Host Established
9/24/2024 12:31:01 PM [3344, String]: New Log to CB_Current Log : Connection to Nut Host rpi:3493 Established
9/24/2024 12:31:01 PM [3344, UPS_Device]: Nut Host Reconnected
9/24/2024 12:31:06 PM [3344, UPS_Device]: Enter Retrieve_UPS_Datas
9/24/2024 12:31:06 PM [3344, UPS_Device]: input.frequency is not supported by server.
9/24/2024 12:31:06 PM [3344, UPS_Device]: Apply Fallback Value when retrieving input.frequency
9/24/2024 12:31:06 PM [3344, UPS_Device]: output.voltage is not supported by server.
9/24/2024 12:31:06 PM [3344, UPS_Device]: Apply Fallback Value when retrieving output.voltage
9/24/2024 12:31:06 PM [3344, UPS_Device]: UPS statuses have not changed since last update, skipping.

I cannot see in UPS_Device.vb where variable oldStatusBitmask gets changed when Socket_Broken happens. Shouldn't some sort of flag be set so that the notification of the connection being reestablished is not skipped?

gbakeman commented 5 days ago

Hi,

Thanks for giving the latest build another try and providing feedback! And also, thanks for diving in to the code to see if you could find out what's wrong.

With regards to the status updating, I don't think that's a big concern since this is the only method that acts on the status changing, which is only to handle a few cases of the UPS switching between line power and battery, as well as a full shutdown signal. I'm not sure I see any reason to clear out/reset the old status bitmask during a reconnection event, but please let me know if I'm missing something.

I think the log data you provided may have cut out the new logging statements I had added to the ToastNotifyIcon method, which I think still should've run. Could you please upload more of the log in case those messages are still there?

yoyoma2 commented 5 days ago

I'm running the latest pre-release v2.3.8988 not the debug build. Machines on a UPS are important so a pre-release is already pushing it and a debug build is not possible. Is the extra logging code merged?

I don't know the Win-NUT code at all...

If you stop the NUT server and start it again 1 min later do you get actually get a Windows notification from Win-NUT when the connection is lost? Do you get actually get a Windows notification from Win-NUT when the connection is restored? I don't think my setup is special. I only get a Reconnection In Progress notification when the connection is lost.

gbakeman commented 5 days ago

It is odd, I've tried simulating the server going down by disabling the network adapter on my PC, but maybe there's some special nuance to actually shutting down the NUT server. I just can't seem to reproduce the issue you've been facing, so I was hoping the extra logging I'd added might help in getting there.

I totally understand the desire for stable software and how it would be uncomfortable running anything else for an extended period of time. If you could consider running this debug build just for a few minutes to grab a log file though, I think that could potentially help here. And sorry, I completely forgot to link it! If this is something you.could do, here's the build in question: https://github.com/nutdotnet/WinNUT-Client/actions/runs/10531599982

yoyoma2 commented 5 days ago

I installed Visual Studio Community and hacked something that made the "Connected" notification appear when the reconnection occurs. I added a log message if we are logging in twice and it was showing up in the log file so I tried a hack to avoid logging in again while reconnecting and it works.

I don't know the code so there must be a better solution to handle the thrown exception but this should provide useful information as to what is happening.

$ git diff WinNUT-Client_Common/UPS_Device.vb
diff --git a/WinNUT_V2/WinNUT-Client_Common/UPS_Device.vb b/WinNUT_V2/WinNUT-Client_Common/UPS_Device
.vb
index 88595e7..ceeb847 100644
--- a/WinNUT_V2/WinNUT-Client_Common/UPS_Device.vb
+++ b/WinNUT_V2/WinNUT-Client_Common/UPS_Device.vb
@@ -159,7 +159,12 @@ Public Class UPS_Device
     End Sub

     Public Sub Login()
+        If IsReconnecting = True Then
+            Return
+        End If
+
         If Not IsConnected OrElse IsLoggedIn Then
+            LogFile.LogTracing("UPS is in an invalid state to login.", LogLvl.LOG_ERROR, Me)

             Throw New InvalidOperationException("UPS is in an invalid state to login.")
         End If
gbakeman commented 5 days ago

It's possible the socket isn't resetting its logged in state when disconnected. I'll take a closer look later today and hopefully have another debug build up for you soon.

yoyoma2 commented 4 days ago
  1. If an ethernet cable is pulled out then quickly plugged back in, the same NUT server instance is still there.
  2. If raspibackup stops the NUT server and starts it an hour later it's a new NUT server instance.

Does the login on reconnect strategy need to handle both scenarios differently or does the NUT server consider any brief lost connection to be logged out? How come my hack of avoiding the second login work at all? To the new server instance Win-NUT has never logged in at all...

BTW the "Connected" toast debug log messages didn't appear in the case where the exception was thrown on double login. With the hack to avoid double login, the toast debug log messages appear (as do the Windows notifications).

Looking forward to your next build. Sorry for the noob questions, I like to learn stuff...

gbakeman commented 4 days ago

No worries at all about the questions, I'm glad you're curious and I appreciate everything you've done so far.

A little background about logging in to NUT servers - it's actually not necessary a vast majority of the time and should only be done under specific circumstances. I'm tracking this info in #106 if you want a closer look. I made some big changes with connection code recently in #163 so that the Login command started being correctly called, and made it so that it's only called in the main WinNUT connection routine as long as the provided username is not empty. This is why I'm confused that your fix worked; the Login attempt is only made at the very end of the connected event logic (so the notifications should have been triggered already), and it shouldn't have interrupted anything anyways. Looking back at your last full log, the point where the notificaiton toast was supposed to appear still shows even though it looks like the client didn't attempt to login when your RPi came back up (also a mystery...)

BTW the "Connected" toast debug log messages didn't appear in the case where the exception was thrown on double login. With the hack to avoid double login, the toast debug log messages appear (as do the Windows notifications).

That's very odd to me. I want to try testing with my own VM instance of NUT running to fully emulate your situation, although I may not be able to get to that for a few days. You also mention this exception which isn't present in your last full log as well - if you could please upload a debug log of the build linked above for me like you were doing before, that should help me get closer to the root of the issue. Thanks again for everything, and please keep asking more questions!

yoyoma2 commented 4 days ago

I've cleared the "Login" text field in the latest pre-release which I use in "production" and the "Connected" notification now appears when the NUT server is restarted. I'll leave it that way for now but I'm still very interested in testing your next build.

The additional log message above (or similar) that I added to the Login subroutine may be a good idea as it would have made identifying that this was login related much easier.

gbakeman commented 4 days ago

I'd really appreciate it if you could upload a recent complete log file with this exception you've been mentioning as that will help me a ton.

yoyoma2 commented 4 days ago

Here's my only debug log from my build that includes one "UPS is in an invalid state to login." message that I added. I then immediatelty added the hack so this exception message never happened again. Although my git pull was from the main branch I had manually added your pending pull request so the toast debug stuff is in there in addition to my change.

WinNUT-Client-2024-09-25.zip

Hope this helps.

gbakeman commented 3 days ago

That helped a lot, thank you! I was finally able to reproduce the issue you were experiencing. It's possible my client wasn't even logging in on my other NUT server which explains why I wasn't seeing the issue. Now that I've setup a test NUT server for logging in, I've been able to understand that process a little better.

It was pretty much exactly as you thought. An exception was being thrown because WinNUT thought it was already logged in. When that happened, further event handlers in WinNUT (like the notification toast) were skipped. Your hack worked because it avoided the exception like you were thinking. I've approached the problem a little differently, in that I have WinNUT properly unset the logged in flag when disconnecting so it will always sync up now and allow WinNUT to log back in when reconnecting. And from what I can tell, the NUT server doesn't care how the client disconnects, so you can always log back in again.

If you could please, I'd like for you to once again test the newest debug build just to confirm that it works as expected. No need to provide any log info if everything is good. Once you can confirm for me, I'll merge and close this issue, then hopefully have a new pre-release up soon.

Thanks again for all of your assistance with this, I'm sorry it ended up being such a simple problem!

yoyoma2 commented 3 days ago

Not easy to notice a failed login when things work anyways without logging in...

It works beautifully! I get the "Connected" notification at the right time and I even checked in the logs that the authentication is successful just before the reconnected message.

09/26/2024 20:03:12 [18028, Nut_Socket]: Authenticated successfully.
09/26/2024 20:03:12 [18028, UPS_Device]: Nut Host Reconnected

It's cool when everything makes sense in the end. I'll use the pre-release build in production when available and reinsert the previously cleared login name. Maybe leave a comment here about the new build when available so I get an email.

Very pleasant cooperating with you.

gbakeman commented 2 days ago

As I merged these changes, I realized that I have the main development branch half-way through implementing a new deployment system for WinNUT. If you can hang on for a little bit longer, I'd like to get this out of the way so future updates are easier.

yoyoma2 commented 2 days ago

No rush. The latest pre-release is doing the job in not logged in mode. When the next pre-release comes out I'll try the logged in mode to be consistent with my linux NUT clients and to test your fix more thoroughly.