Rem0o / FanControl.HWInfo

FanControl plugin to import HWInfo sensors.
MIT License
181 stars 26 forks source link

Error message every time my computer wakes up from sleep #29

Closed chjohans closed 1 year ago

chjohans commented 1 year ago

I get the error message "An unexpected error has occured and has been added to the log file." every single time my system wakes up from sleep. I think it has always thrown this error, exept it did't use to open the FanControl window and show the error message. Previously (a few versions ago) it would just show the error message the next time I opened the FanControl Window. But now it will maximise the window and show eh error message, see the attached screenshot.

image

This is what is logged in the lof file: 14/9/2022 18:31:11: Unhandled exception in FanControl v1.0.0.0 14/9/2022 18:31:11: System.Exception: HWInfo sensors were changed during operation at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 60 at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update() at System.Collections.Generic.List1.ForEach(Action1 action) at FanControl.Domain.ComputerAccessLayer.Update() at FanControl.Domain.ApplicationClock.DoActions() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs) at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

I'm using the HWiNFO plugin. After closing the error message all will work just fine, that is until the next time my computer goes to sleep and wakes up again. At that point, it will throw exactly the same error.

Please consider implementing some sort of "wake up from sleep" handling that will prevent this error situation.

Alternatively, a workaround could be to just throw the error in the log file, and NOT maximize the FanControl window and show an additional error message box.

A compromise could be the have a configurable option, either to show the error message as it's currently doing or to silently log the error message instead.

chjohans commented 1 year ago

Just realized I opened this for the HWiNFO plugin, it should probably have been for the main FanControl program instead. Please let me know if you want me to move/reopen the issue there.

Rem0o commented 1 year ago

The error is from the HWInfo plugin itself, so it belongs in this repository. Seems like HWInfo might not be responding when waking back from sleep, causing the plugin to lose all of its sensors.

chjohans commented 1 year ago

If by "HWinFo not responding" you mean the actual HWiNFO application, then o that is not the case. I have other sources that depend on HWiNFO and they work just fine when my computer is coming out of sleep. And it would be strange if HWiNFO should have this issue after years and years of development and testing.

And, the plugin has not lost its sensors either. When I "click away" the error message, those values that come from HWiNFO are still being updated in the FanControl window.

Would be nice if you could handle "wake from sleep" better, or at least silently log the error without popping up both the FanControl main windows and an error message window.

Rem0o commented 1 year ago

silently log the error without popping up both the FanControl main windows and an error message window.

The error is an explicit throw by the plugin. https://github.com/Rem0o/FanControl.HWInfo/blob/master/HWInfoPlugin.cs @ line 75.

That error comes from a false return from https://github.com/Rem0o/FanControl.HWInfo/blob/d5cb1eed1eb6d0dde8663f5b250539b07719cdd9/HWInfoRegistry.cs#L57

It would be nice to know why/which sensor is null upon waking back from sleep, if that's expected and so on.

chjohans commented 1 year ago

Thank you, much appreciated! I will do some more testing and try to figure this out. Seems strange if one or more of the registry values HWiNFO writes values to would be non-existent upon return from sleep.

It would also help if you had logged the actual sensor (name or index) that caused the error.

I'll do some more testing on this and see what I can find out.

Rem0o commented 1 year ago

It would also help if you had logged the actual sensor (name or index) that caused the error.

Yup that would help out.

ItsMeAubey commented 1 year ago

I too am having this issue.

What sensors are you monitoring @chjohans?

chjohans commented 1 year ago

The sensors I "Report value in Gadget" for in HWiNFO are (the values for those are updated in the windows registry):

If I add or remove sensors from HWiNFO it does not affect this error message, so I am guessing that which sensors are enabled does not make a difference.

Rem0o commented 1 year ago

My guess is that HWINFO doesn't report 0 RPM fans, here your GPU fan.

chjohans commented 1 year ago

Not so, HWiNFO does report 0 RPM fans.

And my GPU fan never goes to 0 RMP. :)

ItsMeAubey commented 1 year ago

@chjohans

The sensor we have in common is our Corsair coolers.

Does yours run off of an internal USB header? My hypothesis for now is that the pump takes a moment to power back on after the PC turns on from sleep, and there is no reading while that happens.

chjohans commented 1 year ago

@ItsMeAubey Yes, it's connected to an internal USB2 header. And, you are definitely on to something.

By default (configurable), HWiNFO polls all sensors every 2000ms, and it updates the shared memory location and the registry values (that this plugin reads) once every such cycle (I got that from the HWiNFO dev in some earlier discussion).

Exactly when HWiNFO would delete a sensor value from the registry I don't know for sure, but it only makes sense that it would happen during such an "update cycle". So even if the Corsair cooler would start a bit delayed, the registry value that this plugin read would not immediately be gone from the registry, and HWiNFO would not know that until it has gone through at least one "update cycle".

I did a quick test and it seems like my pump is starting pretty much instantly once my MB gets power after waking up from an S3 sleep state. Not "scientifically" tested, but by feeling the vibration of the pump I concluded it starts pretty much immediately when the MB gets powered on. But that does not mean that the Corsair iCue sensor senses the pump speed equally fast, I think this is the issue.

I'm using a Corsair H115i PRO RGB AiO cooler, and the pump tach cable is connected to my MB "pump" header for tach sensing, others might have theirs connected to the MB CPU header for the same purpose if their MP lacks a "Pump" header.

So I've just disabled reporting for the "Corsair H115i PRO RGB" Pump sensor in HWiNFO and enabled the "Pump" MB sensor instead. It will report an RPM twice of that the actual pump speed, so I just added a "multiply by 0.5" in my HWiNFO settings for that sensor so it will report the actual pump speed. I could have used the "Pump" sensor as detected by FanControl through the LibreHardwareMonitor library as well, but the problem is it's reporting twice the actual pump speed and this cannot be changed AFAIK - so the solution of letting HWiNFO modify the value of it's "Pump" sensor by "multiply by 0.5" is and then use that is in my opinion much better.

Voila, no more errors in FanControl when waking up from sleep.

Now it would have helped a lot in this and similar cases if the dev could add a bit more logging, eg which sensor is missing, when FanControl encounters such errors. A line of code or two should be all it takes so please consider that. Thanks for making this available for the community, I've already donated some time ago and I will donate again to help support the continued development of FanControl.

ItsMeAubey commented 1 year ago

@chjohans

Interesting discovery. Unfortunately I am using hwinfo to read my loop temperature and that is only available via hwinfo. No way to do it though motherboard.

@Rem0o would it be possible to make fancontrol refresh sensors about a second after waking from sleep? Or make just the HWINFO plugin do that? The error resolves itself if I refresh sensors but until I do that, fancontrol locks it's outputs at the last know value and everything grinds to a halt. I can refresh sensors manually every time but I'm sure you can understand how annoying that is. Maybe you could refresh a few times and then error only if the sensors don't show up again?

Thanks again for this fantastic piece of software.

chjohans commented 1 year ago

@ItsMeAubey This gets a bit messy but this should work. You could use Windows Task Scheduler to schedule two tasks, one that stops FanControl when entering sleep mode, and a second that starts FanControl (a bit delayed) when resuming from sleep.

Stop FanControl on Sleep Create a task in Task scheduler, remember to select "Run with highest privileges" Trigger - "On an event", select Log: System Source: Kernel-Power Event ID: 42 Actions - Program: taskkill.exe Add arguments: /f /im FanControl.exe

Start FanControl on Resume Create a task in Task scheduler, remember to select "Run with highest privileges" Trigger - "On an event", select Log: System Source: Kernel-Power Event ID: 107 - enable "Delay task for" 10 seconds (should be enough, but adjust as necessary) Actions - Program: cmd.exe Add arguments: C start /B FanControl.exe Start in: C:\Program Files\FanControl\

This should stop FanControl every time your computer enters sleep (or hibernate too I believe), and start FanControl again every time it resumes from sleep (or hibernation).

It should, in theory, be a working work-around :)

Oh, I'm curious, what do you use the coolant temperature for? Do you control your radiator fans with FanControl?

ItsMeAubey commented 1 year ago

Oh, I'm curious, what do you use the coolant temperature for? Do you control your radiator fans with FanControl?

Yes. I use coolant temperature and an ambient temperature sensor to do delta-t fan control. My fans are driven based off of the difference in temperature between the loop and the ambient air. No matter what temperature my house is, the fans are adjusted to keep the loop at a maximum of 6c above ambient, with the target being about 5c above ambient. This is the proper way to do water cooling - if you base your fan speed off of CPU temps you totally throw away all of the benefits of water cooling.

See here for a better explanation: https://www.overclockers.com/guide-deltat-water-cooling/

Here's my setup: image

Your workaround is interesting, but I seem to be having the error randomly as well, even when the PC has not gone to sleep at all. I just got one right now as I am writing this.


2022-09-20 4:44:09 PM: System.Exception: HWInfo sensors were changed during operation
   at FanControl.HWInfo.HWInfoPlugin.Update()
   at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update()
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at FanControl.Domain.ComputerAccessLayer.Update()
   at FanControl.Domain.ApplicationClock.DoActions()
--- End of stack trace from previous location ---
   at FanControl.Domain.ApplicationClock.<>c__DisplayClass10_0.<DoActions>b__0()
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)```
Rem0o commented 1 year ago

@chjohans

Now it would have helped a lot in this and similar cases if the dev could add a bit more logging, eg which sensor is missing, when FanControl encounters such errors.

Yup, added that https://github.com/Rem0o/FanControl.HWInfo/commit/78eb830bc0aad1d42acddf0d3d0bdf47148d8af5

ItsMeAubey commented 1 year ago

Neat, I'll try it out and report back!

chjohans commented 1 year ago

@ItsMeAubey Interesting, may I ask what is the source of your Ambient Temperature? Eg what kind of sensor do you use to measure your ambient room temp, and where is it placed?

ItsMeAubey commented 1 year ago

@chjohans

https://www.amazon.com/gp/product/B00CMR38LC

This sensor plugged into a standard two pin header on my asus mobo. It's ziptied right in front of my intake fans at the front of my case, suspended such that it cannot touch any metal.

chjohans commented 1 year ago

@ItsMeAubey Thank you! My MSI motherboard does not have any temp probe inputs. But, I do have both a "Corsair Commander Pro" and a "Corsair Cooling Link Node" laying around somewhere, both have temp probe inputs, and I'm pretty sure HWiNFO support them.

I'll experiment a bit with this, I guess adding a temp sensor for both intake air and chassis air temp will be useful.

FanControl opens up a world of creative options for fan control. :)

Thanks @Rem0o !!!

ItsMeAubey commented 1 year ago

@chjohans

Chassis temp could be cool for managing GPU temps - I base my GPU fan curves on the GPU temp but it would be best to do that based on chassis air temp for an air cooled card.

A commander pro should work perfectly. There's actually a commander plugin for fancontrol i think!

chjohans commented 1 year ago

@ItsMeAubey I have an EVGA RTX 3090, bought their AiO Hybrid cooler for it as well but still running the default air cooler. But when I do switch to the Hybrid cooler it will have to be top mounted, with dual fans push-pull but using "chassis air" so blowing air out on top. If only that loop had a temp sensor as well. Anyhow, I guess I could somehow use the chassis air temp for fan control for the hybrid cooler. I like to over complicate things a bit! :)

ItsMeAubey commented 1 year ago

Looks like the issue is my Asus EC chip that I use for the temp probe.

2022-09-21 9:19:41 AM: System.Exception: HWInfo sensor value went missing from registry: T_Sensor1 - ASUS EC
   at FanControl.HWInfo.HWInfoPlugin.Update()
   at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update()
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at FanControl.Domain.ComputerAccessLayer.Update()
   at FanControl.Domain.ApplicationClock.DoActions()
--- End of stack trace from previous location ---
   at FanControl.Domain.ApplicationClock.<>c__DisplayClass10_0.<DoActions>b__0()
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

It begins to work immediately after I hit refresh sensors in fancontrol - seems like just, disabling the error detection would fix it in my case lol.

chjohans commented 1 year ago

@ItsMeAubey You could just compile it yourself without the error handling.

@Rem0o Could you please reconsider your error handling? The way it is now it actually "breaks everything" if an error is encountered, in an ideal world that would be fine but our world is less than ideal. IMHO it would be better if you just logged the error in the logfile, without popping up an error message. That way FanControl would just silently continue to work in most cases, even with a temporary error like this.

You could have an optional "Show message box and stop working on error" option for those who like it the way it is. :)

ItsMeAubey commented 1 year ago

Agreed. Right now, a tiny blip from a missing sensor grinds everything to a halt when the sensor usually comes back on the next read cycle.

chjohans commented 1 year ago

In my case, the coolant temperature sensor from Corsair iCUE seems to be among the culprits:

22/9/2022 00:14:27: Unhandled exception in FanControl v1.0.0.0 22/9/2022 00:14:27: System.Exception: HWInfo sensor value went missing from registry: CPU Cooler Coolant Temperature - Corsair RGB PRO XT at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 72 at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update() at System.Collections.Generic.List1.ForEach(Action1 action) at FanControl.Domain.ComputerAccessLayer.Update() at FanControl.Domain.ApplicationClock.DoActions() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs) at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler) 22/9/2022 01:07:42: Unhandled exception in FanControl v1.0.0.0 22/9/2022 01:07:42: System.Exception: HWInfo sensor value went missing from registry: CPU Cooler Coolant Temperature - Corsair RGB PRO XT at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 72 at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update() at System.Collections.Generic.List1.ForEach(Action1 action) at FanControl.Domain.ComputerAccessLayer.Update() at FanControl.Domain.ApplicationClock.DoActions() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs) at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

Maybe another strategy could be to try to read the sensor twice or even three times before throwing the error? It would not be critical to miss an update cycle or two, and this would probably solve most of these error conditions.

Idealy this could be solved outside of this plugin, at the source, but I don't see that happening.

chjohans commented 1 year ago

@Rem0o How often do you poll the sensors? And in the case of this plugin, how often do you read the registry values?

Rem0o commented 1 year ago

1hz

chjohans commented 1 year ago

I guess that means one hertz, like in once per second?

Rem0o commented 1 year ago

Yes.

chjohans commented 1 year ago

@Rem0o Would you be willing to do some changes in the error handling to resolve this, I realize that the core issue is not in FanControl or the HWiNFO plugin, but it is what it is and I don't see "the source" doing anything about this.

If not I will fork your code (if you don't mind) and have a go at it, I'm rusty with programming as it has been a while (like years), but I need this to work, and if that's the only way I'll give it a try.

I'm thinking a good strategy could be: Reuse the old sensor value for up to X update cycles (X could be somewhere between 2 - 5) Log a warning each time a sensor value is missing, as long as X update cycles have not been reached Throw the same error as now if X update cycles have been reached and the same sensor is still missing

Thanks!

Rem0o commented 1 year ago

@chjohans I do think some robustness could be added to allow some missed update cycles. Feel free to branch and open a PR.

chjohans commented 1 year ago

@Rem0o I forked your repo and I've made a few changes: 1) The Update() method is now allowed to fail up to five times in a row, each time it fails it will just write a warning with the name of the sensor that failed in the log file. 2) If it fails a 6th time it will throw an error and call close(), eg the current bevaviour.

I ran into two issues:

1) The "result.MissingSensor.Name" you just added seems to always return the last sensor in the registry (eg the one with the highest number), no matter what sensor actually fails (I just simulate sensor failing by disabling individual sensors in HWiNFO). And no matter which sensor I disable (and I verify that the one I disable disappears from the registry) your "result.MissingSensor.Name" contains the last sensor in the registry. Any idea here?

2) I tried to avoid messing with the sensor update logic at all, and when I ignore a failed sensor it seems that the sensor that fails (or I disable) changes its value in FanControl. Any idea?

Maybe this strategy is not a great idea after all, since HWiNFO will reorder the sensor numbers in the registry depending on which sensor fails/disappears. So I am pretty sure FanControl would end up reading wrong sensor values for the wrong sensors until the failed sensor comes back.

So maybe a better strategy would be to call Initialize() if/when a sensor fails, that would then have to be done the first cycle it fails (to avoid the above problems). And if it still fails just throw the error and call it a day. At least this should catch those situations where a sensor just disappears for a very short time.

What do you think?

And in any case, could you please have a look at "result.MissingSensor.Name" as it seems to always be wrong unless it's the very last sensor in the registry that fails.

Rem0o commented 1 year ago

The "result.MissingSensor.Name" you just added seems to always return the last sensor in the registry (eg the one with the highest number), no matter what sensor actually fails (I just simulate sensor failing by disabling individual sensors in HWiNFO).

I just realized the index will change if a sensor is missing, hence why only the last will return null. Let say you had 25 sensors, 1 goes missing, index 1 to 24 will remain, but the sensor they represent won't be the same. Index 25 will then return null. The fail(sensor) I added is meaningless in that sense.

Maybe this strategy is not a great idea after all, since HWiNFO will reorder the sensor numbers in the registry depending on which sensor fails/disappears. So I am pretty sure FanControl would end up reading wrong sensor values for the wrong sensors until the failed sensor comes back.

Yup, that was the main issue why as soon a 1 sensor disappear, I fail the whole thing. Didn't remember that detail. With that in mind, it will be a lot more complicated to have some tolerance for missing sensors.

The current, lightweight index based strategy only works if the sensors present in the registry are constant.

We could reorder the index in our sensors with the new order, but then we would have to reorder again when the sensor comes back, and then we would have to make sure that all original sensors when we first started the plugin are the ones present, and not some new sensor, like if say a user would add a sensor from HWInfo after the fact, which would again mess the order, thus the values.

Else we could just reread every name and value at every cycle, ignore the indexes, and use a hashmap with the sensor names to assign the values. That would completely fix the order issue, but then we would still have to create the logic to account for missing sensors/new sensors added. That solution would obviously cost more CPU cycles at every update which is not ideal, as low CPU usage is a top priority.

chjohans commented 1 year ago

@Rem0o Thanks, I arrived at pretty much the same conclusions as you.

So I changed my strategy to just ignore any missing sensor for one update cycle and just log a warning in the log file. I don't try to log the missing sensor name as that is pretty much impossible to know with the current implementation. If one or more sensors are still missing the second update cycle it will throw an exception like before.

But then I realized that HWiNFO by its default setting has a 2hz update frequency, vs the 1hz update frequency of FanControl. So to give HWiNFO to have a chance to resolve and update a missing sensor I changed my strategy yet again to now ignoring any errors for two consecutive update cycles

This should at least adds some robustness, and catch cases where sensors are missing for up to two seconds. The downside to this is that during those two seconds the reported value for the missing sensor will be wrong. But, at least for me, that's better than if FanControl just stops working and requires a "manual refresh" daily, and sometimes more than once every day.

The best solution would probably be to rewrite the update logic, but that's a bit out of scope for what I can do at the moment.

I will test this for a few days and see if it resolves the errors I'm experiencing, and if it does I'll open a Pull Request so you can decide if you want to implement my changes or not.

chjohans commented 1 year ago

@ItsMeAubey In case you want to try out my changes to see if they help with your issue you can download test.zip from here: https://github.com/chjohans/FanControl.HWInfoTest/releases/tag/0.1

Rem0o commented 1 year ago

@chjohans @ItsMeAubey

Just committed a nice compromise I think that should address most issues. It will invalidate any missing sensor, while keeping other sensors functional. After 10 consecutive update with a missing sensor, it will throw as it did before.

ItsMeAubey commented 1 year ago

@Rem0o

Interesting. I'll give it a shot!

chjohans commented 1 year ago

@Rem0o That sounds like a good solution, will try it out!

ItsMeAubey commented 1 year ago

Looking good so far. No errors and my sensor logic seems to be working fine. I'll torture test it by sleeping and waking a bunch of times and report back.

chjohans commented 1 year ago

@Rem0o I just tried this out, and just disabled reporting value to gadget for one of the sensors. FanControl immediately threw an exception:

23/9/2022` 02:28:06: Unhandled exception in FanControl v1.0.0.0
23/9/2022 02:28:06: System.ArgumentException: An item with the same key has already been added.
   at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func\`2 elementSelector, IEqualityComparer`1 comparer)
   at FanControl.HWInfo.HWInfoRegistry.UpdateValues(HWInfoPluginSensor[] sensors) in C:\projects\fancontrol-hwinfo\HWInfoRegistry.cs:line 66
   at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 74
   at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update()
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at FanControl.Domain.ComputerAccessLayer.Update()
   at FanControl.Domain.ApplicationClock.DoActions()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

When I added reporting for the same sensor again it threw the same exception once again:

23/9/2022 02:29:21: Unhandled exception in FanControl v1.0.0.0
23/9/2022 02:29:21: System.ArgumentException: An item with the same key has already been added.
   at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
   at FanControl.HWInfo.HWInfoRegistry.UpdateValues(HWInfoPluginSensor[] sensors) in C:\projects\fancontrol-hwinfo\HWInfoRegistry.cs:line 66
   at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 74
   at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update()
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at FanControl.Domain.ComputerAccessLayer.Update()
   at FanControl.Domain.ApplicationClock.DoActions()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

This is probably caused by the fact that HWiNFO will reorder the ID for the sensors in the registry when one sensor is failing. The failing (disabled) sensor also immediately changes it's value, in my case I disabled reporting for my pump, the reported speed was 2316 RMP and this immediately fell to 1113 when I disabled reporting for this sensor in HWiNFO. That happens to be the speed of the HWiNFO sensor with the highest ID in the registry.

Also, since this error is thrown I think it never reaches your "if (++_updateFailCount >= 10)" test.

It would also help if you could please add logging of the failed sensor(s) even when (_updateFailCount < 10), eg something like this:

if (++_updateFailCount >= 10)
{
    var names = String.Join(", ", result.MissingSensors.Select(x => x.Name));
    Close();
    throw new Exception($"HWInfo sensor value went missing from registry: {names}");
}
else
{
   var names = String.Join(", ", result.MissingSensors.Select(x => x.Name));
   System.IO.File.AppendAllText("log.txt", Environment.NewLine + DateTime.Now + ": " + $"HWInfo sensor value went missing from registry: {names}");
}

But as I wrote above, when testing the way I did it never reached the if (++_updateFailCount >= 10) test. Which I find a bit puzzling because I basically added exactly the same test previously and it would reach that test just fine.

Now I'm not 100% sure that my test case of disabling reporting for a sensor to the gadget resembles exactly what happens in those cases where HWiNFO is losing a sensor for a moment. But my guess is that it does.

I added the above logging and compiled my own version so I'll leave it running for a while and see what happens if/when the actual error happens again.

Thanks for all your efforts, I will help out with this in any way that I can.

ItsMeAubey commented 1 year ago

I don't think disabling a sensor is equivalent to one failing an update. Disabling the sensor would adjust the index number of every other sensor so not really the same thing as a sensor failing on a single cycle. I don't know anything about how hwinfo works so I could be totally wrong.

I just raised the "EC cycle" (???) setting in hwinfo to 5 with a cycle time of 1000ms (not at PC so don't remember exact names) which artificially makes the sensor appear only 1 in 5 updates or once every 5 seconds. The sensor vanished from Fancontrol but I recieved no error. Maybe try doing that, but set the cycle setting to 11. That should exceed the 10 error max in the plugin and you should see the error pop up?

I'll do it once I'm home if you can't figure it out because of my terrible approximations to the actual setting names LOL!

Rem0o commented 1 year ago

@chjohans my mistake for the dictionary, meant to use Id, not name. Name works on my machine because I don't have any duplicate like you do.

chjohans commented 1 year ago

@ItsMeAubey Not sure about your "EC cycle" logic or what that is, I am curious though. Are those settings exclusive for your EC sensors?

But I'm assuming that when this error occurs, it's because HWiNFO itself loses the sensor. And I assumed that the response to this would be equal to when a sensor is disabled. But it might just be that it does not re-order the IDs in the registry when such an error happens, in which case my test is invalid. It makes more sense if that's the case, as HWiNFO would probably expect the sensor to "come back".

It would help a lot in debugging this if the logging I added above were implemented, then you would see in the logfile any sensor failures that did not throw an exception. Like it is now you will never know if the error actually happened or not, since it's not being logged.

chjohans commented 1 year ago

@Rem0o Aha, that makes sense. May I assume you will change that to use ID as key instead? I have a few duplicate names. :)

Went through all your changes and see that you handle the re-ordering in the registry just fine now. Thumbs up!

But please add some kind of logging of missed sensors like I suggested above, sorry to be a nag. but logging is one of my (many) hangups.... :)

chjohans commented 1 year ago

@Rem0o I see that you made the change from Name to Id. I actually made the same change myself earlier but got exactly the same exception. So waited for you to make the change, which was identical, and I still get the same error when running your latest version.

23/9/2022 05:07:30: Unhandled exception in FanControl v1.0.0.0
23/9/2022 05:07:31: System.ArgumentException: An item with the same key has already been added.
   at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
   at FanControl.HWInfo.HWInfoRegistry.UpdateValues(HWInfoPluginSensor[] sensors) in C:\projects\fancontrol-hwinfo\HWInfoRegistry.cs:line 66
   at FanControl.HWInfo.HWInfoPlugin.Update() in C:\projects\fancontrol-hwinfo\HWInfoPlugin.cs:line 74
   at FanControl.Domain.BackendProviders.Plugin.PluginBackendProvider.Update()
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at FanControl.Domain.ComputerAccessLayer.Update()
   at FanControl.Domain.ApplicationClock.DoActions()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)

Any idea?

Oh, and I made a PR for the additional logging. :)

chjohans commented 1 year ago

@Rem0o Your ID was not unique as both name and label can be identical, that's why the ToDirectory failed for me. I changed this to include unit type as well, which makes it unique, or rather "as unique as it gets". :)

I also improved logging and this time I used your logger object instead of native IO to write to the log.

Great job on the new logic, I have tested this quite extensively and it now works as expected. It handles one or more missing sensors just fine, and logs them as missing in the log file (with my changes). The rest of the sensors continue to work just fine. The missing sensor(s) gets a zero value. After 10 update cycles (10 seconds), it throws an error like before and of course then stops working. It handles that sensors get re-indexed in the registry as well, so great job on that.

I created a Pull Request with my changes, please review. I'm using the Id and not the name for logging/exception as it makes it easier to identify which sensor has failed.

ItsMeAubey commented 1 year ago

No problems on my end so far. FanControl ran all night with no problems.

The only final nitpick I have is that it would be nice if the sensor could hold the last known value instead of being reset to zero. Zero is fine for my ambient temperature sensor but could be disastrous for a GPU etc.

In my PC the iffy sensor is on a 5 minute average, so a couple seconds of 0c is fine, but it may not be OK for other people or applications - GPUs, air coolers, etc.

Great teamwork folks. Thank you both!

chjohans commented 1 year ago

@ItsMeAubey Great to hear, and @Rem0o just merged my changes to his master branch so if you get the newest version you'll get logging of any failed sensors that have been ignored.

@Rem0o I do have to agree with the comment that it probably would have been even better, if instead of invalidating a missing sensor, it could retain the value from the previous update cycle throughout the 10 cycles a sensor is allowed to be "missing" before throwing an exception. One can argue that it's more likely that the actual sensor value is closer to the previous reading than zero, of course depending on the reason it has failed. So instead of invalidating the sensor, do you see an easy way of just keeping the previous value? Maybe it's enough just not to call invalidate() for the missing sensor?

Setting a temp sensor to zero might very well turn off one or more fans, and that is not very desirable. So maybe invalidating sensors of the type RMP but reusing the previous value for sensors of type Temperature could be a compromise? Or just leaving both types at the previous value would be fine too.

In any case, your recent changes make this way more robust so thanks a million! I'll contribute if and where I can.

Rem0o commented 1 year ago

@chjohans look my latest change. It will only null the values if the plugin is closed.