sp00n / corecycler

Script to test single core stability, e.g. for PBO & Curve Optimizer on AMD Ryzen or overclocking/undervolting on Intel processors

Update check fails #83

Open moyushang opened 1 month ago

moyushang commented 1 month ago

Starting CoreCycler v0.9.6.2...
Press CTRL+C to abort
FATAL ERROR: Cannot index into a null array.
Line Number: 11189

You can find more information in the log file:
Y:\CoreCycler-v0.9.6.2\logs\CoreCycler_2024-08-25_05-07-59_PRIME95_SSE.log
When reporting this error, please provide this log file.
Press Enter to exit:

CoreCycler_2024-08-25_05-07-59_PRIME95_SSE.log

sp00n commented 1 month ago

Huh. You could try to disable the update check in the config.ini by setting enableUpdateCheck = 0 within the [Update] section.

Do you maybe need a proxy/VPN to access GitHub? The script tries to access https://api.github.com to check for an update, but should "fail successfully" (i.e. without a FATAL ERROR) when it can't reach it. It may be freaked out by some setting though. Disabling the check also disables the web query.
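To illustrate what "failing successfully" could look like, here is a minimal Python sketch (CoreCycler itself is a PowerShell script, so this is not its actual code). The releases endpoint, the `tag_name` lookup, and the function names are assumptions for illustration; the thread only names https://api.github.com. The point is that both an unreachable host and an empty or unexpected response end in "no update information" instead of an unhandled error.

```python
import json
import urllib.request

# Assumed endpoint: the thread only mentions https://api.github.com.
RELEASES_URL = "https://api.github.com/repos/sp00n/corecycler/releases/latest"


def fetch_latest_release(url=RELEASES_URL, timeout=5):
    """Return the raw response body as text (raises on network errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def latest_version(fetch=fetch_latest_release):
    """Return the latest release tag, or None if it cannot be determined."""
    try:
        payload = json.loads(fetch())
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses,
        # malformed JSON or bad encoding raises a ValueError subclass.
        return None
    if not isinstance(payload, dict):
        # Guards against indexing into a null/empty response.
        return None
    tag = payload.get("tag_name")
    return tag if isinstance(tag, str) and tag else None
```

Passing the fetch function in as a parameter makes it possible to simulate an unreachable domain without touching the network, which is the kind of test the thread asks for.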

Let me know if it works after that. There's not much feedback from running on a Chinese OS yet.

moyushang commented 1 month ago

Oh! Thanks, it works. Sometimes I am lucky enough to reach GitHub, but most of the time I can't access it.

sp00n commented 1 month ago

Reopening this as a reminder for myself that I need to run more tests on what happens if the domain cannot be reached.

moyushang commented 1 month ago

I also encountered another problem. The Prime95 icon turns red and CPU Core 3 usage drops to 0%. After a while, CoreCycler still shows no error, but when I open the log file, there are errors in it.

CPU: 5800X
Max CPU Boost Clock Override: -200 (4.65 GHz)
All Cores: -18
CoreCycler only tested the best cores (gold star in Ryzen Master).

CoreCycler_2024-08-26_01-09-09_PRIME95_SSE.log

sp00n commented 1 month ago

I do see an error in the log file at the end:

              +++ 04:31:54 - Checking for stress test errors
              +++            Checking the new Prime95 log entries...
              +   04:31:54
              +   Found an error in the new entries of the results.txt!
              +   There has been an error while running the stress test program!
              +   Error type: CALCULATIONERROR
ERROR: 04:31:54
ERROR: There has been an error while running Prime95!
ERROR: At Core 3 (CPU 6)
ERROR MESSAGE: FATAL ERROR: Rounding was 0.5, expected less than 0.4
              +   There was an FFT size provided in the error message, use it.
ERROR: The error happened at FFT size 3360K
              +++ Adding Event Log entry: core_error
              +++ Adding the Windows Event Log entry:
              +++ Error on Core Core 3 (CPU 6)!

There has been an error while running Prime95!
Hardware failure detected running 3360K FFT size
Error Type: CALCULATIONERROR
              +   There has been some error in Test-StressTestProgrammIsRunning, checking (#1)
              +   Trying to close the stress test program to re-start it
              +   Trying to close the stress test program
              +   Trying to close Prime95
              +   Resuming threads for process: 12596 - prime95
              +++            ID: - 7132 ok - 9340 ok - 10400 ok - 10384 ok - 8624 ok - 8460 ok
              +   Trying to gracefully close Prime95
              +++ The window process main window handle: 459642
              +++ Try 1
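For readers who want to check a log file for this kind of entry themselves, here is a minimal Python sketch that pulls the `ERROR:` lines and the reported FFT size out of log text. It assumes the line format shown in the excerpt above; the function name and the sample string are made up for illustration, and real log files may contain other formats.

```python
import re

# Sample lines copied from the excerpt above.
SAMPLE_LOG = """\
              +   Error type: CALCULATIONERROR
ERROR: 04:31:54
ERROR: There has been an error while running Prime95!
ERROR: At Core 3 (CPU 6)
ERROR MESSAGE: FATAL ERROR: Rounding was 0.5, expected less than 0.4
              +   There was an FFT size provided in the error message, use it.
ERROR: The error happened at FFT size 3360K
"""

# Matches lines that start with "ERROR:" or "ERROR MESSAGE:".
ERROR_LINE = re.compile(r"^ERROR(?: MESSAGE)?:\s*(.*)$")
FFT_SIZE = re.compile(r"FFT size (\d+K)")


def extract_errors(log_text):
    """Return (list of error messages, last reported FFT size or None)."""
    errors = []
    fft_size = None
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line)
        if not match:
            continue  # indented verbose lines are skipped
        errors.append(match.group(1))
        fft = FFT_SIZE.search(match.group(1))
        if fft:
            fft_size = fft.group(1)
    return errors, fft_size


errors, fft_size = extract_errors(SAMPLE_LOG)
```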

I also see errors while trying to suspend/resume the threads of Prime95 starting at 04:13:33, which may have been a precursor of the "true" error:

              +++ 04:13:33 - Suspending the stress test process for 1000 milliseconds
              +   Suspending threads for process: 12596 - prime95
              +++            ID: - 7132 failed!              +++ Error Code:    156
              +++ Error Message: The recipient process has refused the signal.
 - 9340 failed!              +++ Error Code:    156
 [...]

sp00n commented 1 month ago

According to the log file, this happened on iteration 4 of 10.

Interestingly, the log file doesn't seem to be complete, i.e. it ends early when trying to close Prime95. There should still be messages after that, but here there aren't.

The only times I've seen this happen so far are:
a) when the PC crashes and the disk write buffer is not fully flushed to the drive
b) when there's an access violation error in .NET or PowerShell (which should show up in the Windows Event Log)
c) when the folder CoreCycler is running in is part of a synchronization program (like OneDrive, Dropbox, etc.)

moyushang commented 1 month ago

b: Where should I look?

a: Uncertain (but there was no blue screen, Windows was still running).

c: I don't have Dropbox installed, and I disabled OneDrive after installing Windows.

At around 04:31:54 I manually closed Prime95, and a few seconds later I closed CoreCycler.

sp00n commented 1 month ago

a) Yeah, that only happens when the PC reboots.
b) I seem to have deleted my old Windows Event Log entries, but it should show up under Windows Logs\Application as an Error entry.

The previous errors starting at 04:13:33 in the log when trying to suspend and resume the process threads are "156 - ERROR_SIGNAL_REFUSED", indicating that it somehow couldn't access Prime95 anymore. I currently do not treat this as a "real" error and just ignore it, but it could very well be connected with a hanging or partly crashed process (or something "claiming" the process, like e.g. a synchronizing tool, which is why I mentioned this).
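As a rough way to see how often these refused suspend/resume attempts occur and which thread IDs are affected, the failed entries can be collected from the log text. The Python sketch below assumes the "failed! ... Error Code" layout shown in the earlier excerpt; the function name and the lookup table are illustrative, and only code 156 (the Windows system error ERROR_SIGNAL_REFUSED) is mapped.

```python
import re

# Lines copied from the suspend/resume excerpt earlier in the thread.
SAMPLE_LOG = """\
              +++            ID: - 7132 failed!              +++ Error Code:    156
 - 9340 failed!              +++ Error Code:    156
"""

# Only the code discussed in this thread is mapped.
KNOWN_CODES = {156: "ERROR_SIGNAL_REFUSED"}

# \s* also spans line breaks, so the code may sit on the following line.
FAILED_THREAD = re.compile(r"-\s*(\d+)\s+failed!\s*\+\+\+\s*Error Code:\s*(\d+)")


def failed_threads(log_text):
    """Map each failed thread ID to the name of its error code."""
    result = {}
    for thread_id, code in FAILED_THREAD.findall(log_text):
        code = int(code)
        result[int(thread_id)] = KNOWN_CODES.get(code, f"UNKNOWN_{code}")
    return result


failures = failed_threads(SAMPLE_LOG)
```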

You could try to update to the 0.10.0.0 alpha version, where I did some refactoring, and/or you could also switch the suspend & resume method from Threads to Debugger by setting modeToUseForSuspension within the [Debug] section, which then doesn't try to suspend the threads individually, but the whole main process instead. Which might be helpful or not.
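The two settings mentioned in this thread would look roughly like the fragment embedded in the Python sketch below, which also reads them back. This assumes the fragment follows plain INI syntax; the real config.ini contains many more entries and may not parse cleanly with `configparser`, so treat it as an illustration of the section and key names only.

```python
import configparser

# Only the two settings named in this thread; everything else is omitted.
CONFIG_FRAGMENT = """
[Update]
enableUpdateCheck = 0

[Debug]
modeToUseForSuspension = Debugger
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep the camelCase key names as written
parser.read_string(CONFIG_FRAGMENT)

# "0" is interpreted as False by getboolean().
update_check_enabled = parser.getboolean("Update", "enableUpdateCheck")
suspension_mode = parser.get("Debug", "modeToUseForSuspension")
```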

moyushang commented 1 month ago

OK, thanks. I will try the "v0.10.0.0 alpha" version.

In addition, I only found these in the logs.

[Screenshot 2024-08-26 220521] [Screenshot 2024-08-26 220511]

sp00n commented 1 month ago

VSS is Volume Shadow Copy, I have these entries as well. The other is a CoreCycler entry, which I added specifically for cases when e.g. the log file is being corrupted. Or for a general quick overview of any errors that happened. Unfortunately it does not tell us what happened at the end of the log file.