Closed Hou5e closed 6 months ago
It had the same results for with FAH v8.3.10, on Windows for this issue. The log didn't rotate and the last log entry was 5 hours ago just before 0 UTC while it's still folding.
23:55:38:I1:WU1633:Completed 75000 out of 1250000 steps (6%)
23:55:44:I1:WU1633:Checkpoint completed at step 75000
23:56:38:I1:WU1633:Completed 87500 out of 1250000 steps (7%)
23:57:32:I1:WU1633:Completed 100000 out of 1250000 steps (8%)
23:57:38:I1:WU1633:Checkpoint completed at step 100000
23:58:33:I1:WU1633:Completed 112500 out of 1250000 steps (9%)
23:59:27:I1:WU1633:Completed 125000 out of 1250000 steps (10%)
23:59:33:I1:WU1633:Checkpoint completed at step 125000
*********************** Log Rotated 2024-05-09T00:00:27Z ***********************
Is this from the actual log file or from Web Control's view of it?
Both, that is what was trying to show. That you can't see anything past 0 UTC in the GUI, and the entire log file for both days looks like this: FAH v8.3.10: log-20240509-190042.txt
FAH v8.3.9: log-20240508-191453.txt
I'll try running the Debug version today to see if there are any clues with that version.
Those log files look normal. log.txt
is moved to logs/log-YYYYMMDD-HHMMSS.txt
. The last line is the "Log Rotated" line. Then it should continue with a new log.txt
. Are you saying there's no new log.txt
?
The filenames only look normal because I closed FAH, and restarted it, where the log.txt is rotated properly on FAH startup (this restart happened each day to update to the new version of FAH). If I had left FAH running then the file would still be log.txt
in the main data folder (not logs/log-YYYYMMDD-HHMMSS.txt
), and there would be no new log data after 0 UTC (in the UI or the log file).
In the initial startup of the Windows FAH log header, is the $( __ $)
supposed to be there in the Options:
for the Client and CBang sections?
Example from Client section: Options: $( /TP $) /std:c++14 /nologo /EHa /wd4297 /wd4103 /W1 /MTd
Running with a 10 minute log rotate with the Debug version, I didn't see anything new or interesting. The log just stops at the log rotate like the other examples. I stopped and restarted FAH to get it to rotate and make the filename with the date: log-20240510-085635.txt
Got. That's what I needed to know. So to summarize, on Windows:
log.txt
is not moved.log.txt
stops and it never recovers.log.txt
is rotated into logs/
and logging proceeds normally until the next rotation.Yep, That's a good summary of the issue.
There are some minor odd side-effects seen with the short log rotation. I restarted FAH to get the next log (this one is not interesting): log-20240510-093235.txt
I restarted FAH again and there were ~3 log rotations before 'Finishing' the WU. This was what was at the end. When the WU finished the log was wiped out with the finishing info (not rotated): log-20240510-120305.txt And, the UI looked like this on the PC that 'Finished' the WU, but FAH was idle and should not say 'Disconnected' for Win10-PC:
Note: one of the other 'Disconnected' PCs was running: X99-12, and looked like this locally on that PC:
For FAH v8.3.11 Build183 (Debug) on Windows, there was a useful error in the GUI log messages, and it didn't get saved to the log.txt file. I missed capturing it from the GUI at 0 UTC when I first saw it, but got it with a 10 minute log rotate:
01:39:22:I1:WU1649:Completed 1112500 out of 1250000 steps (89%)
01:41:05:E :Exception: Failed to rename 'log.txt' to 'logs\log-20240512-014105.txt': The process cannot access the file because it is being used by another process.
01:41:05:E :
01:41:05:E : At: src\cbang\os\SystemUtilities.cpp:729:rename()
01:41:05:E :Caught at: src\cbang\event\Event.cpp:137:call()
01:41:06:E :WU1649:Stream error: 3
log-20240512-014106_GUI_Text.txt
For comparison, the same section written to the log file looks like this:
01:39:22:I1:WU1649:Completed 1112500 out of 1250000 steps (89%)
*********************** Log Rotated 2024-05-12T01:41:05Z ***********************
If FAH isn't folding (like it was Finishing a WU) then it can rotate the log: GUI messages from the screen: log-20240512-022400_GUI_Text.txt
Log files of the same events: Part 1: log-20240512-022239.txt Part 2: log-20240512-023044.txt
I finally figured this out. The problem was due to a change in the file handling. If you use the standard C++ std::fstream
on Windows there is no easy way to prevent the file handle from being inherited by subprocesses. So when the client spawns the core subprocess it gets a copy of the log file handle. Later, if the core is still running, if you close the log file and try to rotate it, it fails with an access error because the core process still has an open file handle to the log.
The solution was to implement a custom std::streambuf
which sets the correct flags when opening the file which prevent the file handle from being inherited by subprocesses. This is actually how it used to work in cbang but I tried to simplify the code which lead to several difficult bugs.
For FAH v8.3.11 Alpha (Debug) on Windows, you get 2 error messages on FAH startup where it crashes quickly with:
Last line of the log: 20:49:52:E :Thread 1 deallocated while still active
log-20240513-205115.txt
Also, both Windows installers {Release & Debug} have the debug browse information file included https://github.com/FoldingAtHome/fah-client-bastet/blob/3917839146eac0e30079900cef812224ce35a2be/install/win/fah-client.nsi#L209 that makes the installer ~3x larger. This should probably be put back to the way it was once these issues get resolved. Or, include the .pdb file for the Debug version, and not include it for the Release version.
The uninstaller section of the script should make sure it deletes that file:
Add: Delete "$INSTDIR\FAHClient.pdb"
below this line: https://github.com/FoldingAtHome/fah-client-bastet/blob/3917839146eac0e30079900cef812224ce35a2be/install/win/fah-client.nsi#L346
For FAH v8.3.13 Alpha (Debug) on Windows, it looks like the log rotate is working well and happened at 0UTC. I turned the log rotate on for 10 minute intervals after that, where it has rotated about 50 times while I was testing it.
[It works too well now... 😄 ] I went back to the default once a day log rotate with the quick WUs turned on with FAH v8.3.13 Alpha (Debug) on Windows there was enough log activity to get (2) log rotations at 0UTC (not sure it it was coincidental or if 2 threads are both trying to do the log rotation??). I truncated the 22MB log to the first few and last few entries: log-20240516-235959.txt Second Log at 0UTC: log-20240517-000000.txt
This is unrelated. The rotation triggered one second early due to a rounding error. This caused it to trigger again at the correct time. I found and fixed this issue for the next release.
At 0 UTC (5pm PST), the log rotate is initiated, but FAH v8.3.9 on Windows doesn't move to the new log file. @kbernhagen reported that this also happened for FAH v8.3.8 on Windows.
The log shown below is still at: C:\ProgramData\FAHClient\log.txt And, was not moved and renamed with the date to the 'logs' folder.
End of the log looks like this 3 hours later while still folding:
The web UI looks like this 3 hours later while folding: