FoldingAtHome / fah-issues

Terminating cores which won't shut down softly causes some to disable assembly optimizations #563

Closed jcoffland closed 9 years ago

jcoffland commented 13 years ago
Trac Data
Ticket 563
Reported by @jcoffland
Status closed
Component FAHClient
Priority 5
Milestone Open Beta Phase 2
Version 7.1

Passing -forceasm could force these cores to continue to use optimizations.

jcoffland commented 13 years ago

Comment by @jcoffland Also, see comments in #124.

jcoffland commented 13 years ago

Comment by @bb30994 You may want to reverse the logic of the option on the Configuration > Advanced tab for "Disable highly optimized assembly code...." As it is written, it is never used. Its purpose was to deal with computers for which x86 SIMD extensions were new. Both 3DNow!+ and SSE tended to overheat computers which were overclocked but "stable" when validation tests excluded SIMD. Some early Pentium / K6-II computers were built with an inadequate heat sink, or they got dusty and nobody ever cleaned them.

Nobody ever intentionally disables SSE any more. In fact, they always add -forceasm so that even if the computer crashes once in a while, Assembly Optimizations are re-enabled automatically.

If you decide not to remove that option, at least add a radio button for -forceasm right next to it, checked by default. Logical choices:

(*) Forceasm (probably always used)
( ) Disable (never used)
( ) Neither (probably never used)

jcoffland commented 13 years ago

Comment by @PantherX Can we bump it to the next version? Otherwise UNI slots will have their optimizations disabled, which isn't correct. The current workaround is to use -forceasm, but that might be too much to ask of a novice user who wants a set-and-forget slot.

jcoffland commented 12 years ago

Comment by @jcoffland Fixed in v7.1.39.

jcoffland commented 12 years ago

Comment by @bb30994 The new wrapper code does not prevent FahCore_78 from disabling optimizations.

jcoffland commented 12 years ago

Comment by @jcoffland Strange. It seems the client is terminating the core immediately rather than waiting for it to shut down. I'm not sure why.

jcoffland commented 12 years ago

Comment by @bb30994 The shutdown messages do vary depending on which core is being killed, and perhaps on whether the client has learned that a core needs to be killed or not. Look at what happens with two different cores:

00:26:29:Server connection id=5 ended
00:26:29:WARNING:Console control signal 0 on PID 2164
00:26:29:Exiting, please wait. . .
00:26:31:FS01:Shutting core down
00:26:31:FS02:Shutting core down
00:26:36:WU02:FS02:0x15:Client no longer detected. Shutting down core
00:26:36:WU02:FS02:0x15:
00:26:36:WU02:FS02:0x15:Folding@home Core Shutdown: CLIENT_DIED
00:26:39:WU00:FS01:0xa4:Client no longer detected. Shutting down core
00:26:39:WU00:FS01:0xa4:
00:26:39:WU00:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
00:27:32:WARNING:FS02:Killing WU02
00:27:32:Clean exit

(Neither of these cores actually disables assembly optimizations, so this comment is not strictly on-topic.)

jcoffland commented 12 years ago

Comment by @jcoffland Something weird is happening though. Even though WU02 exited, it was still killed a minute later, meaning that the client at least thought that something didn't shut down. I should probably print something when the wrapper exits.
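
For illustration only, here is a minimal Python sketch of the "ask softly, wait, then kill" pattern discussed above, with a log line on every exit path so a later kill can be matched against an earlier exit. It is not the FAHClient implementation: `stop_core`, `grace_seconds`, and the log messages are hypothetical names chosen for this example, and `terminate()` stands in for whatever soft-shutdown request the client really sends.

```python
import subprocess
import sys


def stop_core(proc, grace_seconds=60.0, log=print):
    """Ask a child process to stop; kill it only if it is still running.

    Hypothetical helper, not FAHClient code. Returns one of
    "already-exited", "exited", or "killed".
    """
    # If the process is already gone, say so instead of killing it again.
    if proc.poll() is not None:
        log(f"core already exited with code {proc.returncode}")
        return "already-exited"

    # Soft request first (SIGTERM on POSIX; a hard terminate on Windows).
    proc.terminate()
    try:
        code = proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # The core ignored the request within the grace period.
        log("core did not exit in time; killing it")
        proc.kill()
        proc.wait()
        return "killed"

    log(f"core exited with code {code}")
    return "exited"


if __name__ == "__main__":
    # Stand-in "core": a child interpreter that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_core(child, grace_seconds=10.0))
```

Checking `poll()` before killing is the design point here: a process that has already exited is reported as such rather than killed a second time.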

jcoffland commented 12 years ago

Comment by @jcoffland So actually this ticket should not have been reopened, but there may be another issue with shutting down cores.

jcoffland commented 12 years ago

Comment by @bb30994 The issue with shutting down the "old" cores is that they depend entirely on the lifeline. They do not respond to any signals. Until the wrapper can convince the FahCore to SHUT ITSELF DOWN cleanly, the -forceasm problem will keep coming up unless somebody recompiles FahCore_78 with an updated wrapper. (The Pande Group continues to create new projects for FahCore_78, so if there are any plans to deprecate it, they are not being followed.)
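
As a rough sketch of what "depending entirely on the lifeline" can look like, the Python below has a worker poll for the existence of a watched process and shut itself down once that process is gone. This is an assumption about the general pattern, not the FahCore source: `pid_alive`, `run_until_lifeline_lost`, and the callback parameters are hypothetical, and the `os.kill(pid, 0)` existence check is POSIX-only (on Windows, `os.kill` would terminate the target instead).

```python
import os
import time


def pid_alive(pid):
    """POSIX only: signal 0 tests for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def run_until_lifeline_lost(lifeline_pid, do_work_step, save_checkpoint,
                            poll_seconds=1.0, alive=pid_alive):
    """Work in steps while the watched process exists, then stop cleanly.

    Hypothetical sketch: the worker never reacts to signals, only to the
    disappearance of the lifeline process.
    """
    while alive(lifeline_pid):
        do_work_step()
        time.sleep(poll_seconds)
    # Lifeline is gone: save state before exiting.
    save_checkpoint()
    return "CLIENT_DIED"
```

A worker built this way only stops after the watched process has disappeared and the next poll has run, which would be consistent with the several-second gap between "Shutting core down" and "Client no longer detected" in the log earlier in this thread.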

jcoffland commented 12 years ago

Comment by @bb30994 OK. I've retested it and I agree. It now works as it should.