cygnusxi / hfm-net

Automatically exported from code.google.com/p/hfm-net

0.9.1.595 using +/-25% CPU #283

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. install 0.9.1.595

What is the expected output? What do you see instead?
A look at the task manager shows HFM eating up +/- 25% CPU

What version of the product are you using? On what operating system?
0.9.1.595 - can not use, Vista 64 SP2

Please provide any additional information below.
See the folding forum for several similar reports.
In my case running 2xGPU+SMP + SMP on dedicated folder.

Original issue reported on code.google.com by phils1...@gmail.com on 15 Sep 2012 at 1:12

GoogleCodeExporter commented 8 years ago
Thread post here: http://foldingforum.org/viewtopic.php?p=221015#p221015

Original comment by harlam357 on 9 Oct 2012 at 10:15

GoogleCodeExporter commented 8 years ago
We (team 33) investigated this issue recently and found it to be triggered by V7
clients. Top level: problem lies in Connection.cs logic which repeatedly invokes
timer to "check for data received on the socket".

Problems with the logic are as follows (line numbers per r595):
1. Client socket is a blocking socket so, when there is no data on the socket, one
   timer thread gets to sleep on a blocking read here:

   480:   int bytesRead = _stream.Read(_internalBuffer, 0, _internalBuffer.Length);

   but, as timer isn't stopped, we get continued timer hits (invocations of
   SocketTimerElapsed) which consume CPU power without doing any useful work.

2. Concurrency prevention is insufficient in SocketTimerElapsed. It is entirely
   possible (and has been seen in the wild, too) that:

   (a) timer is hit and a thread[1] is allocated to call SocketTimerElapsed callback
   (b) thread[1] starts executing the callback but
   (c) gets rescheduled by the OS after executing condition check at

       433: if (!_updating)

       but before executing

       437: _updating = true;

   This opens a race window for another thread[2] (next timer hit) to execute
   Update() concurrently and possibly out of order which would be disastrous.

3. When connection is closed/reset by the client an exception is raised:

   483: throw new System.IO.IOException("The underlying socket has been closed.");

   but is later caught by SocketTimerElapsed which takes no action in such case.
   So, what happens is: logic keeps on trying to read data from the socket that
   has been shut down by peer (thus consuming CPU power).

4. Timer interval of 1 ms causes even higher CPU usage on Linux. This is due to Mono
   actually being able to provide 1 ms timer resolution whereas Windows .NET
   provides 10 ms resolution.

   In other words, DefaultSocketTimerLength of 1 is practically 10 on Windows
   (100 timer hits per second). On Linux, however, it's 1 so we get close to 1000
   timer hits per second [10x as many -- sic!]
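
Items 2 and 3 above could be addressed along these lines. This is only a minimal sketch, not the code from Connection.cs: the class `GuardedPoller` and its `update` and `stopTimer` delegates are hypothetical names standing in for the real callback, `Update()` and the timer. It replaces the separate check and assignment of `_updating` with one atomic test-and-set, and stops polling once the read reports a closed socket.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stand-in for the timer callback logic; not the r595 code.
public sealed class GuardedPoller
{
    private int _updating;               // 0 = idle, 1 = update in progress
    private readonly Action _update;     // assumed to read from the socket
    private readonly Action _stopTimer;  // assumed to stop the polling timer

    public GuardedPoller(Action update, Action stopTimer)
    {
        _update = update;
        _stopTimer = stopTimer;
    }

    // Returns true if this call ran the update, false if it was skipped
    // because another call was already inside the guarded region.
    public bool OnTimerElapsed()
    {
        // Atomic test-and-set: only the caller that flips 0 -> 1 proceeds,
        // so there is no window between the check and the assignment.
        if (Interlocked.CompareExchange(ref _updating, 1, 0) != 0)
        {
            return false;
        }
        try
        {
            _update();
        }
        catch (IOException)
        {
            // The peer closed the socket: stop polling instead of retrying.
            _stopTimer();
        }
        finally
        {
            Interlocked.Exchange(ref _updating, 0);
        }
        return true;
    }
}
```

A guard like this closes the race window but does not remove the idle timer hits described in items 1 and 4.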

Best way to deal with these issues is dropping the timer logic completely and
creating one, dedicated thread for handling socket input (per Connection).
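
As a rough illustration of the dedicated-thread approach, the sketch below blocks in `Read` on a single background thread and exits when the stream ends or fails. It is not the implementation referred to in this comment: the class name `SocketInputReader`, the `onData` callback and the 1024-byte buffer are assumptions made for the example, and it takes any `Stream` so it can be tried without a live client.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical sketch: one reader thread per connection, no timer.
public sealed class SocketInputReader
{
    private readonly Stream _stream;
    private readonly Action<byte[], int> _onData;
    private readonly Thread _thread;

    public SocketInputReader(Stream stream, Action<byte[], int> onData)
    {
        _stream = stream;
        _onData = onData;
        _thread = new Thread(ReadLoop) { IsBackground = true };
    }

    public void Start() { _thread.Start(); }

    public void Join() { _thread.Join(); }

    private void ReadLoop()
    {
        var buffer = new byte[1024];
        try
        {
            while (true)
            {
                // Blocks until data arrives, so an idle connection costs no CPU.
                int bytesRead = _stream.Read(buffer, 0, buffer.Length);
                if (bytesRead == 0)
                {
                    break; // end of stream: the peer closed the connection
                }
                // Chunks are delivered in order because only this thread reads.
                _onData(buffer, bytesRead);
            }
        }
        catch (IOException)
        {
            // Connection reset or closed: leave the loop instead of retrying.
        }
        catch (ObjectDisposedException)
        {
            // Stream was disposed locally while the read was pending.
        }
    }
}
```

With a single reader per Connection, ordering and concurrency follow from the design rather than from a flag, and the timer interval difference between Mono and Windows no longer matters.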

We've started working on said implementation.

In the meantime, see the attached (and kludgy) diff (against r595) that mitigates
the described issues.

Test code drop is also available: 
http://darkswarm.org/hfm-net-0.9.1.595-tear3.zip

Original comment by kszy...@gmail.com on 18 Apr 2013 at 6:45

Attachments: