Open GoogleCodeExporter opened 8 years ago
Original comment by Yann.Tre...@gmail.com
on 23 Sep 2010 at 7:37
Original comment by Yann.Tre...@gmail.com
on 14 Jun 2011 at 5:53
This happens for our test setup even if we don't set [assembly:
DegreeOfParallelism(X)] where X is larger than Environment.ProcessorCount.
Original comment by scottle...@gmail.com
on 10 Dec 2011 at 11:38
I think I isolated the problem to some very subtle race condition in
Gallio.Common.Concurrency.WorkScheduler, but only when used recursively, such
as when using lots of Row attributes when the whole TestFixture is
Parallelizable. The Run method would return before all actions in the work set
had fully executed. The problem occurs more often with a faster machine or at
least one with more logical processors.
Anyway, the race condition is so subtle that I couldn't figure out why it was
occurring exactly :) even after quite a few hours of debugging and fiddling
with the code. So I just rewrote the WorkScheduler in a different way. It
fixed my problems and passes the original unit tests.
Unfortunately, I can't prove 100% I fixed the problem and not just hid the
problem with different timings, but it works for me...
Original comment by scottle...@gmail.com
on 13 Dec 2011 at 7:00
Attachments:
Attached is a very simple sample test project that demonstrates the problem.
I've reproduced the problem on 3 different 4-core machines with hyper-threading
enabled by setting DegreeOfParallelism to 8 or above. If the problem doesn't
occur, just increase the number of rows or tests and it should occur eventually.
Original comment by scottle...@gmail.com
on 23 Jan 2012 at 2:12
Attachments:
The patched implementation has some issues.
Suppose that the Run() thread starts up #DOP threads. Then it will itself run
the next action.
Meanwhile, one or more of those other threads might finish up their job.
However, no new work can be scheduled until the Run() thread finishes its
action and notices that it needs to schedule new work. So we get less
utilization of processor cores than we would expect.
The original implementation of WorkScheduler does not have this problem. New
work can be scheduled on worker threads regardless of whether the Run() thread
is currently busy. That's because each worker thread is able to pick up
additional work for itself.
I suggest you attempt a more targeted fix to the termination condition of the
original implementation.
Unfortunately, I can't find the bug. As far as I can tell, the termination
condition of the Run() method guarantees that it will only exit when its work
set has no more pending actions and none are in progress.
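For reference, the design described above (every thread pulls its own next action, and Run() returns only once nothing is pending or in progress) can be sketched roughly as follows. This is a hypothetical illustration, not the Gallio.Common.Concurrency.WorkScheduler source: the names WorkSetSketch, drain, pending and inProgress are invented here. It assumes that actions do not throw and that no new actions are added once Run() has started. It also does not model the recursive case, where nested Run() calls would share one thread budget.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch only; not the actual Gallio WorkScheduler.
public static class WorkSetSketch
{
    // Runs all actions on at most degreeOfParallelism threads,
    // counting the calling thread as one of them.
    public static void Run(IEnumerable<Action> actions, int degreeOfParallelism)
    {
        var pending = new Queue<Action>(actions);
        var gate = new object();
        int inProgress = 0;

        // Every thread, including the caller, keeps pulling work for itself,
        // so a free worker never waits for the Run() thread to hand it work.
        Action drain = () =>
        {
            while (true)
            {
                Action next;
                lock (gate)
                {
                    if (pending.Count == 0)
                        return;
                    next = pending.Dequeue();
                    inProgress++; // counted under the same lock as the dequeue
                }
                try
                {
                    next(); // assumption: actions do not throw
                }
                finally
                {
                    lock (gate)
                    {
                        inProgress--;
                        Monitor.PulseAll(gate); // wake Run() if it is waiting
                    }
                }
            }
        };

        var workers = new List<Thread>();
        for (int i = 1; i < degreeOfParallelism; i++)
        {
            var worker = new Thread(() => drain());
            worker.Start();
            workers.Add(worker);
        }

        drain(); // the calling thread takes part as well

        // Termination condition: leave only when nothing is pending
        // and nothing is still in progress on any thread.
        lock (gate)
        {
            while (pending.Count > 0 || inProgress > 0)
                Monitor.Wait(gate);
        }
    }
}
```

Because the dequeue and the inProgress increment happen under one lock, and the wait loop re-checks both values under that same lock, there is no window in this sketch where an action has left the queue but is not yet counted as in progress. If the real implementation has such a window, that would be one place to look.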
Do you have any idea where things go wrong here?
Original comment by jeff.br...@gmail.com
on 25 Mar 2012 at 12:06
Could it be that any of these actions are throwing ThreadAbortException?
Original comment by jeff.br...@gmail.com
on 25 Mar 2012 at 12:10
Ah, I see what you mean. My implementation definitely underutilizes the
threads, so my original fears are probably true: I just hid the problem with
different timings. I really have no clue where things are going wrong.
I'll investigate if there is any way a ThreadAbortException could be occurring
as well as just make one more pass to see if I can figure out any other
possible problems.
Original comment by scottle...@gmail.com
on 29 Mar 2012 at 5:26
It doesn't appear to be related to a ThreadAbortException. I'm leaning towards
the problem being somewhere in Gallio.Model.Contexts.ObservableTestContext
or Gallio.Framework.Pattern.PatternTestExecutor, but I haven't ruled out the
WorkScheduler.
With the original WorkScheduler I can get all my tests to run by just
commenting out the Dispose call in
ObservableTestContext.HandleParentFinishedBeforeThisContext, but that should
only be called if the parent test context finished and I can't see how that's
happening. Anyone else have any ideas?
Original comment by sle...@xignite.com
on 30 Mar 2012 at 4:22
Is there a workaround for this, or how far off is the fix?
Original comment by mmussm...@gmail.com
on 25 May 2012 at 1:19
The only workaround I have found is to place the tests in their own test
fixture and not parallelize the test fixtures.
I have also improved on this by using Jenkins to run multiple fixtures in
parallel, with each fixture running in a separate job like so:
Gallio.Echo.exe "<MyTestAssembly>.dll" "/f:Type:<FixtureClassName>"
/report-type:xml-inline /runner:IsolatedProcess /verbosity:debug
/report-name-format:TestReport
Obviously this defeats the purpose of parallelism in MbUnit, but we have a few
test methods that run hundreds of test cases from XML, so it works for us: the
test cases run in parallel, which is the important thing for us.
Original comment by tvarg...@gmail.com
on 25 May 2012 at 2:09
This problem is still not fixed.
Using a DOP of 8 yields at best 4 tests running simultaneously. Increasing it
causes random orphaned test failures (8 CPUs available). The only way I was
able to get it to run DOP number of tests without failing was to not use ANY
row, xml or factory tag in any test and shove them all in the same fixture. Why
are fixtures eating up threads anyway?
Original comment by patrickb...@gmail.com
on 18 Sep 2012 at 1:22
Please fix this issue as soon as possible. Thanks
Original comment by raymond....@gmail.com
on 5 Dec 2012 at 1:17
Original issue reported on code.google.com by
justin.w...@gmail.com
on 22 Sep 2010 at 10:09