sunandap / airhead-research

Automatically exported from code.google.com/p/airhead-research

Too many WorkQueues crashes the jvm #103

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a RowComparator, a WordComparator, and a LinkClustering object.
2. Run link clustering and then call the comparators in a loop.
3. See that you can't create the master thread and the JVM dies.

What is the expected output? What do you see instead?
First, this shouldn't crash.  Second, if the user specified a desired number of 
threads, this shouldn't use any more than that requested number.

I tried solving this problem in another branch but haven't come up with a 
pleasing answer.  Right now, my approach has been to use WorkQueue as a 
singleton class.  Everyone accesses the WorkQueue by calling the static method 
getWorkQueue(num threads), which either creates and saves a new work queue 
for that number of threads or returns an already created WorkQueue, even if it's 
for a different number of threads.  This works pretty well in the general case, 
since the number of threads can now be fixed system-wide by the main class (it 
just calls WorkQueue.getWorkQueue(requestedNumberOfThreads)), but it really clogs 
things up when used in a recursive class like SpectralClustering, or in anything 
that wants to recursively traverse a tree with the recursive calls being run 
inside a WorkQueue. Take this code snippet as an example:

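public void traverse() {
  WorkQueue queue = WorkQueue.getWorkQueue();
  // Register a group of two tasks so that await() knows how many to wait for.
  Object key = queue.registerTaskGroup(2);
  queue.add(key, new Runnable() { public void run() { traverse(); } });
  queue.add(key, new Runnable() { public void run() { traverse(); } });
  // Blocks the calling thread until both tasks in the group have finished.
  queue.await(key);
}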

While trivial, this is essentially what SpectralClustering would do if it 
wanted to parallelize its code internally.  The first N tasks run just fine, 
but once they enqueue their own subtasks, all N worker threads are stuck 
in an await call with no thread left to run those subtasks, and the system 
deadlocks.
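
The same starvation can be shown without WorkQueue, using a plain java.util.concurrent pool. This is only a minimal sketch under stated assumptions: it uses a single worker thread (N = 1) and a timed wait instead of an unbounded await so that it terminates, and the class and method names are hypothetical and not part of the project.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class NestedWaitDemo {
    /**
     * Returns true if the inner task could not start while the outer task,
     * running on the pool's only worker thread, was waiting for it.
     */
    static boolean innerTaskStarved(long timeoutMillis) {
        final ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The outer task occupies the only worker and enqueues a subtask.
                Future<?> inner = pool.submit(() -> { });
                try {
                    // Stand-in for await(): with no timeout this would block forever.
                    inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true; // no free worker was available to run the subtask
                }
            });
            return outer.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("inner task starved: " + innerTaskStarved(200));
    }
}
```

With N workers the picture is the same once N tasks are each blocked waiting on subtasks that sit behind them in the queue.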

Original issue reported on code.google.com by FozzietheBeat@gmail.com on 23 Sep 2011 at 5:19

GoogleCodeExporter commented 9 years ago
This crashing behavior seems really odd, and I can't reproduce it.  The total 
number of threads should be NUM_PROCESSORS^2, which is a reasonable (though 
large) number for any JVM.  

Could you use the new fork/join stuff in Java 7 to do the recursive calling 
you're talking about?
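
As a rough sketch of what that could look like: the recursive traversal becomes a RecursiveTask run in a ForkJoinPool with a fixed parallelism. The Node class, the node-counting work, and all names here are hypothetical stand-ins, not SpectralClustering's actual code; the lambda-free style is kept so it compiles on Java 7.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinTraversal {
    /** Hypothetical binary tree node, used only for this illustration. */
    static final class Node {
        final Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    /** Counts the nodes of a subtree, handling the two children as separate tasks. */
    static final class CountTask extends RecursiveTask<Integer> {
        private final Node node;
        CountTask(Node node) { this.node = node; }

        @Override
        protected Integer compute() {
            if (node == null) return 0;
            CountTask leftTask = new CountTask(node.left);
            leftTask.fork();                                  // queue the left subtree
            int right = new CountTask(node.right).compute();  // do the right one here
            // join() lets this worker run pending tasks instead of only blocking.
            return 1 + right + leftTask.join();
        }
    }

    /** Builds a full binary tree with 2^depth - 1 nodes. */
    static Node fullTree(int depth) {
        return depth == 0 ? null : new Node(fullTree(depth - 1), fullTree(depth - 1));
    }

    static int countNodes(Node root, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new CountTask(root));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countNodes(fullTree(10), 2));
    }
}
```

Because a worker that calls join() can pick up other queued tasks, the recursion completes even with a parallelism of 1, which is the case that starves a fixed pool whose workers block in await.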

Original comment by David.Ju...@gmail.com on 26 Sep 2011 at 5:16

GoogleCodeExporter commented 9 years ago
I've not been able to replicate the crashing behavior reliably, but I have seen 
it in two different sets of code that I've tried using.  This may be due to the 
machines I'm using and less to do with the code.

Having a lot of threads isn't really an issue for the JVM, but it is an issue 
when the user requests X threads and we use NUM_PROCESSORS^2 threads.  Almost 
all uses of the WorkQueue end up using as many threads as are available and, 
for example, ignore the -t option in GenericMain.  We still need some mechanism 
for limiting the number of threads created throughout the code so that our code 
doesn't dominate a shared machine.
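
One possible shape for such a mechanism, sketched with the standard library rather than WorkQueue: the main class creates a single fixed-size pool from the requested thread count and every component submits to it instead of creating its own workers. The class and method names are hypothetical, and the sketch only measures how many distinct worker threads end up running the tasks.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedPoolDemo {
    /**
     * Runs taskCount small tasks on one pool of requestedThreads workers and
     * returns how many distinct worker threads actually executed them.
     */
    static int distinctWorkerThreads(int requestedThreads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(requestedThreads);
        final Set<String> workers = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            // Each task records the name of the worker thread that ran it.
            pool.execute(() -> workers.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return workers.size();
    }

    public static void main(String[] args) {
        // Never more than 3, however many tasks are submitted.
        System.out.println(distinctWorkerThreads(3, 1000));
    }
}
```

A cap like this only holds if no component creates its own pool, and tasks that block while waiting on other tasks in the same pool can still starve it.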

Original comment by FozzietheBeat@gmail.com on 29 Sep 2011 at 9:15