Open cjfields opened 5 years ago
Hi,
Right now, each subassembly step using idba is specified to use 2 threads as a hard-coded constant. I think your measure of scaling back the number of threads given to Athena relative to the resource limit, so that the peak number of active threads stays below that limit, is a good approach for now. Are you observing significant performance degradation without doing that?
Thanks, alex
Hi @abishara, we were seeing the server running with a very high load, which appeared to degrade overall performance. We're also running this under SLURM using cgroups, which can restrict resource use as well, though we are utilizing the entire node.
Hmm, is the solution you suggested working fine for now? I guess I'm a little unclear on what you would like me to do. I'm open to a suggestion if you have one now that I have explained the implementation detail.
Thanks, alex
The workaround is fine for now, but I think the best fix would be for Athena to launch the various subassembly processes in a manner that scales to the threads set in the command line call. Otherwise I expect other parallel steps can't take full advantage of the set number of threads. Not sure how easy this would be to implement though.
I agree this is ideal, but I don't really see a simple solution for how to implement this at the moment. Right now idba is invoked as a black box and I don't really have fine-grained control over how it uses threads.
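For illustration only, here is a minimal sketch of one way a launcher could stay within a thread budget even when each subassembly is a black box with a fixed thread count: cap the number of concurrent subprocesses at the budget divided by the per-job thread count. This is not Athena's code. The names `max_concurrent_jobs` and `run_jobs` are hypothetical, the 2 threads per job comes from the hard-coded constant mentioned above, and the sketch ignores the Python processes that also use roughly one core each.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def max_concurrent_jobs(total_threads, threads_per_job=2):
    """How many jobs can run at once without exceeding the thread budget.

    threads_per_job=2 mirrors the hard-coded value described in this thread.
    Always allows at least one job so that work can proceed.
    """
    return max(1, total_threads // threads_per_job)


def run_jobs(commands, total_threads, threads_per_job=2):
    """Run each command as a subprocess, capping concurrency by the budget.

    `commands` is a list of argument lists (hypothetical stand-ins for the
    real subassembly invocations). Returns the exit codes in input order.
    """
    workers = max_concurrent_jobs(total_threads, threads_per_job)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each pool thread only waits on its child process; the child does the work.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

With a budget of 12 threads and 2 threads per job, this would run at most 6 subassembly processes at a time.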
Agreed. I suspect the easiest solution may be to use the cluster-based approach (e.g. use SLURM to submit these as jobs).
I have noticed when running Athena with multiple threads that the spawned idba_subasm processes each run with two threads.
The Python processes, which I am guessing handle the output, each actively use roughly one core.
If we try to automate this a bit and run it on a cluster where the number of threads/cores can be set dynamically per job (for example SLURM), this leads to load problems. For example, on a small 12 core node this launches Athena with 12 threads. However, when I've monitored this via top there are 12 processes running (a mix of idba_asm and python), with the idba_subasm processes requesting more threads than are available, roughly 30-40% more, which is causing issues with load.

For the time being we're simply reducing the threads in the athena-meta call to about 2/3 of what is expected (8 threads for a 12 core node). But is there a way to account for this within Athena?
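The 2/3 workaround can be sketched as a small helper that a job script might call before launching Athena. This is an assumption-laden illustration, not part of Athena: `scaled_threads` and `allocated_cores` are hypothetical names, and the core count is read from SLURM's `SLURM_CPUS_PER_TASK` environment variable (set when `--cpus-per-task` is requested), falling back to the local CPU count.

```python
import os


def scaled_threads(cores, numerator=2, denominator=3):
    """Scale the core count down to roughly 2/3, never below 1.

    Integer arithmetic avoids floating point rounding surprises.
    """
    return max(1, (cores * numerator) // denominator)


def allocated_cores():
    """Cores available to this job (assumption: SLURM sets SLURM_CPUS_PER_TASK)."""
    value = os.environ.get("SLURM_CPUS_PER_TASK")
    if value:
        return int(value)
    # Outside SLURM, or when the variable is unset, use the local CPU count.
    return os.cpu_count() or 1


if __name__ == "__main__":
    # Print the thread count to pass on to the Athena call.
    print(scaled_threads(allocated_cores()))
```

For the 12 core node described above, `scaled_threads(12)` gives 8, matching the value we are currently setting by hand.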