CDieterich closed this issue 6 years ago
Example: right now we have 3 JACUSA instances running with -p 10
on a 40 core machine. However, all 40 cores are full and the load is >80.
When we stopped other, non-Java processes, JACUSA directly took over the freed CPUs. Right now I am not sure if JACUSA does respect the CPU limit at all (it seems to take all available cores).
Update: one of the three JACUSA processes just finished, the two other processes evenly use ~20 CPUs now, again claiming all available cores.
Does JACUSA2 constantly claim all available cores? I could not reproduce this, at least with "srun --pty bash". When JACUSA2 is started I noticed that CPU usage peaks... I suspect that htsjdk is responsible for this, but I cannot fix it right away.
SLURM should have no influence here since we also don't enforce the user-set CPU limit. What I can observe, however, is that JACUSA constantly runs with > 20 CPUs while I have specified -p 10. It's a 64 CPU machine with 3 instances of JACUSA (all running -p 10), and each of the three instances takes > 20 CPUs.
FROM: http://broadinstitute.github.io/picard/faq.html
Q: Why does a Picard tool use so many threads?
A: This can be caused by the garbage collection (GC) method of Java when used on 64 bit Java. By default the JVM switches to 'server' settings when on 64 bit, which automatically implements parallel GC and will use as many cores as it can get its hands on. To get around this, we define the number of threads we allow Java for GC by specifying -XX:ParallelGCThreads=
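To check whether the FAQ's explanation applies to a given setup, it may help to ask the JVM which collectors it selected and how many cores it sees. The following is only a minimal diagnostic sketch using the standard `java.lang.management` API; the class name `JvmGcReport` is made up for illustration and is not part of JACUSA or Picard.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of JACUSA or Picard.
class JvmGcReport {
    // Names of the garbage collectors the running JVM actually selected.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // JVM command-line flags that explicitly set a GC thread count.
    static List<String> gcThreadFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:ParallelGCThreads") || arg.startsWith("-XX:ConcGCThreads")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Number of cores the JVM believes it may use.
        System.out.println("Cores visible to the JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors in use: " + collectorNames());
        System.out.println("GC thread flags set: " + gcThreadFlags());
    }
}
```

If the last line prints an empty list, no GC thread limit was passed on the command line and the JVM chose its own default.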
So in essence we would have to set the number of GC threads, add the number of threads we wish JACUSA to run on, and set the sum as the SLURM-allocated CPU number?
I haven't checked the default settings for Java on the cluster, but I think the default GC pauses the application threads while doing its job. But this needs to be tested.
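The arithmetic proposed here can be sketched as follows. This is an illustration under assumptions, not a tested recipe: the class `CpuBudget`, the jar name and the thread counts are hypothetical, and whether the sum or only the larger of the two numbers is the right bound depends on whether the collector pauses the application threads while it runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of JACUSA.
class CpuBudget {
    // CPUs to request from the scheduler: worker threads plus GC threads
    // (the conservative sum suggested in the question above).
    static int cpusToRequest(int workerThreads, int gcThreads) {
        if (workerThreads < 1 || gcThreads < 1) {
            throw new IllegalArgumentException("thread counts must be >= 1");
        }
        return workerThreads + gcThreads;
    }

    // Builds a java command line that caps the parallel GC threads.
    // toolArgs are passed through unchanged (for example "-p", "10").
    static List<String> javaCommand(String jar, int gcThreads, List<String> toolArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-XX:ParallelGCThreads=" + gcThreads);
        cmd.add("-jar");
        cmd.add(jar);
        cmd.addAll(toolArgs);
        return cmd;
    }

    public static void main(String[] args) {
        int workers = 10; // value given to -p in this thread
        int gc = 4;       // assumed GC thread cap, chosen only for illustration
        System.out.println("CPUs to request: " + cpusToRequest(workers, gc)); // prints 14
    }
}
```

With 10 worker threads and an assumed cap of 4 GC threads, the allocation to request would be 14 CPUs.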
-- Michael Piechotta
Hi Michael,
thread management does not seem to work as expected for BETA12. The option
-p,--threads use # THREADS
default: 1
does not seem to take effect, and JACUSA takes all threads on the machine.