dieterich-lab / JACUSA2

New version of JACUSA -> 2.0
GNU General Public License v3.0

Threads gone mad #4

CDieterich closed this issue 6 years ago

CDieterich commented 6 years ago

Hi Michael,

Thread management does not seem to work as expected in BETA12. The option "-p,--threads" (use # THREADS, default: 1) does not seem to take effect, and JACUSA takes all threads on the machine.

tjakobi commented 6 years ago

Example: right now we have 3 JACUSA instances running with -p 10 on a 40 core machine. However, all 40 cores are full and the load is >80.

When we stopped other, non-Java processes, JACUSA immediately took over the freed CPUs. Right now I am not sure whether JACUSA respects the CPU limit at all (it seems to take all available cores).

tjakobi commented 6 years ago

Update: one of the three JACUSA processes just finished. The two remaining processes now use ~20 CPUs each, again claiming all available cores.

piechottam commented 6 years ago

Does JACUSA2 constantly claim all available cores? I could not reproduce this, at least not with "srun --pty bash". When JACUSA2 is started, I noticed that CPU usage peaks. I suspect that htsjdk is responsible for this, but I cannot fix it right away.

tjakobi commented 6 years ago

SLURM should have no influence here since we also don't enforce the user-set CPU limit. What I can observe, however, is that JACUSA constantly runs with > 20 CPUs although we specified -p 10. It's a 64 CPU machine with 3 instances of JACUSA (all running -p 10), and each of the three instances takes > 20 CPUs.

piechottam commented 6 years ago

From http://broadinstitute.github.io/picard/faq.html:

Q: Why does a Picard tool use so many threads?

A: This can be caused by the garbage collection (GC) method of Java when used on 64 bit Java. By default the JVM switches to 'server' settings when on 64 bit, which automatically implements parallel GC and will use as many cores as it can get its hands on. To get around this, we define the number of threads we allow Java for GC by specifying -XX:ParallelGCThreads=. An alternative approach is to turn off Parallel GC by specifying -XX:+UseSerialGC. However, we found this process to be sub-optimal since a full GC sweep is the only type performed, which seems to take much longer than parallel GC. In many cases, it is not required (parallel GC employs ~7 different types of GC). See here for further details of the tunable parameters.
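
To put a number on "as many cores as it can get its hands on", here is a rough sketch in Java. It assumes the commonly described HotSpot default for the parallel GC worker count (all CPUs up to 8, then 5/8 of each additional CPU); this heuristic is an assumption, not something confirmed in this thread, and other JVMs or versions may behave differently. The class and method names are made up for illustration.

```java
/** Rough estimate of the default number of parallel GC worker threads. */
final class GcThreadEstimate {

    // Assumption: the commonly described HotSpot heuristic. Up to 8 CPUs,
    // one GC thread per CPU; above that, 5/8 of each additional CPU.
    static int defaultParallelGcThreads(int cpus) {
        if (cpus < 1) {
            throw new IllegalArgumentException("cpus must be >= 1");
        }
        if (cpus <= 8) {
            return cpus;
        }
        return 8 + (cpus - 8) * 5 / 8; // integer division, rounds down
    }

    public static void main(String[] args) {
        // Machine sizes mentioned in this thread.
        for (int cpus : new int[] {40, 64}) {
            System.out.println(cpus + " CPUs: about "
                    + defaultParallelGcThreads(cpus) + " GC threads per JVM");
        }
    }
}
```

Under that assumption, a single JVM on the 64 CPU machine would start about 43 GC threads regardless of -p, which would be in line with each instance using more than 20 CPUs.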

tjakobi commented 6 years ago

So in essence we would have to set the number of GC threads, add the number of threads we want JACUSA to run on, and set the sum as the SLURM-allocated CPU number?
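
The arithmetic in that question could be sketched as follows. This is only an illustration: the class and method names are hypothetical, the value 4 for the GC threads is an arbitrary example, and --cpus-per-task is the standard SLURM option assumed here for the allocation.

```java
/** Sketch of the CPU budget: JACUSA worker threads plus JVM GC threads. */
final class CpuBudget {

    /** CPUs to request from the scheduler for one JACUSA instance. */
    static int cpusToRequest(int workerThreads, int gcThreads) {
        if (workerThreads < 1 || gcThreads < 1) {
            throw new IllegalArgumentException("thread counts must be >= 1");
        }
        return workerThreads + gcThreads;
    }

    /** JVM flag that caps the number of parallel GC threads. */
    static String gcFlag(int gcThreads) {
        return "-XX:ParallelGCThreads=" + gcThreads;
    }

    /** SLURM option for the resulting allocation (assumed: --cpus-per-task). */
    static String slurmOption(int cpus) {
        return "--cpus-per-task=" + cpus;
    }

    public static void main(String[] args) {
        int workerThreads = 10; // the -p value used in this thread
        int gcThreads = 4;      // arbitrary example value
        System.out.println("JVM flag:     " + gcFlag(gcThreads));
        System.out.println("JACUSA flag:  -p " + workerThreads);
        System.out.println("SLURM option: "
                + slurmOption(cpusToRequest(workerThreads, gcThreads)));
    }
}
```

The sum is an upper bound: if the collector pauses the application threads while it runs, worker and GC threads would not be busy at the same time.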

piechottam commented 6 years ago

I haven't checked the default settings for Java on the cluster, but I think the default GC pauses the application threads while doing its job. This needs to be tested, though.
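
One way to check the defaults on a given machine is to ask the JVM itself. The following is a minimal sketch that uses only the standard java.lang.management API; the class name is made up for illustration. It should be run with the same java binary and options that are used for JACUSA.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints which garbage collectors and JVM flags are active in this JVM. */
final class JvmGcReport {

    /** Names of the garbage collectors registered in the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Number of CPUs the JVM believes it may use.
        System.out.println("CPUs visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        // Collector names reveal which GC was selected by default.
        System.out.println("Collectors: " + collectorNames());
        // Flags passed on the command line, e.g. -XX:ParallelGCThreads=...
        System.out.println("JVM flags: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

The collector names show which GC the JVM picked by default. Typically, names starting with "PS" indicate the parallel collector and "Copy" / "MarkSweepCompact" the serial one.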
