Illumina / isaac2

Aligner for sequencing data

isaac-sort-reference: --jobs 1 results in 2400% CPU load? #12

Closed sklages closed 8 years ago

sklages commented 8 years ago

Hi,

I just wanted to create a GRCh38 index for iSAAC (iSAAC-02.16.03.29).

isaac-sort-reference \
--genome-file genome.main_chr.fa \
--jobs 1 \
--output-directory iSAACIndex.k32.main_chr \
--seed-lengths 32

This produces a heavy CPU load. I ran two jobs on an 80-core machine with 1 TB RAM. The CPU load rose steadily past 80, up to an insane 128 ... I killed my jobs because I don't want to overload the machine, and so that my sysadmin won't kill me :-)

--jobs 1 should use just one CPU, shouldn't it? If not, how can I direct isaac-sort-reference to use only a specified number of CPU cores?

Did I miss something?

best, Sven

rpetrovski commented 8 years ago

--jobs controls the number of parallel jobs when running on a grid. Internally, some jobs, such as the neighbor search, will utilize all available cores. For this reason, it is unhelpful to run multiple isaac-sort-reference processes on the same box. When running locally, --jobs 1 is the right thing to use (and is the default); running more than one isaac-sort-reference on the same box (same as with isaac-align) will not improve throughput.

sklages commented 8 years ago

Well, .. how can I restrict isaac-sort-reference to a certain number of CPUs? We have quite a few powerful servers, but these servers are not mine. I cannot just use all the CPU cores available ... I am (not always) alone ... And, to be honest, this behaviour is not very polite :-)

If it takes longer to build an index with, say, just 48 CPUs .. OK, no problem. For me it's more important to have control over the resources a program is going to use ..

So maybe you can consider this a "feature" request for some upcoming version ..

rpetrovski commented 8 years ago

Unfortunately, there are no current plans to accommodate hardware sharing. If you don't mind getting your hands dirty, you can add -j to the invocations of findNeighbors. Mind that there are two instances, on lines 76-100 of make/reference/FindNeighbors.mk: one for mask width 0 (the default, and the most efficient when there is enough RAM; this is the one you are using), and one for other mask widths, used when there isn't enough RAM to generate all 80-mers for the entire genome in memory.
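(For anyone attempting this edit: the following is a rough, hypothetical sketch of how one might patch the makefile programmatically. The sample makefile text and the exact shape of the findNeighbors command lines are made up for illustration; the real FindNeighbors.mk will look different, so inspect the result by hand before using it.)

```python
# Hypothetical sketch: append a "-j N" flag to every findNeighbors invocation
# in a makefile. The sample text below is invented; the real file differs.
import re

def add_jobs_flag(makefile_text: str, jobs: int) -> str:
    """Append '-j <jobs>' after each occurrence of the findNeighbors tool name."""
    return re.sub(r"(\bfindNeighbors\b)", rf"\1 -j {jobs}", makefile_text)

# Invented stand-in for a recipe line in make/reference/FindNeighbors.mk
sample = (
    "$(MASK_FILE): $(GENOME)\n"
    "\t$(BINDIR)/findNeighbors --mask-width 0 -i $< -o $@\n"
)
patched = add_jobs_flag(sample, 16)
print(patched)
```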

Roman.

sklages commented 8 years ago

Thanks for the info. Though I still consider it bad practice for multithreaded software to leave the user with no control over the number of CPU cores it uses in a shared environment :-)

rpetrovski commented 8 years ago

iSAAC-02 is an open source project. If you have a sound proposal on this or any other issue, please put it into a pull request.

sklages commented 8 years ago

I hadn't read that before .. you mentioned that isaac-align works the same way? I cannot specify the number of CPU cores to use?

edit: OK, here I can control the number of threads to use:

-j [ --jobs ] arg (=24)  Maximum number of compute threads to run in parallel

rpetrovski commented 8 years ago

-j works on isaac-align. What I am saying is that it is counterproductive to run multiple competing processes on the same piece of hardware, meaning that you will get your results faster if you run the analyses one by one.

Say you have two runs that would each take 1 hour when run exclusively. Assuming they use the hardware efficiently (i.e. whichever the bottleneck is, IO or CPU, it gets utilized at 100%), running both concurrently gets both finished in about 2 hours. If you run them one by one, first you benefit from getting the results of the first run earlier (1 hour from start instead of 2). Second, you avoid the inefficiencies that come from processes competing for hardware: cache thrashing (both memory and IO), and excessive peak memory demand, which is halved when only one process is working on its data at a time.
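The arithmetic above can be put into a toy model (the contention overhead factor below is an invented illustrative number, not a measurement; the source only claims "about 2 hours" for the concurrent case):

```python
# Toy model: sequential vs. concurrent scheduling of two 1-hour jobs.
job_hours = 1.0
contention_overhead = 1.2   # assumed slowdown from cache thrashing etc. (made up)

# Sequential: first result after 1h, both done after 2h.
seq_first = job_hours
seq_both = 2 * job_hours

# Concurrent: nothing finishes until both do, and contention adds overhead.
conc_both = 2 * job_hours * contention_overhead

print(f"sequential: first result at {seq_first}h, both done at {seq_both}h")
print(f"concurrent: both done at {conc_both}h, nothing finished earlier")
```

Even with zero contention overhead, the sequential schedule delivers the first result a full hour earlier; any overhead only widens the gap.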

sklages commented 8 years ago

OK, got it. I won't run different aligner jobs on the same machine, but I need to make sure that they use only a certain number of CPUs. The scenario you describe applies to most I/O-intensive jobs, e.g. bcl2fastq. But the "benefit" is not always speed. E.g. when creating an index I usually don't care whether it takes 12h or 24h, I don't do that every day .. But I need control.

We use a compute cluster here with a lot of high-memory 80-core servers. Jobs always run on a single node; they never get distributed across multiple nodes. If a program "occupies" all CPUs on a server with an I/O-intensive job, it probably won't actually use all of them, since I/O is the bottleneck, and thus it blocks the machine for other users. Other users may be running memory-intensive jobs or doing number crunching with CPU-bound jobs. I don't know, .. the worst-case scenario would be another user also running an I/O-bound job on multiple CPUs on the same machine .. it may happen ..

"benefit" here is that I can control the resources to be used as I know my environment best :-)
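(An external workaround, not an iSAAC option: on Linux you can clamp a process, and the children that inherit its CPU mask, to a fixed set of cores from outside the program. A minimal, Linux-specific sketch using the standard `os.sched_setaffinity` call:)

```python
# Linux-specific sketch: restrict the current process (and any children it
# spawns, which inherit the affinity mask) to the first N CPU cores.
# This limits any tool you launch afterwards, regardless of its own options.
import os

def limit_to_cores(n: int) -> None:
    """Pin the current process to CPUs 0..n-1."""
    os.sched_setaffinity(0, set(range(n)))

limit_to_cores(1)
print(len(os.sched_getaffinity(0)))   # number of CPUs we may now run on
```

From a shell, `taskset -c 0-15 <command>` achieves the same effect without any code changes.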

Coming back to isaac-sort-reference .. I am now using all CPUs for all four jobs on four different machines. From time to time the CPU load goes up to 40-80 cores, but half of the time it does not, so I assume I/O is the bottleneck. Knowing this, I probably would have started these jobs with no more than 16 cores ... (if I could) ..

just my 2p, now it's running ..

sklages commented 8 years ago

Oops, .. I wasn't aware that this had already been discussed in #2 ... I haven't worked with isaac2 for quite a while ... :-)