Illumina / isaac2

Aligner for sequencing data

isaac-sort-reference: --jobs N is ignored #2

Status: Closed (closed by sklages 9 years ago)

sklages commented 9 years ago

Hi,

isaac-sort-reference --genome-file $(pwd)/genome.fa --jobs 1 --seed-lengths 32 --output-directory isaac2_hg19 uses all CPU cores on the machine it runs on. On our 48-core server the load is constantly above 52, and at the very beginning it rises to >128! My sysadmin is going to kill me :-)

It also seems that the seed length is ignored, as the terminal output reports things like ./../libexec/iSAAC-02.15.07.16/sortReference -r Temp/contigs.xml --mask-width 0 --mask 0 --seed-length 16 [...]

best, Sven

rpetrovski commented 9 years ago

isaac-sort-reference will use all hardware threads on a node. -j becomes useful when you run it in distributed mode (--qrsh-cmd). When running on multiple nodes, -j limits the number of jobs that are submitted to the grid simultaneously.

Running with -j other than 1 locally is obviously not recommended.
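On a shared machine, one OS-level workaround is to restrict the CPU affinity of the process on Linux. This is not an isaac-sort-reference option and is not confirmed anywhere in this thread. The sketch below uses Python's os.sched_setaffinity; run_pinned is a hypothetical helper name. It only limits which cores the process may occupy. The tool may still start one thread per hardware thread it detects, so the load average can stay high even though the remaining cores are left free.

```python
import os
import subprocess


def run_pinned(cmd, cpus, **kwargs):
    """Run cmd with its CPU affinity limited to the CPU ids in cpus (Linux only).

    run_pinned is a hypothetical helper, not part of iSAAC.
    """
    cpu_set = set(cpus)

    def pin():
        # Runs in the child just before exec; pid 0 means "this process".
        os.sched_setaffinity(0, cpu_set)

    return subprocess.run(cmd, preexec_fn=pin, **kwargs)
```

For example, run_pinned(["isaac-sort-reference", "--genome-file", "/path/to/genome.fa", "--jobs", "1", "--seed-lengths", "32", "--output-directory", "isaac2_hg19"], range(8)) would confine the command from the original report to the first eight cores, assuming those cores are available to the calling process.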

Internally isaac-sort-reference uses sortReference to produce various sorts of k-mers (see --annotation-seed-lengths). In the final output --seed-lengths will be respected.

Defaults are set for optimum performance, assuming the resources are available. Running on a box shared with other users is not recommended. Also, if you are running isaac-sort-reference on the human genome or similar, make sure you have on the order of 150 GB of RAM on the box. Otherwise you might need to use a higher --mask-width value to split the data.
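A quick pre-flight check can catch an undersized machine before a long run starts. The sketch below is a minimal example, assuming Linux and treating the 150 figure quoted above as a rough threshold in GiB. The names REQUIRED_GIB, total_ram_gib and has_enough_ram are made up for illustration.

```python
import os

# Rough figure quoted in this thread for human-sized genomes (an approximation).
REQUIRED_GIB = 150


def total_ram_gib():
    """Return the physical RAM of this machine in GiB, via POSIX sysconf."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 2**30


def has_enough_ram(total_gib, required_gib=REQUIRED_GIB):
    """Return True if total_gib meets the (approximate) requirement."""
    return total_gib >= required_gib
```

Calling has_enough_ram(total_ram_gib()) before launching the sort tells you whether the box is in the right range. If it returns False, a higher --mask-width value is the option mentioned above.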

Final note: Make sure you wipe out the Temp folder if you decide to rerun with a different command line.
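The cleanup step can be scripted so it is not forgotten between runs. This is a minimal sketch, assuming Temp is created in the directory the command was launched from (the relative path Temp/contigs.xml in the reported log suggests this, but it is not confirmed). wipe_temp is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def wipe_temp(workdir=".", temp_name="Temp"):
    """Delete workdir/temp_name if it exists; return True if something was removed.

    Assumes the Temp folder lives in the launch directory (an assumption).
    """
    temp = Path(workdir) / temp_name
    if temp.is_dir():
        shutil.rmtree(temp)
        return True
    return False
```

Call wipe_temp() in the working directory before rerunning isaac-sort-reference with changed options. It returns False and does nothing when no Temp folder is present.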

Roman.

sklages commented 9 years ago

Hi Roman,

so you won't allow the user to restrict CPU cores? Does this apply to all isaac-* binaries? This makes isaac (at least in my hands and in my environment) completely unusable, as we run most of our stuff on a compute cluster. In any case, I consider "lack of resource control" an undesirable behaviour.

best, Sven

rpetrovski commented 9 years ago

isaac-align can be run with restricted CPU. Again, this isn't what users should normally do. If you want efficient processing, it is best to find a way to reserve an entire cluster node for the job.

Roman.

rpetrovski commented 8 years ago

iSAAC-02 is an open source project. If you have a sound proposal on this or any other issue, please put it into a pull request.