Closed dkioroglou closed 3 years ago
The following command:
bwa index -a bwtsw -b 375000000 $REFERENCE
indexes the human genome in 19 hours, and RAM usage tops out at 100GB.
Although I could close the issue, I would like to keep it open, as I'm curious to know under what conditions the following lines of the bwa documentation hold true:
Indexing the human genome sequences takes 3 hours with bwtsw algorithm.
With bwtsw algorithm, 5GB memory is required for indexing the complete human genome sequences.
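One way to narrow down where the documented figures come from would be to time the same indexing step at different `-b` block sizes. The following is only a minimal sketch: `time_bwa_index` and the `BWA` override variable are hypothetical names introduced here, and `10000000` in the usage note is what I understand to be the default block size printed by `bwa index`, which should be checked against the installed version.

```shell
# Hypothetical helper: run `bwa index -a bwtsw` with a given block size
# and report the elapsed wall-clock time in seconds.
# BWA may be overridden (e.g. BWA=echo) for a dry run that only prints the arguments.
time_bwa_index() {
    ref=$1      # path to the reference FASTA
    block=$2    # value passed to -b (block size for the bwtsw algorithm)

    start=$(date +%s)
    "${BWA:-bwa}" index -a bwtsw -b "$block" "$ref"
    status=$?
    end=$(date +%s)

    # One summary line per run, easy to grep out of a SLURM log.
    echo "block=$block elapsed_s=$((end - start)) exit=$status"
    return $status
}
```

For example, `time_bwa_index "$REFERENCE" 10000000` followed by `time_bwa_index "$REFERENCE" 375000000` would give two comparable summary lines. On a full human reference each run takes hours, so trying this on a single chromosome first is probably more practical.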
Never ever use human toplevel fasta files. See http://lh3.github.io/2017/11/13/which-human-reference-genome-to-use.
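A quick way to see whether a given reference is one of those toplevel files is to look at how many sequences it contains and how many bases they add up to. The sketch below assumes a standard FASTA layout (header lines starting with `>`); `fasta_summary` is a hypothetical helper name, not part of bwa.

```shell
# Hypothetical helper: print the number of sequences and the total number
# of bases in a FASTA file. Input may be plain text or gzip-compressed
# (decided by the .gz suffix).
fasta_summary() {
    fasta=$1
    reader="cat"
    case "$fasta" in
        *.gz) reader="gzip -dc" ;;
    esac

    $reader "$fasta" | awk '
        /^>/ { n++; next }                   # header line: one more sequence
        {
            gsub(/[ \t\r]/, "")              # drop stray whitespace and CRs
            bases += length($0)              # sequence line: add its length
        }
        END { printf "sequences=%d bases=%.0f\n", n, bases }
    '
}
```

Running `fasta_summary "$REFERENCE"` on a primary assembly should report a base count of roughly 3 billion; a much larger total or an unexpectedly high sequence count suggests that extra haplotypes or patches are included.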
Issue
Based on the BWA documentation (source):
However, BWA indexing of the human genome on our machine has not completed even after 2 days of runtime.
Before proceeding with much longer runtimes, could you please tell me if this issue makes sense?
Commands related to issue
I have tried the following commands for indexing:
The reference has been used either in its .gz compressed format or uncompressed. Each command was executed by SLURM with the following options:
General info
BWA version:
BWA installation:
Unmasked human reference used:
Operating system: