lh3 / bwa

Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
GNU General Public License v3.0

bwa index of 300GB multifasta fails at bwt2sa stage without error #436

Closed. hazmup closed this issue 1 month ago

hazmup commented 1 month ago

I have downloaded several thousand genomes from RefSeq into a single multi-FASTA file (~300 GB) and I want to index it. When I run nohup bwa index -b 100000000 combined_sequences.fna &> bwa_indexing.log & it fails at the stage [bwa_index] Construct SA from BWT and Occ... I then switched from a machine with 512 GB of RAM to one with 1 TB and tried to resume with nohup bwa bwt2sa combined_sequences.fna.bwt combined_sequences.fna.sa & but it failed again, this time without any message at all. What could be wrong?
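For a step that appears to die without a message, one option is to launch it from a small wrapper that records how the process ended. The following is only a sketch and not part of bwa: the helper names `describe_exit` and `run_and_report` and the log file name are hypothetical. It relies on the documented behaviour of Python's `subprocess` on POSIX, where a negative return code means the child was terminated by a signal (a process killed for running out of memory would typically show SIGKILL).

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode == 0:
        return "finished successfully"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    return f"exited with error code {returncode}"


def run_and_report(cmd, log_path):
    """Run cmd, append its output to log_path, and log how it ended."""
    with open(log_path, "ab") as log:
        # stdout and stderr of the child both go to the log file.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
        log.write(f"\n[wrapper] {describe_exit(result.returncode)}\n".encode())
    return result.returncode


# Example call, using the command from the post above
# (the log file name is a made-up placeholder):
# run_and_report(
#     ["bwa", "bwt2sa", "combined_sequences.fna.bwt", "combined_sequences.fna.sa"],
#     "bwt2sa_wrapper.log",
# )
```

With this in place, the last line of the log states whether the step finished, exited with an error code, or was killed by a signal, so a missing error message from the tool itself is less of a problem.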

lh3 commented 1 month ago

Probably a memory issue. Try smaller batches.

hazmup commented 1 month ago

Do you mean that the bwt file is corrupt and the first steps need to be repeated? I ask because I cannot find a batch parameter for bwa bwt2sa, and the construction of the bwt file did not fail. Thank you.

hazmup commented 1 month ago

It seems that bwa bwt2sa had not actually crashed, even though I am pretty sure it was not listed among the running processes. Probably an oversight on my part. In any case I am marking this as resolved, thank you @lh3!
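To avoid misreading a process list, the PID of the background job can be checked directly. The sketch below assumes a POSIX system and that the PID was noted when the job was started (for example from the shell's `$!`); the function name `pid_is_running` is hypothetical. It uses `os.kill` with signal 0, which performs the existence and permission checks without delivering a signal.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        # Signal 0 sends nothing; it only checks that the PID can be signalled.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No process has this PID.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A result of `True` only says that some process holds that PID, so for a job that runs for days it is worth also confirming the command name, and watching whether the output .sa file appears once the step completes.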