Closed blindner6 closed 2 years ago
Hi Blake,
bwa-mem2 usage was put in the code (#72 ) but not as part of a release yet. Are you using 0.6.1 or newer? Thanks.
Hey Ben,
Thanks for your reply. Sorry for any confusion -- I'm not attempting to run bwa-mem2 within CoverM.
I am running CoverM v0.6.1. I've used bwa v0.7.17 to index my genome database. But CoverM won't accept that via -r
(path and stem for the index files being provided are correct).
I have hundreds of metagenomes I want to map to thousands of MAGs, so I want to avoid building the index with each pass. I could map those myself and then handle the BAM files into CoverM but wanted to check here first if -r
is indeed intended to allow me to provide my own bwa index (thus avoiding this). And if so, what I might be doing wrong above that is preventing me from doing so.
Thanks for your time!
Ah, OK.
My testing just below suggests coverm already picks up the stem of the DB as it is supposed to (this is for bwa-mem v0.7.17 / coverm 0.6.1 but the same held true for bwa-mem2 / coverm newest). Using 7seqs.fna from the tests/data directory.
(coverm-v0.6.1)cl5n008:20220506:~/git/coverm$ bwa index /tmp/7seqs.fna
[bwa_index] Pack FASTA... 0.00 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.01 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.00 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index /tmp/7seqs.fna
[main] Real time: 0.013 sec; CPU: 0.013 sec
(coverm-v0.6.1)cl5n008:20220506:~/git/coverm$ coverm contig -p bwa-mem -r /tmp/7seqs.fna --single tests/data/2seqs.fasta
[2022-05-05T19:37:07Z INFO coverm] CoverM version 0.6.1
[2022-05-05T19:37:07Z INFO coverm] Using min-covered-fraction 0%
[2022-05-05T19:37:07Z INFO bird_tool_utils::external_command_checker] Found samtools version 1.15.1
[2022-05-05T19:37:07Z INFO coverm::mapping_index_maintenance] BWA index appears to be complete, so going ahead and using it.
[2022-05-05T19:37:07Z INFO coverm::contig] In sample '7seqs.fna/2seqs.fasta', found 2 reads mapped out of 2 total (100.00%)
Contig 7seqs.fna/2seqs.fasta Mean
genome1~random_sequence_length_11000 0
genome1~random_sequence_length_11010 0
genome2~seq1 1
genome3~random_sequence_length_11001 0
genome4~random_sequence_length_11002 0
genome5~seq2 1
genome6~random_sequence_length_11003 0
Are you using it in the above way? ben
Hi Ben,
I've been encountering the same issue when using coverm with bwa (coverm version 0.6.1, bwa version 0.7.17).
When using a pre-built bwa index and TMPDIR
is set, coverm tries to build the index in the TMP directory without checking if the index is complete in the provided location.
[2022-06-13T15:45:11Z INFO coverm::mapping_index_maintenance] Generating BWA_MEM index for /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/representatives_deduplicated.mmi.fa .. [2022-06-13T15:45:11Z DEBUG coverm::mapping_index_maintenance] Running DB indexing command: "bwa" "index" "-p" "/home/fb343/scratch60/coverm-mapping-index.xJlBNZi30ad2/representatives_deduplicated.mmi.fa" "/home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/representatives_deduplicated.mmi.fa"
TMPDIR
has been set to /home/fb343/scratch60/
Hi @fbeghini I would be surprised if TMPDIR being set would change the behaviour. Can you let me know the command line you are using and an ls
of the directory containing the bwa index? Since two people have reported this now I imagine there is something untoward going on, but I cannot reproduce as shown in the comment just above.
Thanks, ben
Hi @wwood , The command line I'm using is
TMPDIR=/home/fb343/scratch60 coverm genome -t 20 -1 /home/fb343/project/HMBP/reads/021-0001-05/021-0001-05.bigtrim_1.fq.gz -2 /home/fb343/project/HMBP/reads/021-0001-05/021-0001-05.bigtrim_2.fq.gz -p bwa-mem -r /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.fa --genome-definition /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/contig_to_bin_map --no-zeros --min-read-aligned-percent 70 --min-read-percent-identity 90 --min-covered-fraction 10 --bam-file-cache-directory /home/fb343/christakis_loan/coverm/021-0001-05/ -o /home/fb343/christakis_loan/coverm/021-0001-05/021-0001-05.relab --verbose
The folder has just the files produced by bwa
$ ls /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/ /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.amb /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.bwt /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.pac /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.ann /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.fa /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded.sa
Hi,
In the call to coverm you specified the reference as repded.fa
but the root of the index is just repded
. In this case the reference should be specified to coverm as the root i.e. /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded
.
I guess you used bwa index
with -p
as something other than default?
Did I get that right? Thanks, ben
I tried first using just the root of the index but coverm failed to run giving
thread 'main' panicked at 'The reference specified '/home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded' does not appear to exist', src/mapping_parameters.rs:159:9
I found this post while looking if someone else was having the same issue and tried to run it by adding the file extension wondering if coverm would try to build the index after checking if all the index files are present.
bwa index
was run with -b 10000000000 -p repded
Ah, right. That is a slight bug then. To work around that, you can just create an empty file i.e. touch /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded
and then specify -r /home/fb343/christakis_loan/vamb/representatives_deduplicated_fnas/repded
to coverm.
I didn't quite understand that last part. Was there some reason that you needed to use -p
with bwa index
?
The empty file trick works, thanks. Also, no reason to use -p
, wanted to make sure that was not a problem of the file name when I was creating the index.
Good good. Closing this now since I guess the underlying issue has been solved for the next release.
Hello Ben,
Can you please update the crates.io page for CoverM first?
Thanks,
Jianshu
Hi Jianshu,
I consider issues as closed when they are fixed in the main branch, which will make it into the next release (here, the one after 0.6.1).
Having said that, a new release is quite overdue - hope to get to it soon.
Thanks ben, I can compile it now but for those who have less knowledge about compiling, they may need a new binary or something.
Jianshu
Hi Ben,
I've been competitively mapping reads to a collection of ~7,000 draft genomes using bwa-mem2. Recently, I've tried switching to coverm to complete both mapping (still with bwa-mem) and depth calculation (seqdepth calculation is much faster than the ad hoc python scripts I was using previously) for these genomes.
My longterm aim is to run multiple metagenomes through coverm but since my genome set is so large, I want to avoid building a new index each time I start the process over with a new metagenome. So I wanted to offer
coverm genome
my bwa index via-r
for each run. Based on reading the help documentation, I'm attempting to callcoverm genome
in this way:Where the variables follow: fwd/rev: forward and reverse gzipped FASTQ, index: path to a bwa-mem index, including the stem g_dir: path to directory containing genomes (suffix matches default .fna) output: what I want the output .tsv saved as
When I run this, coverm is not able to recognize my BWA index (passed with my index variable above) despite a lot of fiddling on my end. I see the following error before coverm exits:
Any thoughts or feedback would be greatly appreciated.
Cheers,
Blake