wwood / CoverM

Read coverage calculator for metagenomics
GNU General Public License v3.0

Are reverse reads considered in this alignment? #154

Open Valentin-Bio opened 1 year ago

Valentin-Bio commented 1 year ago

Hello, I used CoverM to calculate the relative abundance of MAGs across metagenomic libraries with the following parameters:

coverm genome -1 _1.fastq.gz -2 _2.fastq.gz -d MAGs -x fa --dereplicate --dereplication-ani 99 --min-read-aligned-length 70 --min-read-percent-identity 100 --min-read-aligned-percent 30 -m relative_abundance -o relabundances_93.tsv -t 20 --dereplication-precluster-method finch

The problem is that the stdout messages only mention the forward read files, e.g.:

[2023-01-11T04:24:41Z INFO coverm::genome] In sample 'CL_FP.BAC4A_ATATCTCG-ACTAAGAT_L00M_1.fastq.gz', found 23822271 reads mapped out of 75696118 total (31.47%)
[2023-01-11T04:27:01Z INFO coverm::genome] In sample 'CL_FP.BAC4B_GCGCTCTA-GTCGGAGC_L00M_1.fastq.gz', found 7506315 reads mapped out of 40260872 total (18.64%)
[2023-01-11T04:32:04Z INFO coverm::genome] In sample 'CL_FP.BAC4C_AACAGGTT-CTTGGTAT_L00M_1.fastq.gz', found 29361998 reads mapped out of 82479500 total (35.60%)
[2023-01-11T04:33:28Z INFO coverm::genome] In sample 'CL_FP.BAC4D_GGTGAACC-TCCAACGC_L00M_1.fastq.gz', found 4064164 reads mapped out of 21942570 total (18.52%)
[2023-01-11T04:35:36Z INFO coverm::genome] In sample 'CL_FP.BAC4E_CAACAATG-CCGTGAAG_L00M_1.fastq.gz', found 7364041 reads mapped out of 35409422 total (20.80%)
[2023-01-11T04:38:27Z INFO coverm::genome] In sample 'CL_FP.BAC4F_TGGTGGCA-TTACAGGA_L00M_1.fastq.gz', found 13279143 reads mapped out of 45900506 total (28.93%)
[2023-01-11T04:41:05Z INFO coverm::genome] In sample 'CL_FP.BAC4G_AGGCAGAG-GGCATTCT_L00M_1.fastq.gz', found 12903024 reads mapped out of 40320576 total (32.00%)
[2023-01-11T04:43:19Z INFO coverm::genome] In sample 'CL_FP.BAC4H_GAATGAGA-AATGCCTC_L00M_1.fastq.gz', found 8915514 reads mapped out of 38362732 total (23.24%)
[2023-01-11T04:44:11Z INFO coverm::genome] In sample 'CL_FP.BAC4I_TGCGGCGT-TACCGAGG_L00M_1.fastq.gz', found 2598806 reads mapped out of 10028190 total (25.92%)
[2023-01-11T04:46:46Z INFO coverm::genome] In sample 'CL_FP.BAC4J_CATAATAC-CGTTAGAA_L00M_1.fastq.gz', found 9401822 reads mapped out of 41132958 total (22.86%)
[2023-01-11T04:48:09Z INFO coverm::genome] In sample 'CL_FP.BAC4K_GATCTATC-AGCCTCAT_L00M_1.fastq.gz', found 5334614 reads mapped out of 19856808 total (26.87%)
[2023-01-11T04:49:36Z INFO coverm::genome] In sample 'CL_FP.BAC4L_AGCTCGCT-GATTCTGC_L00M_1.fastq.gz', found 5270880 reads mapped out of 23021598 total (22.90%)

Are the reverse reads being mapped too?

If it helps, I'm using the minimap2 aligner.

rhysnewell commented 1 year ago

The sample name is simply derived from the file name of the forward reads. Both the forward and reverse reads are being mapped, and each line reports information for both.

Rhys
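
For anyone who wants to sanity-check this on their own logs, here is a minimal Python sketch that parses the "In sample ..." messages quoted above. The function name `parse_coverm_log` and the regular expression are illustrative assumptions written for this thread, not part of CoverM. If the reported total really covers both mates, it should be an even number, and half of it should roughly match the number of reads in the forward FASTQ file. An even total is only consistent with paired counting, it does not prove it.

```python
import re

# Regex for the INFO messages quoted in this thread. The message layout is
# copied from the log above; it is an assumption that other versions match.
LOG_LINE = re.compile(
    r"In sample '(?P<sample>[^']+)', found (?P<mapped>\d+) reads mapped "
    r"out of (?P<total>\d+) total"
)


def parse_coverm_log(text):
    """Return one dict per 'In sample ...' message found in text."""
    rows = []
    for match in LOG_LINE.finditer(text):
        mapped = int(match.group("mapped"))
        total = int(match.group("total"))
        rows.append({
            "sample": match.group("sample"),
            "mapped": mapped,
            "total": total,
            # Recompute the mapped percentage from the raw counts.
            "percent": 100 * mapped / total,
            # Paired counting implies an even total (consistency check only).
            "total_is_even": total % 2 == 0,
            # Pair count to compare against the forward FASTQ, if total covers both mates.
            "implied_pairs": total // 2,
        })
    return rows


# First message from the log in this issue.
log = (
    "[2023-01-11T04:24:41Z INFO coverm::genome] In sample "
    "'CL_FP.BAC4A_ATATCTCG-ACTAAGAT_L00M_1.fastq.gz', found 23822271 reads "
    "mapped out of 75696118 total (31.47%)"
)

for row in parse_coverm_log(log):
    print(row)
```

Comparing `implied_pairs` with a read count of the `_1.fastq.gz` file (number of lines divided by four) is one independent way to check that the reverse reads were included.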