BenLangmead / bowtie2

A fast and sensitive gapped read aligner
GNU General Public License v3.0
639 stars 159 forks source link

reads aligned concordantly exactly 1 time #395

Open lissur opened 2 years ago

lissur commented 2 years ago

Dear all,

How can I get the reads aligned concordantly exactly 1 time reported by bowtie2?

For example,

72235868 reads; of these: 72235868 (100.00%) were paired; of these: 1504901 (2.08%) aligned concordantly 0 times 58588359 (81.11%) aligned concordantly exactly 1 time 12142608 (16.81%) aligned concordantly >1 times

1504901 pairs aligned concordantly 0 times; of these:
  723834 (48.10%) aligned discordantly 1 time
----
781067 pairs aligned 0 times concordantly or discordantly; of these:
  1562134 mates make up the pairs; of these:
    914104 (58.52%) aligned 0 times
    278999 (17.86%) aligned exactly 1 time
    369031 (23.62%) aligned >1 times

99.37% overall alignment rate

how can I extract the 58588359 reads subset from the bam file generated by bowtie2?

Thank you,

Lissur.

ch4rr0 commented 1 year ago

Hello,

You can try adding the following to your command line: --no-unal: filter unaligned reads --no-discordant: allow on concordant reads

Then pipe the output to grep -v 'XS:i' that should get rid of alignments where bowtie2 found a secondary alignment. It may not be foolproof, but it is a start.

lissur commented 1 year ago

Hello,

Some months after and I still have not found a way to get the aligned concordantly exactly 1 time reads subset...

This is the bowtie2 report for the same sample with added --no-unal and --no-discordant parameters:

72235868 reads; of these: 72235868 (100.00%) were paired; of these: 1504901 (2.08%) aligned concordantly 0 times 58588359 (81.11%) aligned concordantly exactly 1 time 12142608 (16.81%) aligned concordantly >1 times

1504901 pairs aligned 0 times concordantly or discordantly; of these:
  3009802 mates make up the pairs; of these:
    914104 (30.37%) aligned 0 times
    1726667 (57.37%) aligned exactly 1 time
    369031 (12.26%) aligned >1 times

99.37% overall alignment rate

Then, I just used the grep -v "XS:i" and wc -l to check:

samtools view WT_1_uniquelyMapped.bam | grep -v "XS:i:" | wc -l

I would expect 58,588,359 x 2 = 117,176,718, but I got 115,013,825.

Maybe I'm missing something?

Thank you,

Lissur.