Open anusurendra opened 7 years ago
Hi, what is your question?
Apologizes Brent for not being clear,
When I was looking at the flagstat, I just found that the secondary alignment values weird. It seems that there are large portion have more than one alignment. But it may have to do with the fact that the sample has both wheat and rust. Thanks again for your patience and help. As a side note, I found BWA-meth is great and has better alignments than Bismark!!!
I think this is due to the issue I just posted, #50
BWA sets as QC-failed when reads are soft-clipped, see https://bioinformatics.stackexchange.com/questions/2988/interpreting-0x200-flag-in-bwa-mem-alignments
I ran BWA without the -p flag but with the other flags and had no QC failed alignments.
Actually it seems this is due to lines 361-390 in the master script.
Hi, For DNA methylation analysis, I used bwa-meth to align reads from poppy samples to the poppy genome. When I run samtools flagstat I get the following output. There are fairly high QC-failed reads. How can I reduce QC-failed reads? Thanks. @brentp
36809078 + 31632003 in total (QC-passed reads + QC-failed reads) 26158 + 118693 secondary 0 + 0 supplementary 0 + 0 duplicates 34068600 + 31567323 mapped (92.55% : 99.80%) 36782920 + 31513310 paired in sequencing 18391460 + 15756655 read1 18391460 + 15756655 read2 33895914 + 0 properly paired (92.15% : 0.00%) 34013748 + 31383950 with itself and mate mapped 28694 + 64680 singletons (0.08% : 0.21%) 66884 + 108786 with mate mapped to a different chr 18024 + 44290 with mate mapped to a different chr (mapQ>=5)
Hi, you can try the --do-not-penalize-chimeras
argument.
Hİ, Thank you for your suggestion. I can try the --do-not-penalize-chimeras argument. QC-failed reads reduced. I am adding the same sample samtools flagstat output below. I am doing analysis with multiple samples. QC fail reads were quite high on 3 of my samples compared to my other samples. Does the --do-not-penalize-chimeras argument affect the accuracy of my data? What effect does it have? Should I use the --do-not-penalize-chimeras argument in my other samples as well? Can you advice? @brentp Thanks
68441081 + 0 in total (QC-passed reads + QC-failed reads) 144851 + 0 secondary 0 + 0 supplementary 0 + 0 duplicates 65635923 + 0 mapped (95.90% : N/A) 68296230 + 0 paired in sequencing 34148115 + 0 read1 34148115 + 0 read2 65152402 + 0 properly paired (95.40% : N/A) 65397698 + 0 with itself and mate mapped 93374 + 0 singletons (0.14% : N/A) 175670 + 0 with mate mapped to a different chr 68445 + 0 with mate mapped to a different chr (mapQ>=5)
Hi, without --do-not-penalize-chimeras
, then bwameth does this:
https://github.com/brentp/bwa-meth/blob/7c8e1eb2a3b17b1bc6ecbef62119bf422d131830/bwameth.py#L436-L444
so it find reads where there is a lot of soft-clipping. This can happen with BS-Seq and usually leads to lower quality results. You'll have to decide for yourself when to use the argument.
Hi, I used bwa-meth to align rust infected wheat samples to the rust genome. However, when I ran samtools flagstat I get the follow output:
I didn't know BWA-meth gave secondary alignments. Also, the total number of QC-passed reads + QC-failed reads (27480347) is greater than the total of the pair-end reads (26463278).