arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

bam from samtools view -F 1294 is larger than bam from speedseq or #369

Open liuhankui opened 2 years ago

liuhankui commented 2 years ago

samtools view -F 1294 keeps reads with flag 2048 (supplementary alignment), but samblaster --discordantFile does not, and neither does speedseq, which also uses samblaster --discordantFile. Is this a problem?

By the way, with -M, bwa mem marks supplementary alignments (flag 2048) as not primary (flag 256), so this may be another problem for anyone who uses bwa mem -M.
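
For reference, the flag arithmetic behind -F 1294 can be checked by hand. Below is a minimal sketch that splits a SAM FLAG mask into its component bits. The bit meanings follow the SAM specification; the helper name `decode_flag` and the short bit labels are made up for illustration and are not part of samtools.

```shell
#!/bin/sh
# Decode a SAM FLAG mask into named bits (meanings per the SAM specification).
# decode_flag is a hypothetical helper name, not a samtools command.
decode_flag() {
    mask=$1
    out=""
    for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
                16:reverse 32:mate_reverse 64:read1 128:read2 \
                256:secondary 512:qc_fail 1024:duplicate 2048:supplementary; do
        bit=${pair%%:*}     # numeric part before the colon
        name=${pair#*:}     # label after the colon
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
    done
    echo "$out"
}

decode_flag 1294   # prints proper_pair,unmapped,mate_unmapped,secondary,duplicate
```

Since 2048 is not among the bits in 1294, -F 1294 does not filter out supplementary alignments. If your samtools build has the `samtools flags` subcommand, `samtools flags 1294` should report the same breakdown.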

ryanlayer commented 2 years ago

The current flags are the best option. The original wording of the -M option was confusing, and our understanding of it evolved. We settled on not using -M and using the current flags.


liuhankui commented 2 years ago

Many thanks for your suggestion.

I found that the discordant-read files from samtools view -F 1294 and samblaster --discordantFile differ in size.

Here are the commands:

bwa mem -R "@RG\tID:IDA\tPL:illumina\tPU:IDA\tLB:IDA\tSM:IDA\tCN:BGI" GRCh38.fa fq1.gz fq2.gz | samblaster --excludeDups --addMateTags --maxSplitCount 2 --minNonOverlap 20 -d samblaster.disc.sam -s samblaster.split.sam | samtools view -Sb -o sample.bam -
samtools view -h -F 1294 sample.bam > samtools.disc.sam
samtools view -h sample.bam | scripts/extractSplitReads_BwaMem -i stdin > samtools.split.sam

Here are the differences:

wc -l samblaster.disc.sam: 121030
wc -l samtools.disc.sam: 130582

wc -l samblaster.split.sam: 34488
wc -l samtools.split.sam: 36266
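
One way to see where the extra records come from is to list the read names that appear in one SAM file but not the other. The sketch below is only an illustration: `only_in_second` is a hypothetical helper name, and the toy records are truncated to three columns to keep the demo short.

```shell
#!/bin/sh
# Print QNAMEs that occur in SAM file $2 but not in SAM file $1.
# only_in_second is a hypothetical helper name used for illustration.
only_in_second() {
    a=$(mktemp)
    b=$(mktemp)
    # Drop header lines (they start with @), keep column 1 (QNAME), de-duplicate.
    grep -v '^@' "$1" | cut -f1 | sort -u > "$a"
    grep -v '^@' "$2" | cut -f1 | sort -u > "$b"
    # comm -13 hides lines unique to the first list and lines common to both.
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Toy demo with made-up, truncated records.
d=$(mktemp -d)
printf '@HD\tVN:1.6\nr1\t97\tchr1\nr2\t97\tchr1\n' > "$d/a.sam"
printf '@HD\tVN:1.6\nr1\t97\tchr1\nr2\t97\tchr1\nr3\t2145\tchr1\n' > "$d/b.sam"
only_in_second "$d/a.sam" "$d/b.sam"   # prints r3
rm -rf "$d"
```

With the files from the commands above, this would be `only_in_second samblaster.disc.sam samtools.disc.sam`. Looking up the flags of the reported reads in samtools.disc.sam should then show which kind of record accounts for the difference.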

ryanlayer commented 2 years ago

That could just be samblaster removing duplicates.
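
A quick way to check this is to count, in each file, how many alignment records carry the duplicate bit (1024) or the supplementary bit (2048). The following is a rough sketch using plain awk. `count_flag` is a hypothetical helper name, and the toy records are made up. Header lines are skipped, which also matters for the numbers above, because wc -l counts header lines too.

```shell
#!/bin/sh
# Count SAM alignment records whose FLAG has a given bit set.
# count_flag and the toy records below are illustrative only.
count_flag() {
    # $1 = SAM file, $2 = flag bit (a power of two, e.g. 1024 or 2048)
    # int(FLAG / bit) % 2 tests the bit without needing gawk's and().
    awk -v bit="$2" '!/^@/ && int($2 / bit) % 2 == 1 { n++ } END { print n + 0 }' "$1"
}

tmp=$(mktemp)
printf '@HD\tVN:1.6\n' > "$tmp"
printf 'r1\t97\tchr1\t100\t60\t50M\tchr2\t500\t0\t*\t*\n' >> "$tmp"
printf 'r2\t1121\tchr1\t100\t60\t50M\tchr2\t500\t0\t*\t*\n' >> "$tmp"
printf 'r3\t2145\tchr1\t300\t60\t20S30M\tchr2\t500\t0\t*\t*\n' >> "$tmp"

count_flag "$tmp" 1024   # duplicate records: prints 1
count_flag "$tmp" 2048   # supplementary records: prints 1
rm -f "$tmp"
```

Running `count_flag samtools.disc.sam 2048` and `count_flag samblaster.disc.sam 2048` on the real files would show whether supplementary alignments, rather than duplicates, account for part of the gap.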
