Hi developers,
I ran bwa mem twice on the same sample, but the resulting .bam files differ between the two runs. I then tried several other argument combinations, listed below, and for each of them the output .bam files still differ between two repeated runs.
The command lines I tested:
(1) bwa mem -t 20 -R "@RG\tID:tumor" -M ref.hg38 sample.R1.fq.gz sample.R2.fq.gz
(2) bwa mem -t 20 -a -R "@RG\tID:tumor" -M ref.hg38 sample.R1.fq.gz sample.R2.fq.gz (added -a)
(3) bwa mem -t 20 -K 10000000 -R "@RG\tID:tumor" -M ref.hg38 sample.R1.fq.gz sample.R2.fq.gz (added -K 10,000,000)
(4) bwa mem -t 20 -K 1000000000000 -R "@RG\tID:tumor" -M ref.hg38 sample.R1.fq.gz sample.R2.fq.gz (added -K 1,000,000,000,000)
(5) bwa mem -t 1 -R "@RG\tID:tumor" -M ref.hg38 sample.R1.fq.gz sample.R2.fq.gz (changed to -t 1)
The differences look like this. The same read ID (V350125465L1C006R06700311073) is placed on chr22 in test1 but on chr20 in test2:
test1:
V350125465L1C006R06700311073 83 chr22 11577976 0 27M2I121M = 11577834 -290 ACACCGTGCCCTGGCTGGCAGGATGGGGAGAGGAGGGAGCGTGTCTGTTCACCTGGCCAGCCCTAGGCAACTCTGCAGAGAAAGACACAGGCACTTCCCCTCTGCAGCCAAAGAGTTAAGAAGGCTCGATGTGAAATGAATCATTCCAGG DIDCIGEGFHHEGGHEGGGDGHCFHGHGDHCHHCGHEDDHGEGEHEGFDHCHHFCGHHDGHIGECGHGCDIEGEGICHCHCCCHCHCHDGGICIEEGHHHEHEGGDHHHCDDGCFFFDDHDDHEHEIFDFGDHDDDDHDCDHDFFHIDHH NM:i:7 MD:Z:5A19A30T33G31T25 MC:Z:150M AS:i:115 XS:i:125 RG:Z:tumor
V350125465L1C006R06700311073 163 chr22 11577834 20 150M = 11577976 290 TGGTTATCTTTAGGTAGCAGAATTCAAGACTGCTTCTTTTTTCTTTTCTTCCTACTTGTATGTTATCTCTATTTCCCTGTGTGAGGATTTATGACTGTTGTGATGAAAAGGCTAGTATTCTAACTCCCTGCATCATAAGCACACACCGTG DHIDDFDIDDCFHHCFHHDHFFDDFEEHFHDHGCCFDCDDDDHDDDDIDDHHDFFCCHDFDGCDFCHCHDFCDDHHHDICHCIFHHEDCDFCICHDHDCGDHEDHEEFEHGHCFGDFDCFCFEHDDHHDHDEBHECEEHGFFFGEGHHCI NM:i:3 MD:Z:32A14T99A2 MC:Z:27M2I121M AS:i:137 XS:i:127 RG:Z:tumor
test2:
V350125465L1C006R06700311073 83 chr20 30872005 17 27M2I121M = 30871863 -290 ACACCGTGCCCTGGCTGGCAGGATGGGGAGAGGAGGGAGCGTGTCTGTTCACCTGGCCAGCCCTAGGCAACTCTGCAGAGAAAGACACAGGCACTTCCCCTCTGCAGCCAAAGAGTTAAGAAGGCTCGATGTGAAATGAATCATTCCAGG DIDCIGEGFHHEGGHEGGGDGHCFHGHGDHCHHCGHEDDHGEGEHEGFDHCHHFCGHHDGHIGECGHGCDIEGEGICHCHCCCHCHCHDGGICIEEGHHHEHEGGDHHHCDDGCFFFDDHDDHEHEIFDFGDHDDDDHDCDHDFFHIDHH NM:i:5 MD:Z:5A19A18T103 MC:Z:150M AS:i:125 XS:i:116 RG:Z:tumor
V350125465L1C006R06700311073 163 chr20 30871863 0 150M = 30872005 290 TGGTTATCTTTAGGTAGCAGAATTCAAGACTGCTTCTTTTTTCTTTTCTTCCTACTTGTATGTTATCTCTATTTCCCTGTGTGAGGATTTATGACTGTTGTGATGAAAAGGCTAGTATTCTAACTCCCTGCATCATAAGCACACACCGTG DHIDDFDIDDCFHHCFHHDHFFDDFEEHFHDHGCCFDCDDDDHDDDDIDDHHDFFCCHDFDGCDFCHCHDFCDDHHHDICHCIFHHEDCDFCICHDHDCGDHEDHEEFEHGHCFGDFDCFCFEHDDHHDHDEBHECEEHGFFFGEGHHCI NM:i:5 MD:Z:32A14T67C2C28A2 MC:Z:27M2I121M AS:i:127 XS:i:137 RG:Z:tumor
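In case it helps with reproducing the comparison, here is a minimal Python sketch of how the two outputs could be compared. It is not part of bwa. The names `parse_record`, `primary_placements` and `moved_reads` are made up for illustration, and it assumes SAM text as input (for example the output of `samtools view` on each .bam file). In the usage part, SEQ and QUAL from the records above are replaced by `*` and only the integer tags are kept, to keep the lines short.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple

ReadKey = Tuple[str, int]  # (read name, mate number: 1, 2, or 0 if unpaired)


class Placement(NamedTuple):
    chrom: str
    pos: int
    mapq: int
    as_score: Optional[int]  # AS tag: score of the reported alignment
    xs_score: Optional[int]  # XS tag: score of the best alternative hit


def parse_record(line: str) -> Tuple[ReadKey, int, Placement]:
    """Parse one SAM alignment line into (key, flag, placement)."""
    # split() accepts tabs (real SAM) and the spaces seen in pasted records;
    # tag values that themselves contain spaces are not handled correctly.
    f = line.split()
    flag = int(f[1])
    mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
    ints = {}
    for tag in f[11:]:
        parts = tag.split(":", 2)
        if len(parts) == 3 and parts[1] == "i":
            ints[parts[0]] = int(parts[2])
    place = Placement(f[2], int(f[3]), int(f[4]), ints.get("AS"), ints.get("XS"))
    return (f[0], mate), flag, place


def primary_placements(lines: Iterable[str]) -> Dict[ReadKey, Placement]:
    """Collect the primary alignment of every read, skipping header lines."""
    out: Dict[ReadKey, Placement] = {}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        key, flag, place = parse_record(line)
        if flag & 0x900:  # secondary (0x100) or supplementary (0x800)
            continue
        out[key] = place
    return out


def moved_reads(run1: Dict[ReadKey, Placement],
                run2: Dict[ReadKey, Placement]) -> List[ReadKey]:
    """Reads present in both runs whose chromosome or position differs."""
    return sorted(
        k for k in run1.keys() & run2.keys()
        if (run1[k].chrom, run1[k].pos) != (run2[k].chrom, run2[k].pos)
    )


# Usage with the two records from each test (SEQ/QUAL shortened to "*").
NAME = "V350125465L1C006R06700311073"
run1 = primary_placements([
    NAME + " 83 chr22 11577976 0 27M2I121M = 11577834 -290 * * NM:i:7 AS:i:115 XS:i:125",
    NAME + " 163 chr22 11577834 20 150M = 11577976 290 * * NM:i:3 AS:i:137 XS:i:127",
])
run2 = primary_placements([
    NAME + " 83 chr20 30872005 17 27M2I121M = 30871863 -290 * * NM:i:5 AS:i:125 XS:i:116",
    NAME + " 163 chr20 30871863 0 150M = 30872005 290 * * NM:i:5 AS:i:127 XS:i:137",
])
for key in moved_reads(run1, run2):
    print(key, run1[key], run2[key])
```

For the pair above, both mates are reported as moved, and the AS values sum to 252 in either placement (115 + 137 in test1, 125 + 127 in test2). The two placements therefore look tied on alignment score, which seems consistent with the read pair mapping equally well to more than one location.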
My guess is that if a read can map to multiple locations, bwa picks one of them at random for the output .bam file, but I am not sure. So why do the two repeated runs give different results?
And, most importantly, is there any argument that makes the results identical across repeated runs?
Thank you.