Open huangl07 opened 5 years ago
Please check on the read highlighted in the error msg:
std::exception::what: Empty read attributed to sequence fragment with QNAME: 'SRR5739122.82403596' anchored by mate alignment starting at (1-indexed) position: 'NC_029263.1:19661908'
BTW, is this ticket a duplicate of #164?
Why? How should I generate the BAM file?
I sorted it with samtools sort and indexed it,
and I used Picard to mark duplicates.
Should I use only the sorted BAM file?
Here is my mapping script:
bwa mem ../Genome/ref.fa SRR5739123_1.fastq.gz SRR5739123_2.fastq.gz -t 4 -a -M -R "@RG\tID:SRR5739123\tLG:SRR5739123\tLB:SRR5739123\tPL:illumina\tSM:SRR5739123\tPU:run_barcode\tCN:MajorBio\tDS:reseq" | samtools view -bS > SRR5739123.bam
samtools sort SRR5739121.bam -o SRR5739121.sort.bam
samtools markdup --reference ../Genome/ref.fa SRR5739121.sort.bam SRR5739121.mkdup.bam
I think the error may be caused by low-mapping-quality reads or multi-mapping results, for example:
SRR5739119.25811182 401 NC_029256.1 1529566 0 125M NC_029259.1 5775201 0 * * NM:i:9 MD:Z:5G61C2A0C3A0A5G9G2A29 MC:Z:113M12H AS:i:86 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 2009992 0 58H67M NC_029259.1 5775201 0 * * NM:i:1 MD:Z:61C5 MC:Z:113M12H AS:i:65 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 2198987 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C10G42 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 2199076 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C7T45 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 2199165 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C10G42 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.8064323 401 NC_029256.1 6533440 0 69H56M NC_029264.1 7870164 0 * * NM:i:3 MD:Z:39G3T9C2 MC:Z:125M AS:i:43 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 8236472 0 18M1D107M NC_029259.1 5775201 0 * * NM:i:10 MD:Z:9T8^A5T8C9C5T3T0G3A61C5 MC:Z:113M12H AS:i:79 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 8606257 0 125M NC_029259.1 5775201 0 * * NM:i:5 MD:Z:42T10G0T2A61C5 MC:Z:113M12H AS:i:106 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 8606435 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:42T10G3A61C5 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 10576656 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C10A42 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 11886598 0 6H119M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:36T10G3A61C5 MC:Z:113M12H AS:i:105 RG:Z:SRR5739119
SRR5739119.25811182 385 NC_029256.1 11886681 0 125M NC_029259.1 5775201 0 * * NM:i:3 MD:Z:53G3A61C5 MC:Z:113M12H AS:i:116 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 12723735 0 125M NC_029259.1 5775201 0 * * NM:i:3 MD:Z:5G61T3C53 MC:Z:113M12H AS:i:116 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 12723824 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C11C41 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.8064323 401 NC_029256.1 13943026 0 40H15M4I66M NC_029264.1 7870164 0 * * NM:i:8 MD:Z:18G45G0G2G12 MC:Z:125M AS:i:51 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 15603395 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C29A23 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 15603573 0 125M NC_029259.1 5775201 0 * * NM:i:3 MD:Z:5G61T3C53 MC:Z:113M12H AS:i:116 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 15603662 0 125M NC_029259.1 5775201 0 * * NM:i:3 MD:Z:5G61T3C53 MC:Z:113M12H AS:i:116 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 15603751 0 125M NC_029259.1 5775201 0 * * NM:i:4 MD:Z:5G61T3C10G42 MC:Z:113M12H AS:i:111 RG:Z:SRR5739119
SRR5739119.25811182 401 NC_029256.1 15603840 0 125M NC_029259.1 5775201 0 * * NM:i:5 MD:Z:5G61T3C10A19A22 MC:Z:113M12H AS:i:106 RG:Z:SRR5739119
Is there anyone who could help solve this?
Among the reads you posted, I didn't see the one that has QNAME of 'SRR5739122.82403596' and is aligned to NC_029263.1 at 19661908. Pulling out that particular read will help understand why it triggered the error.
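One way to pull out every record for a given read name is sketched below. It is a minimal example, assuming the alignments are available as tab-delimited SAM text (for instance the output of `samtools view`); the function name `records_for_qname` and the sample records are hypothetical and used only for illustration.

```python
def records_for_qname(sam_lines, qname):
    """Yield SAM alignment lines whose QNAME (first field) equals qname."""
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            # Skip blank lines and header lines (headers start with '@').
            continue
        if line.split("\t", 1)[0] == qname:
            yield line


# Hypothetical, shortened records: the real ones would come from the BAM.
sample = [
    "@HD\tVN:1.6\tSO:coordinate",
    "readA\t385\tchr1\t100\t0\t10M\tchr2\t500\t0\t*\t*",
    "readB\t129\tchr1\t150\t22\t10M\tchr2\t900\t0\tACGTACGTAC\tFFFFFFFFFF",
    "readA\t401\tchr1\t300\t0\t10M\tchr2\t500\t0\t*\t*",
]

for rec in records_for_qname(sample, "readA"):
    print(rec)
```

In practice the same function could be fed `sys.stdin` so that the records are streamed from the BAM rather than held in a list.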
std::exception::what: Empty read attributed to sequence fragment with QNAME: 'SRR5739122.82403596' anchored by mate alignment starting at (1-indexed) position: 'NC_029263.1:19661908'
Sorry to bother you, but these reads also trigger the error message.
Dear X-chen,
I have already reanalyzed other data with a script like this:
Step 1:
bwa mem -M -a -t 8 -R "@RG\tID:1\tLG:A\tLB:1\tPL:illumina\tSM:A\tPU:run_barcode\tCN:MajorBio DS:reseq" /mnt/ilustre/centos7users/dna/SV/02.reference/ref.fa /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171065:MJ20181118001:ECL171065:A.clean.1.fastq.gz /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171065:MJ20181118001:ECL171065:A.clean.2.fastq.gz| samtools view -bS - > /mnt/ilustre/centos7users/dna/SV/03.mapping/A.b1.bam
bwa mem -M -a -t 8 -R "@RG\tID:2\tLG:B\tLB:1\tPL:illumina\tSM:B\tPU:run_barcode\tCN:MajorBio DS:reseq" /mnt/ilustre/centos7users/dna/SV/02.reference/ref.fa /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171066:MJ20181118001:ECL171066:B.clean.1.fastq.gz /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171066:MJ20181118001:ECL171066:B.clean.2.fastq.gz| samtools view -bS - > /mnt/ilustre/centos7users/dna/SV/03.mapping/B.b1.bam
Step 2:
samtools merge -f -p -@ 8 --output-fmt BAM /mnt/ilustre/centos7users/dna/SV/04.sort/B.merged.bam /mnt/ilustre/centos7users/dna/SV/03.mapping/B.b1.bam && samtools sort -o /mnt/ilustre/centos7users/dna/SV/04.sort/B.sort.bam --output-fmt BAM -@ 8 /mnt/ilustre/centos7users/dna/SV/04.sort/B.merged.bam && samtools index /mnt/ilustre/centos7users/dna/SV/04.sort/B.sort.bam
samtools merge -f -p -@ 8 --output-fmt BAM /mnt/ilustre/centos7users/dna/SV/04.sort/A.merged.bam /mnt/ilustre/centos7users/dna/SV/03.mapping/A.b1.bam && samtools sort -o /mnt/ilustre/centos7users/dna/SV/04.sort/A.sort.bam --output-fmt BAM -@ 8 /mnt/ilustre/centos7users/dna/SV/04.sort/A.merged.bam && samtools index /mnt/ilustre/centos7users/dna/SV/04.sort/A.sort.bam
and then:
configManta.py --bam=04.sort/A.sort.bam --bam=04.sort/B.sort.bam --runDir=04.SV --referenceFasta=02.reference/ref.fa
The run fails with the error shown in the attached file
workflow.error.log.txt
The first records in the BAM for the read named in the error are:
A00184:284:H7K5GDSXX:1:2624:31132:5650 385 chr1 20274924 0 77M74H chr10 4299198 0 * NM:i:7 MD:Z:2T11G4C2A9G25A3T14 MC:Z:151M AS:i:44 RG:Z:1
A00184:284:H7K5GDSXX:1:2624:31132:5650 337 chr1 31656906 0 40H111M = 32586388 929373 NM:i:8 MD:Z:13C5T6T10A1C4A1C16T47 MC:Z:86M65H AS:i:71 RG:Z:1
A00184:284:H7K5GDSXX:1:2624:31132:5650 401 chr1 32586301 0 65H69M17H chr10 4299198 0 NM:i:5 MD:Z:8A14A3T25C9T5 MC:Z:151M AS:i:44 RG:Z:1
A00184:284:H7K5GDSXX:1:2624:31132:5650 129 chr1 32586388 22 86M65S chr10 4299198 0 ACGTTTGTTTGAATAGAGAGTGTGGCTGACATATGGGCCCGGGTGGCATTTGGGAATGTAAAATTGGGAGAGTGGCAGTTGAGCACGGGGATGTTGAGTGAGTGGCTCTTCGTCAGTTGTCCCTCTGAGAGAAGATAATCCTTCGAGGGAG ,F,:FFFF:FF,FFF:FFF,FF,::::,F,FFFF,:FFFF::F::FFF,:FFFFFF:F,F:F,:FFFFF:FF,FFF:F::FFF,F:,F,,,:FFFF,F,F,F:FF:,,,F,F,,:,FF,:,,,F::F,,,:F,F,,,:,,F:,:,,:,FFF NM:i:5 MD:Z:2T19A35A3T14T8 MC:Z:151M AS:i:63 XS:i:53 RG:Z:1
Thank you!
I am not sure whether what you printed above is a single record or multiple records in the BAM file. It looks like the QNAME "A00184:284:H7K5GDSXX:1:2624:31132:5650" occurs 4 times, with all occurrences concatenated into a single line?
Looking at the first occurrence (copied below), it has TLEN=0, an empty SEQ, and a missing QUAL:
A00184:284:H7K5GDSXX:1:2624:31132:5650 385 chr1 20274924 0 77M74H chr10 4299198 0 NM:i:7 MD:Z:2T11G4C2A9G25A3T14 MC:Z:151M AS:i:44 RG:Z:1
I think that's what Manta complained about.
Please make sure the input bam follows the spec https://samtools.github.io/hts-specs/SAMv1.pdf
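To make the check concrete, here is a small sketch that decodes the FLAG field and tests whether a record carries no read sequence. The bit values are those defined in the SAM spec; the short labels in `FLAG_BITS` and the helper names `decode_flag` and `has_empty_seq` are my own shorthand, not part of any tool. The example record is one of the SRR5739119 alignments posted earlier in this thread.

```python
# SAM FLAG bits as defined in the SAM spec; the labels are informal shorthand.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def has_empty_seq(sam_line):
    """True if the record has all 11 mandatory fields and SEQ is '*'."""
    # Whitespace split so that records pasted with spaces also work;
    # the mandatory fields cannot themselves contain whitespace.
    fields = sam_line.split()
    return len(fields) >= 11 and fields[9] == "*"


record = (
    "SRR5739119.25811182 385 NC_029256.1 2009992 0 58H67M NC_029259.1 "
    "5775201 0 * * NM:i:1 MD:Z:61C5 MC:Z:113M12H AS:i:65 RG:Z:SRR5739119"
)

print(decode_flag(385))       # ['paired', 'read2', 'secondary']
print(decode_flag(401))       # ['paired', 'reverse', 'read2', 'secondary']
print(has_empty_seq(record))  # True
```

Both 385 and 401 include the 0x100 bit, so the posted records with `*` in the SEQ column are all secondary alignments, while the one record with flag 129 that carries a sequence is a primary alignment.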
The BAM was generated by:
bwa mem -M -a -t 8 -R "@RG\tID:1\tLG:A\tLB:1\tPL:illumina\tSM:A\tPU:run_barcode\tCN:MajorBio DS:reseq" /mnt/ilustre/centos7users/dna/SV/02.reference/ref.fa /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171065:MJ20181118001:ECL171065:A.clean.1.fastq.gz /mnt/ilustre/centos7users/dna/SV/data/B1228nova5:L1ECL171065:MJ20181118001:ECL171065:A.clean.2.fastq.gz| samtools view -bS - > /mnt/ilustre/centos7users/dna/SV/03.mapping/A.b1.bam
samtools sort -o /mnt/ilustre/centos7users/dna/SV/04.sort/A.sort.bam --output-fmt BAM -@ 8 /mnt/ilustre/centos7users/dna/SV/04.sort/A.merged.bam
I don't know how to fix it. Could you please help me figure this out? GATK and Strelka can both produce results with this BAM.
Thank you!
What I pasted from the BAM is multiple records.
Is it caused by the bwa mem -a parameter?
I will check it!
Hi Chen, the good news is that I remapped the reads and generated the BAM file without the bwa mem -a parameter.
But I don't understand why that helps. Could you suggest a way to handle this after mapping is already done?
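As a possible answer to the "after mapping" question: bwa mem's `-a` option reports all found alignments and flags the extra ones as secondary (FLAG bit 0x100), and those are the records shown with `*` for SEQ in this thread. One workaround, offered as an assumption rather than something confirmed here, is to drop secondary alignments from the existing BAM instead of remapping. With samtools this kind of filter is usually written as `samtools view -b -F 256 in.bam > out.bam` (file names hypothetical). The sketch below shows the same filter in plain Python on tab-delimited SAM text; `drop_secondary` is a hypothetical helper name, and whether Manta then accepts the filtered file has not been verified in this thread.

```python
SECONDARY = 0x100  # SAM FLAG bit for a secondary alignment


def drop_secondary(sam_lines):
    """Yield header lines and all alignment lines that are not secondary."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            # Not a valid alignment line; skip it in this sketch.
            continue
        if int(fields[1]) & SECONDARY:
            continue
        yield line


# Hypothetical, shortened records for illustration only.
sample = [
    "@HD\tVN:1.6\tSO:coordinate",
    "readA\t385\tchr1\t100\t0\t10M\tchr2\t500\t0\t*\t*",
    "readA\t129\tchr1\t150\t22\t10M\tchr2\t500\t0\tACGTACGTAC\tFFFFFFFFFF",
    "readA\t401\tchr1\t300\t0\t10M\tchr2\t500\t0\t*\t*",
]

for line in drop_secondary(sample):
    print(line)
```

Filtering keeps the primary record (flag 129 in the sample) together with its sequence and qualities, so no read is lost entirely; only the additional placements that `-a` added are removed.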
I face the same problem when I create the config with the --bam argument (a single sample in my case). But when I run the command below, it works fine:
configManta.py --tumorBam=sample.sort.bam --runDir=. --referenceFasta=ref.fa
@huangl07: Could you please comment on how to tackle this problem for diploid samples, and on how different the pipeline for diploid sample processing is from the one for a tumor sample without a matched normal sample?
Dear Manta support,
I ran the analysis with:
configManta.py --bam=SRR5739119.mkdup.bam --bam=SRR5739120.mkdup.bam --bam=SRR5739121.mkdup.bam --bam=SRR5739122.mkdup.bam --bam=SRR5739123.mkdup.bam --referenceFasta=../Genome/ref.fa --runDir=SV
and it fails with an error like this: