Closed by cruckert 9 years ago
This is most likely due to an incompatibility of the header of the sequence reads. Can you copy and paste a few of the reads from your bam file?
These were Illumina MiSeq reads aligned with BWA-mem.
M03544:28:000000000-AH780:1:1101:17959:6059 163 1 45794774 60 151M = 45794957 333 GGGCGTGGTGGCTCATGCCTGTAATCCAAGCACTTTAGGAGGCTGAAGTGGGAGAATCACTTGAGGCCAGAATCACCTGAGTTCGAAACCAGTCTGAGCAACATAGCGAGACCCCCATCTCAAAAAAATACAACATAAAATAACTACAAAA ABABCBCB?CAGGHHGJIEGIFGGFHFHGIHHHHGGFIGGIFHHIGGIEIGGGHFGGHHHHGIGIGHDHIFGGIHIFIIHIEGHCGGFGFIIEHHIGIIHGIHGFIICCFFGDGDDCGGHHGGFGFFFGFHIGHHGGFGHGFGHHFGEDAF MC:Z:150M BD:Z:LLLLMMNMLNMLLLMNNNMKMLMKKMMLJKMMIMKEJLMMNNLLMNMLMNMLMNMMLMMJNLLNNNLMMNMMLMMJMKMNNMLMMMMDLMMNMMLMNNNNKLMNLLNMMNMMMKKKMNMLLMKDDDDDLLMMKLMNLLDDLLLLNLMMKDD MD:Z:151 PG:Z:MarkDuplicates RG:Z:000000000-AH780.1 BI:Z:RRNRRRSRSSRRSRSSSSRSSSSRSSSSSRSRSSSNSSRRRRRSSSQRSSRNRRRQSSSSSSSSRRRRSSRQSSSSSSSSRSSSRRQJSSSSSSSSSRSRSSSSRSSRRRRRSRRRSSSSRSSJJJJJSRSSSSSSRRJJSRRSSRSSSJJ NM:i:0 MQ:i:60 AS:i:151 XS:i:20
M03544:28:000000000-AH780:1:1101:22222:7198 163 1 45794774 60 151M = 45795071 448 GGGCGTGGTGGCTCATGCCTGTAATCCAAGCACTTTAGGAGGCTGAAGTGGGAGAATCACTTGAGGCCAGAATCACCTGAGTTCGAAACCAGTCTGAGCAACATAGCGAGACCCCCATCTCAAAAAAATACAACATAAAATAACTACAAAA 2>>BCBBB?AAGGHHGJIEGIFGGGHFHFIHHHHGGFIGGIFHHIFGIEIGFGHFGGHHHHGIGHGHDHIGFGIHIFIIHIEGHCFGFGFIIEGHIGIIHGIGGFIICFHFGDCDDGFHHHHGCGGFFGFHGGHHGEFGHFEGGHFFEDDG MC:Z:151M BD:Z:LLLLMMNMLNMLLLMNNNMKMLMKKMMLJKMMIMKEJLMMNNLLMNMLMNMLMNMMLMMJNLLNNNLMMNMMLMMJMKMNNMLMMMMDLMMNMMLMNNNNKLMNLLNMMNMMMKKKMNMLLMKDDDDDLLMMKLMNLLDDLLLLNLMMKDD MD:Z:151 PG:Z:MarkDuplicates RG:Z:000000000-AH780.1 BI:Z:RRNRRRSRSSRRSRSSSSRSSSSRSSSSSRSRSSSNSSRRRRRSSSQRSSRNRRRQSSSSSSSSRRRRSSRQSSSSSSSSRSSSRRQJSSSSSSSSSRSRSSSSRSSRRRRRSRRRSSSSRSSJJJJJSRSSSSSSRRJJSRRSSRSSSJJ NM:i:0 MQ:i:60 AS:i:151 XS:i:20
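For anyone comparing their own data: the read names above have seven colon-separated fields, which matches the newer Illumina naming layout (instrument, run number, flowcell ID, lane, tile, x, y). Here is a rough Python sketch for checking that. The helper name `describe_read_name` and the field labels are illustrative assumptions and are not part of BreaKmer.

```python
# Hypothetical helper (not BreaKmer code): label the colon-separated
# fields of a newer-style Illumina read name.
FIELD_NAMES = ("instrument", "run", "flowcell", "lane", "tile", "x", "y")


def describe_read_name(qname):
    """Return a dict mapping field labels to values, or raise ValueError."""
    parts = qname.split(":")
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            "expected %d fields, got %d" % (len(FIELD_NAMES), len(parts))
        )
    return dict(zip(FIELD_NAMES, parts))


# First read name from the records posted above.
fields = describe_read_name("M03544:28:000000000-AH780:1:1101:17959:6059")
print(len(fields), fields["flowcell"], fields["lane"], fields["tile"])
```

A name with any other number of fields raises a `ValueError`, so the same helper can be used to spot older-style read names.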
The header is no longer parsed, so these formatting issues should no longer be a problem. Please pull the latest master and try again. Thanks for trying out the program and reporting the bug.
Still getting the same error in v0.0.6.
Error:
Traceback (most recent call last):
File "/home/blade4/ToolKit/BreaKmer-0.0.6/breakmer.py", line 96, in
Input: Illumina HiSeq reads aligned with BWA-mem, converted to BAM, indexed, sorted, and realigned.
When trying to run BreaKmer I get the following error:
Traceback (most recent call last):
File "/opt/BreaKmer_v.0.0.2/breakmer.py", line 71, in
RUN_TRACKER.run()
File "/opt/BreaKmer_v.0.0.2/breakmer/processor/analysis.py", line 143, in run
aggResults = analyze_targets(targetAnalysisList)
File "/opt/BreaKmer_v.0.0.2/breakmer/processor/analysis.py", line 79, in analyze_targets
if not targetRegion.find_sv_reads(): # No SV reads extracted. Exiting.
File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 551, in find_sv_reads
if not self.clean_reads('sv'): # Check if there are any reads left to analyze after cleaning.
File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 584, in clean_reads
return self.variation.clean_reads(self.paths['data'], self.name, sampleType)
File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 226, in clean_reads
self.files['%s_cleaned_fq' % sampleType], self.cleaned_read_recs[sampleType] = utils.get_fastq_reads(self.files['%s_cleaned_fq' % sampleType], self.get_sv_reads(sampleType))
File "/opt/BreaKmer_v.0.0.2/breakmer/utils.py", line 529, in get_fastq_reads
for header, seq, qual in FastqFile(fn):
File "/opt/BreaKmer_v.0.0.2/breakmer/utils.py", line 837, in next
inst, lane, tile, x, y_end = header.split(':')
ValueError: too many values to unpack
I am guessing it has something to do with my input files.
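The `ValueError` in the last frame can be reproduced outside BreaKmer. The unpacking at `utils.py` line 837 expects exactly five colon-separated fields (the older instrument:lane:tile:x:y style), while newer Illumina read names carry seven. Below is a minimal Python sketch. The read name is a seven-field example of the kind posted in this thread, and `split_read_name` is a hypothetical tolerant alternative, not BreaKmer's actual fix.

```python
header = "M03544:28:000000000-AH780:1:1101:17959:6059"

# Same unpacking pattern as the failing line in the traceback:
# five targets, but split(':') yields seven values here.
try:
    inst, lane, tile, x, y_end = header.split(":")
    error = None
except ValueError as exc:
    error = str(exc)
print(error)


def split_read_name(name):
    """Hypothetical tolerant parser (an assumption, not BreaKmer code).

    Splits from the right so the last four fields are lane, tile, x, y,
    and whatever precedes them is kept together as one prefix. Works for
    both five-field and seven-field names. Suffixes such as '/1' on the
    last field are left untouched.
    """
    prefix, lane, tile, x, y = name.rsplit(":", 4)
    return prefix, lane, tile, x, y


print(split_read_name(header))
```

Splitting from the right keeps the code indifferent to how many fields come before the lane. A parser that does not need the individual fields at all could skip the split entirely.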