ccgd-profile / BreaKmer

A method to identify structural variation from sequencing data in target regions

ValueError: too many values to unpack #13

Closed cruckert closed 9 years ago

cruckert commented 9 years ago

When trying to run BreaKmer I get the following error:

Traceback (most recent call last):
  File "/opt/BreaKmer_v.0.0.2/breakmer.py", line 71, in <module>
    RUN_TRACKER.run()
  File "/opt/BreaKmer_v.0.0.2/breakmer/processor/analysis.py", line 143, in run
    aggResults = analyze_targets(targetAnalysisList)
  File "/opt/BreaKmer_v.0.0.2/breakmer/processor/analysis.py", line 79, in analyze_targets
    if not targetRegion.find_sv_reads(): # No SV reads extracted. Exiting.
  File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 551, in find_sv_reads
    if not self.clean_reads('sv'): # Check if there are any reads left to analyze after cleaning.
  File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 584, in clean_reads
    return self.variation.clean_reads(self.paths['data'], self.name, sampleType)
  File "/opt/BreaKmer_v.0.0.2/breakmer/processor/target.py", line 226, in clean_reads
    self.files['%s_cleaned_fq' % sampleType], self.cleaned_read_recs[sampleType] = utils.get_fastq_reads(self.files['%s_cleaned_fq' % sampleType], self.get_sv_reads(sampleType))
  File "/opt/BreaKmer_v.0.0.2/breakmer/utils.py", line 529, in get_fastq_reads
    for header, seq, qual in FastqFile(fn):
  File "/opt/BreaKmer_v.0.0.2/breakmer/utils.py", line 837, in next
    inst, lane, tile, x, y_end = header.split(':')
ValueError: too many values to unpack

I am guessing it has something to do with my input files.

ryanabo commented 9 years ago

This is most likely caused by a read header format that BreaKmer does not expect. Can you copy and paste a few of the reads from your BAM file?

cruckert commented 9 years ago

These were Illumina MiSeq reads aligned with BWA-mem.

M03544:28:000000000-AH780:1:1101:17959:6059 163 1 45794774 60 151M = 45794957 333 GGGCGTGGTGGCTCATGCCTGTAATCCAAGCACTTTAGGAGGCTGAAGTGGGAGAATCACTTGAGGCCAGAATCACCTGAGTTCGAAACCAGTCTGAGCAACATAGCGAGACCCCCATCTCAAAAAAATACAACATAAAATAACTACAAAA ABABCBCB?CAGGHHGJIEGIFGGFHFHGIHHHHGGFIGGIFHHIGGIEIGGGHFGGHHHHGIGIGHDHIFGGIHIFIIHIEGHCGGFGFIIEHHIGIIHGIHGFIICCFFGDGDDCGGHHGGFGFFFGFHIGHHGGFGHGFGHHFGEDAF MC:Z:150M BD:Z:LLLLMMNMLNMLLLMNNNMKMLMKKMMLJKMMIMKEJLMMNNLLMNMLMNMLMNMMLMMJNLLNNNLMMNMMLMMJMKMNNMLMMMMDLMMNMMLMNNNNKLMNLLNMMNMMMKKKMNMLLMKDDDDDLLMMKLMNLLDDLLLLNLMMKDD MD:Z:151 PG:Z:MarkDuplicates RG:Z:000000000-AH780.1 BI:Z:RRNRRRSRSSRRSRSSSSRSSSSRSSSSSRSRSSSNSSRRRRRSSSQRSSRNRRRQSSSSSSSSRRRRSSRQSSSSSSSSRSSSRRQJSSSSSSSSSRSRSSSSRSSRRRRRSRRRSSSSRSSJJJJJSRSSSSSSRRJJSRRSSRSSSJJ NM:i:0 MQ:i:60 AS:i:151 XS:i:20

M03544:28:000000000-AH780:1:1101:22222:7198 163 1 45794774 60 151M = 45795071 448 GGGCGTGGTGGCTCATGCCTGTAATCCAAGCACTTTAGGAGGCTGAAGTGGGAGAATCACTTGAGGCCAGAATCACCTGAGTTCGAAACCAGTCTGAGCAACATAGCGAGACCCCCATCTCAAAAAAATACAACATAAAATAACTACAAAA 2>>BCBBB?AAGGHHGJIEGIFGGGHFHFIHHHHGGFIGGIFHHIFGIEIGFGHFGGHHHHGIGHGHDHIGFGIHIFIIHIEGHCFGFGFIIEGHIGIIHGIGGFIICFHFGDCDDGFHHHHGCGGFFGFHGGHHGEFGHFEGGHFFEDDG MC:Z:151M BD:Z:LLLLMMNMLNMLLLMNNNMKMLMKKMMLJKMMIMKEJLMMNNLLMNMLMNMLMNMMLMMJNLLNNNLMMNMMLMMJMKMNNMLMMMMDLMMNMMLMNNNNKLMNLLNMMNMMMKKKMNMLLMKDDDDDLLMMKLMNLLDDLLLLNLMMKDD MD:Z:151 PG:Z:MarkDuplicates RG:Z:000000000-AH780.1 BI:Z:RRNRRRSRSSRRSRSSSSRSSSSRSSSSSRSRSSSNSSRRRRRSSSQRSSRNRRRQSSSSSSSSRRRRSSRQSSSSSSSSRSSSRRQJSSSSSSSSSRSRSSSSRSSRRRRRSRRRSSSSRSSJJJJJSRSSSSSSRRJJSRRSSRSSSJJ NM:i:0 MQ:i:60 AS:i:151 XS:i:20
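
For reference, the failure can be reproduced outside BreaKmer with the read name from the first record above. This is only a minimal sketch: the unpacking statement is copied from the traceback, and the variable names `fields` and `error_message` are introduced here for illustration. The read name has seven colon-separated fields, while the statement expects exactly five.

```python
# Read name taken from the first SAM record above.
header = "M03544:28:000000000-AH780:1:1101:17959:6059"

# Splitting on ':' yields seven fields, not five.
fields = header.split(":")
print(len(fields))

try:
    # Same unpacking statement as in the traceback (utils.py).
    inst, lane, tile, x, y_end = header.split(":")
except ValueError as err:
    # Python 2 reports "too many values to unpack";
    # Python 3 appends "(expected 5)" to the same message.
    error_message = str(err)
    print(error_message)
```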

ryanabo commented 9 years ago

The header is no longer parsed, so these formatting issues should not be a problem. Please pull down the latest master and try it. Thanks for trying out the program and reporting the bug.

Shruti-Desai commented 7 years ago

I am still getting the same error in v0.0.6:

Traceback (most recent call last):
  File "/home/blade4/ToolKit/BreaKmer-0.0.6/breakmer.py", line 96, in <module>
    r.run(tic)
  File "/home/blade4/ToolKit/BreaKmer-0.0.6/sv_processor.py", line 191, in run
    if not trgt.clean_reads() : # Clean soft-clipped and reads that did not map together or at all
  File "/home/blade4/ToolKit/BreaKmer-0.0.6/sv_processor.py", line 601, in clean_reads
    self.files['cleaned_fq'], self.cleaned_read_recs, self.read_len = get_fastq_reads(self.files['cleaned_fq'], self.sv_reads)
  File "/home/blade4/ToolKit/BreaKmer-0.0.6/utils.py", line 210, in get_fastq_reads
    for header,seq,qual in FastqFile(fn) :
  File "/home/blade4/ToolKit/BreaKmer-0.0.6/utils.py", line 703, in next
    inst,lane,tile,x,y_end = header.split(':')
ValueError: too many values to unpack

Input: Illumina HiSeq reads aligned with BWA-mem; the alignments were converted to BAM, indexed, sorted, and realigned.
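
For anyone stuck on a release that still unpacks the header, one possible local workaround is a parse that tolerates both Illumina read name layouts: the older `instrument:lane:tile:x:y` form and the newer `instrument:run:flowcell:lane:tile:x:y` form. The sketch below is an assumption, not BreaKmer code: `parse_read_name` is a hypothetical helper, and whether BreaKmer actually uses these fields downstream is not confirmed in this thread.

```python
def parse_read_name(header):
    """Hypothetical helper (not part of BreaKmer).

    Returns (inst, lane, tile, x, y_end) for read names with five or
    more colon-separated fields, or None if the name has fewer.
    """
    # Drop a leading FASTQ '@' and anything after the first whitespace.
    name = header.lstrip("@").split()[0]
    fields = name.split(":")
    if len(fields) < 5:
        return None
    # The instrument is always first; lane, tile, x, y are always last,
    # so extra middle fields (run number, flowcell ID) are skipped.
    inst = fields[0]
    lane, tile, x, y_end = fields[-4:]
    return inst, lane, tile, x, y_end


# Newer layout, as in the MiSeq reads posted earlier in this thread.
print(parse_read_name("M03544:28:000000000-AH780:1:1101:17959:6059"))
# Older five-field layout.
print(parse_read_name("@HWUSI-EAS100R:6:73:941:1973#0/1"))
```

Returning None for unrecognized names, rather than raising, lets the caller decide whether to skip the read or fall back to using the whole name.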