Illumina / manta

Structural variant and indel caller for mapped sequencing data
GNU General Public License v3.0

python errors running convertInversion.py #219

Closed mooreann closed 4 years ago

mooreann commented 4 years ago

I've been trying to run convertInversion.py to convert Manta 1.6 output back to the 1.4 format for inversions. I'm running it in a Python 2 environment to rule out version problems. Some VCFs convert fine, but others fail with the error below, even though I run them exactly the same way and found no issues when I checked the failing files manually:

    Traceback (most recent call last):
      File "convertInversion.py", line 291, in <module>
        invMateDict = scanVcf(vcfFile)
      File "convertInversion.py", line 108, in scanVcf
        vcfRec.checkInversion()
      File "convertInversion.py", line 73, in checkInversion
        getMateInfo(']')
      File "convertInversion.py", line 65, in getMateInfo
        [self.mateChr] = items[1].split(':')
    ValueError: too many values to unpack

Is this another versioning thing or has anyone seen this before?

apaul7 commented 4 years ago

I ran into the same error. It occurs when a breakpoint is located on a chromosome whose name contains a ':', because the script splits the mate's "chr:pos" string on ':' and then gets more fields than it expects. I can think of two solutions. The first is to remove the problematic entries from the input VCF. The second is to modify convertInversion.py.
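The first option, dropping the problematic records, could be sketched roughly as follows. This is only an illustration: `has_colon_contig` and `filter_vcf_lines` are hypothetical helper names, not part of Manta, and the sketch assumes an uncompressed, tab-delimited VCF with one ALT per record and breakend ALTs of the form `]chr:pos]N` or `N[chr:pos[`. It drops a record if either its own contig or its mate's contig contains a ':', so that both ends of an affected pair are removed together.

```python
import re

# Captures the "contig:pos" part of a breakend ALT such as "]chr1:12345]N" or "N[chr1:12345[".
# The greedy (.+) keeps any ':' inside the contig name; only the trailing digits are the position.
BND_MATE = re.compile(r'[\[\]](.+):(\d+)[\[\]]')

def has_colon_contig(vcf_line):
    """Return True if a record sits on, or points at, a contig whose name contains ':'."""
    fields = vcf_line.rstrip('\n').split('\t')
    chrom, alt = fields[0], fields[4]
    if ':' in chrom:
        return True
    match = BND_MATE.search(alt)
    return bool(match) and ':' in match.group(1)

def filter_vcf_lines(lines):
    """Yield header lines unchanged and skip records that involve a ':' contig."""
    for line in lines:
        if line.startswith('#') or not has_colon_contig(line):
            yield line
```

To use it, one could pass an open input file to `filter_vcf_lines` and write each yielded line to a new VCF, then run convertInversion.py on that file. Note that this discards calls on those contigs entirely, which may or may not be acceptable for a given analysis.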

I modified lines 62 to 66 of the Python script: https://github.com/Illumina/manta/blob/75b5c38d4fcd2f6961197b28a41eb61856f2d976/src/python/libexec/convertInversion.py#L62

        def getMateInfo(splitChar):
            items = self.alt.split(splitChar)
            itemsOneSplit = items[1].split(':')
            if len(itemsOneSplit) > 2:
                # The contig name itself contains ':'; the last field is the position
                matePos = itemsOneSplit[-1]
                # Rejoin every field before the position to recover the full contig name
                self.mateChr = ':'.join(itemsOneSplit[:-1])
            else:
                [self.mateChr, matePos] = itemsOneSplit
            self.matePos = int(matePos)

This may not be the most efficient but it seems to be working for me!
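For comparison, the same parsing can be written more compactly by splitting on the last ':' only. The sketch below is a standalone illustration, not code from convertInversion.py: `parse_mate` is a hypothetical helper name, and the HLA-style contig name is just an example of a name that contains colons. It should run under both Python 2 and Python 3.

```python
def parse_mate(alt, split_char):
    """Return (mate_chrom, mate_pos) from a breakend ALT such as ']chr1:12345]N'.

    Hypothetical standalone helper for illustration; not part of Manta.
    """
    # For ALT "]chr1:12345]N" and split_char "]", items is ['', 'chr1:12345', 'N'].
    items = alt.split(split_char)
    # Split on the LAST ':' only, so contig names that contain ':' stay intact.
    mate_chrom, mate_pos = items[1].rsplit(':', 1)
    return mate_chrom, int(mate_pos)

print(parse_mate(']chr1:12345]N', ']'))              # ('chr1', 12345)
print(parse_mate(']HLA-A*01:01:01:01:2000]N', ']'))  # ('HLA-A*01:01:01:01', 2000)
```

Inside the script, the equivalent change would presumably be to replace the `split(':')` call in `getMateInfo` with `rsplit(':', 1)`, which avoids the length check altogether.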