liguowang / CrossMap

CrossMap is a Python program to lift over genome coordinates from one genome version to another.
https://crossmap.readthedocs.io/en/latest/

CrossMap VCF KeyError: '>' #43

Closed ew367 closed 2 years ago

ew367 commented 2 years ago

Hi

I am trying to run CrossMap on a set of imputed VCF files (human). The files are split into individual chromosomes. For some of the chromosome files CrossMap works fine, but for around half of them I get the following error:

2022-02-15 03:31:17 [INFO] Lifting over ...

Traceback (most recent call last):
  File "mydir/.conda/envs/crossMap/bin/CrossMap.py", line 319, in
    crossmap_vcf_file(mapping = mapTree, infile= in_file, outfile = out_file, liftoverfile = chain_fis.compression, cstyle = args.cstyle)
  File "mydir.conda/envs/crossMap/lib/python3.8/site-packages/cmmodule/mapvcf.py", line
    fields[4] = revcomp_DNA(fields[4], True)
  File "mydir/.conda/envs/crossMap/lib/python3.8/site-packages/cmmodule/utils.py", line
    return ''.join([complement[base] for base in reversed(seq)])
  File "mydir/.conda/envs/crossMap/lib/python3.8/site-packages/cmmodule/utils.py", line
    return ''.join([complement[base] for base in reversed(seq)])
KeyError: '>'

Any advice on how I can solve this would be great, thanks!

liguowang commented 2 years ago

I guess there are some non-standard characters in the 5th field (i.e., the alternative allele field) of your VCF file. Please check.

You can have these characters in the 5th field: 'A','C','G','T','Y','R','S','W','K','M','B','V','D','H','N','.','*'. Multiple alternative alleles can be separated by ",".
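
One way to run that check is a small script that scans the 5th column of each record and reports any character outside the list above. The following is a minimal sketch, not part of CrossMap: the names `ALLOWED`, `open_text`, `find_bad_alt` and `report` are hypothetical, and the check is case-sensitive on the assumption that only the uppercase codes listed above are accepted. A '>' in this column often comes from a symbolic allele written in angle brackets (for example `<DEL>`), though that is a guess about these particular files.

```python
import gzip

# Characters listed above as acceptable in the ALT (5th) field.
# Assumption: the check is case-sensitive, since only uppercase codes are listed.
ALLOWED = set("ACGTYRSWKMBVDHN.*")


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def find_bad_alt(lines):
    """Yield (line_number, alt_field, bad_chars) for each record whose
    ALT field contains characters outside ALLOWED."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        alt = fields[4]
        # ALT may hold several comma-separated alleles; check each one.
        seen = {char for allele in alt.split(",") for char in allele}
        bad = sorted(seen - ALLOWED)
        if bad:
            yield number, alt, bad


def report(path):
    """Print every offending record found in the VCF file at `path`."""
    with open_text(path) as handle:
        for number, alt, bad in find_bad_alt(handle):
            print(f"line {number}: ALT={alt!r} unexpected: {''.join(bad)}")
```

Calling `report()` on one of the failing chromosome files should list the records that trigger the KeyError; those records can then be filtered out or rewritten before running CrossMap again.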