hall-lab / svtyper

Bayesian genotyper for structural variants
MIT License

SVTYPE=INS #39

Closed (zeeev closed 7 years ago)

zeeev commented 7 years ago

Hi @cc2qe,

I've split a VCF into many pieces. Some are finishing just fine; others contain some kind of call that is throwing the following error.

It looks like INS is being genotyped in most cases? Is this accidental? It runs fine without INS.

    File "/net/eichler/vol8/home/zevk/tools/svtyper/svtyper", line 1519, in <module>
      sys.exit(main())
    File "/net/eichler/vol8/home/zevk/tools/svtyper/svtyper", line 1514, in main
      args.dump)
    File "/net/eichler/vol8/home/zevk/tools/svtyper/svtyper", line 1273, in sv_genotype
      if o1_is_reverse: posA += 1
    UnboundLocalError: local variable 'o1_is_reverse' referenced before assignment
    Error in job genotype while creating output file split_calls/split_bh.genotyped.vcf.
    RuleException: CalledProcessError in line 12 of /net/eichler/vol24/projects/structural_variation/nobackups/zevk/wham/genotype/Snakefile:
    Command '
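
For readers unfamiliar with this error class, the failure mode in the traceback can be reproduced in isolation. This is only a minimal sketch: it assumes the orientation flag is assigned inside branches for specific SV types, which is a guess at the shape of the logic and not svtyper's actual code. The function name `adjust_pos` and the branch contents are hypothetical; only `o1_is_reverse` and `posA` come from the traceback.

```python
def adjust_pos(svtype, posA):
    # Hypothetical stand-in for the branch logic; not svtyper's real code.
    if svtype == "DEL":
        o1_is_reverse = False
    elif svtype == "DUP":
        o1_is_reverse = True
    # There is no branch for "INS", so o1_is_reverse is never assigned
    # on that path and the read below raises UnboundLocalError.
    if o1_is_reverse:
        posA += 1
    return posA


try:
    adjust_pos("INS", 1442939)
except UnboundLocalError as err:
    message = str(err)
    print(message)
```

The exact wording of the message differs between Python versions, but it names the unassigned variable in each case.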

cc2qe commented 7 years ago

SVTyper doesn't currently support insertion calls. We don't identify them with LUMPY, so we have ignored them so far.

How are you guys annotating INS in the VCF? The best way to enable insertion genotyping isn't quite clear to me, but it would require knowing the insertion length, its sequence, or both.

For now I've just modified svtyper to skip VCF lines with unknown variant types so it doesn't crash (https://github.com/hall-lab/svtyper/commit/eb950034061e4996314a4d1fd144dbbdefb0fe7d).
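
The linked commit is not reproduced here. As a rough illustration of the idea of skipping records with an unknown SVTYPE, the sketch below can also be used to pre-filter a VCF before genotyping. The set of supported types, the function names, and the DEL record are assumptions made for the example. Whether the real tool drops such lines or passes them through ungenotyped is not stated in this thread; this sketch simply drops them.

```python
# Assumed set of supported types, for illustration only.
SUPPORTED_SVTYPES = {"BND", "DEL", "DUP", "INV"}


def get_svtype(vcf_line):
    """Return the SVTYPE value from the INFO column, or None if absent."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return None
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        if key == "SVTYPE":
            return value
    return None


def filter_supported(lines):
    """Yield header lines and records with a supported SVTYPE; skip the rest."""
    for line in lines:
        if line.startswith("#") or get_svtype(line) in SUPPORTED_SVTYPES:
            yield line


# Made-up input: one header line, one invented DEL record, one INS record.
records = [
    "##fileformat=VCFv4.2\n",
    "chr1\t1000\t.\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=2000\n",
    "chr1\t1442939\t.\tT\t<INS>\t.\t.\tSVTYPE=INS;END=1442939;\n",
]
kept = list(filter_supported(records))
```

Here `kept` retains the header and the DEL record, and the INS record is skipped.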

zeeev commented 7 years ago

@cc2qe,

Thanks for skipping unknown types.

Here is an example of a WHAMG insertion call. I'm not estimating length or the inserted sequence.

chr1    1442939 .       T       <INS>   .       .       SVTYPE=INS;END=1442939;
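
To make concrete what this record does and does not carry, here is a small sketch that parses it and checks for the two pieces of information mentioned earlier in the thread: an insertion length and an inserted sequence. It assumes the columns are tab-separated in the real file and uses `SVLEN` (a reserved INFO key in the VCF specification) as the length annotation. The helper names are hypothetical.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    parsed = {}
    for entry in info.split(";"):
        if not entry:
            continue  # tolerate the trailing ';' in the example record
        key, sep, value = entry.partition("=")
        parsed[key] = value if sep else True
    return parsed


def insertion_details(vcf_line):
    """Report the SVTYPE and whether a length or sequence is available."""
    chrom, pos, _id, ref, alt, _qual, _filter, info = vcf_line.split("\t")[:8]
    fields = parse_info(info)
    symbolic = alt.startswith("<") and alt.endswith(">")
    return {
        "svtype": fields.get("SVTYPE"),
        "has_length": "SVLEN" in fields,
        # A literal ALT longer than REF would spell out the inserted bases.
        "has_sequence": not symbolic and len(alt) > len(ref),
    }


record = "chr1\t1442939\t.\tT\t<INS>\t.\t.\tSVTYPE=INS;END=1442939;"
details = insertion_details(record)
print(details)
# {'svtype': 'INS', 'has_length': False, 'has_sequence': False}
```

Both checks come back false for this call, since the ALT allele is the symbolic `<INS>` and the INFO column has no `SVLEN`.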