broadinstitute / gatk-sv

A structural variation pipeline for short-read sequencing
BSD 3-Clause "New" or "Revised" License

MEIs in SplitVariants #649

Open epiercehoffman opened 4 months ago

epiercehoffman commented 4 months ago

svtk vcf2bed uses the ALT field to produce the svtype column in the output BED file, so that column contains BND alt alleles and values like INS:ME for MEIs. However, the current and previous SplitVariants tasks in GenotypeBatch match exactly on the string "INS" when creating insertion-specific BED files, so MEIs are grouped with BCAs instead of with insertions. We should evaluate how this affects genotyping and decide whether MEIs should be grouped with other INS events.
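
To illustrate the mismatch, here is a minimal Python sketch. It is not the actual SplitVariants code; the function names and the sample svtype values are hypothetical, chosen only to contrast an exact match on "INS" with a match that also accepts subtyped values such as INS:ME.

```python
# Hypothetical helpers; these do not exist in gatk-sv or svtk.

def is_insertion_exact(svtype: str) -> bool:
    """Exact string match, as described in the issue."""
    return svtype == "INS"


def is_insertion_prefix(svtype: str) -> bool:
    """Also accept colon-delimited subtypes such as INS:ME."""
    return svtype == "INS" or svtype.startswith("INS:")


# Illustrative svtype column values (assumed, not taken from a real BED file).
svtypes = ["DEL", "DUP", "INS", "INS:ME", "BND"]

exact = [s for s in svtypes if is_insertion_exact(s)]
prefix = [s for s in svtypes if is_insertion_prefix(s)]

print("exact match: ", exact)
print("prefix match:", prefix)
```

With the exact match, INS:ME is left out of the insertion group; with the prefix match it is included. Whether the prefix behavior is the right one for genotyping is the open question here.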

jingydz commented 3 months ago

It appears that you can define the 'svtypes.txt' file yourself.