Closed DarioS closed 5 years ago
This is expected behaviour. GRIDSS reports all variants in breakend (BND) notation. There is no 1-to-1 mapping between events and breakpoints (e.g. an inversion has two breakpoints, and a run of deleted introns is far more likely to be a retrocopied processed gene insertion than a set of actual deletions), and other callers misuse the symbolic SV alleles (e.g. popular callers such as pindel, delly and manta are all non-compliant with the VCF specs). I therefore chose to report in the VCF notation that makes it clear exactly what GRIDSS is asserting.
I am currently championing changes to the VCF (v4.4) specification that make this separation of breakpoint identification and event classification explicit. Until then, DUP-like events can be identified as breakpoints in which both breakends are on the same chromosome and the two breakend orientations face away from each other.
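To make the orientation rule concrete, here is a minimal Python sketch that labels a single breakpoint record from its CHROM, POS and ALT fields. It is not part of GRIDSS: the function name `classify_breakend` and the labels it returns are made up for this example, and it only relies on the four breakpoint ALT forms from the VCF specification (`t[p[`, `t]p]`, `]p]t`, `[p[t`). It ignores inserted sequence and event size, and it looks at one breakpoint at a time, so it cannot recognise complex events.

```python
import re

# The four breakpoint ALT forms defined by the VCF specification:
#   t[p[   t]p]   ]p]t   [p[t
BND_ALT = re.compile(
    r"^(?P<lead>[^\[\]]*)(?P<b1>[\[\]])(?P<chrom>.+):(?P<pos>\d+)"
    r"(?P<b2>[\[\]])(?P<trail>[^\[\]]*)$"
)


def classify_breakend(chrom, pos, alt):
    """Return a simple label for one breakpoint record, or None.

    Hypothetical helper for illustration only; the labels are not
    GRIDSS output and the logic ignores inserted sequence.
    """
    m = BND_ALT.match(alt)
    if m is None or m.group("b1") != m.group("b2"):
        return None  # single breakend, symbolic allele, or malformed ALT
    lead, trail = m.group("lead"), m.group("trail")
    if bool(lead) == bool(trail):
        return None  # bases must sit on exactly one side of the brackets

    mate_chrom, mate_pos = m.group("chrom"), int(m.group("pos"))
    if mate_chrom != chrom:
        return "inter-chromosomal"

    # "+" means the retained sequence lies at or before the position,
    # "-" means it lies at or after the position.
    local_orient = "+" if lead else "-"
    mate_orient = "+" if m.group("b1") == "]" else "-"

    if local_orient == mate_orient:
        return "INV-like"

    # Orientation of whichever breakend has the lower coordinate.
    low_orient = local_orient if pos < mate_pos else mate_orient
    return "DEL-like" if low_orient == "+" else "DUP-like"


if __name__ == "__main__":
    # Both records of a 19 bp tandem-duplication-like breakpoint.
    print(classify_breakend("chr1", 100, "]chr1:119]N"))
    print(classify_breakend("chr1", 119, "N[chr1:100["))
```

Both records of a breakpoint receive the same label, because the decision is based on the orientation of the lower-coordinate breakend regardless of which record is being read.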
If you don't care about correctly identifying complex events, the script in examples/simple-event-annotation.R can be used to convert to BED and annotate with the DEL, DUP classifications that you're likely to be familiar with.
Duplicate of #74
Alright, I'll use the R package StructuralVariantAnnotation. I hope that your suggestions are incorporated into VCF format version 4.4.
I have completed variant calling on one sample and noticed that all of the variants have SVTYPE=BND. I looked at the options of CallVariants, but there doesn't seem to be any option to modify the types of variants output, so perhaps something unusual happened during the analysis. There is a 19-base tandem duplication in an exon which I found using other software, so I know there are other kinds of variants in this sample. I can send files if it helps to troubleshoot.