Closed alhafidzhamdan closed 4 years ago
how do I generate Tumor_Varcount
This is the VF
field for breakpoints, and the BVF
field for single breakends.
GRIDSS reports total support in the INFO column, as well as a per-sample breakdown.
Tumor_Depth and Normal_Depth
These fields don't make sense to me as it is unclear what 'depth' is, nor why they only want 2 depths and not 4.
GRIDSS reports REF
and REFPAIR
which are the number of reference-supporting reads spanning the breakpoint, and the number of reference-supporting read pairs spanning the breakpoint. Again, GRIDSS reports totals in the INFO column, as well as a per-sample breakdown so for these you'll want a per-sample calculation for normal reads and the same calculation on the tumour reads.
The big issue I have is that those depths are not per-breakpoint. Depth is calculated as variant reads + reference reads. The problem is reference reads can be completely different on each side of the breakpoints. A breakpoints can be homozyous on one side and heterozygous on the other. I don't know what their depth fields actually mean. Maybe average it with (REF1+REF2+REFPAIR1+REFPAIR2)/2+VF? I'm dubious of how good the result of running the tool will be if the inputs are in a format that doesn't make sense.
Note that for small variants, REFPAIR read pairs may actually be from the variant haplotype. A 50bp DEL with a 500bp fragment will have reads containing the DEL that appear to still support the reference (e.g. a fragment that's actually 500bp long that look 550bp since it spans the DEL). If you don't account for this, you'll underestimate the tumour allele fraction for small events. libgridss.R
has some helper functions that accounts for this in the VAF reporting functions.
If you're wanting to do somatic SV interpretation and analysis, I strongly recommend trying our GRIDSS/PURPLE/LINX pipline (https://www.biorxiv.org/content/10.1101/781013v1) as the LINX output gives a much much more comprehensive picture of the somatic rearrangement landscape than any other tools currently available.
Alternatively, depth might be the read depth between the breakpoints for dup/del/inv events.
INV is also problematic as there are way more fold-back inversions than actual balanced (two-breakpoint) inversion in cancer genomes.
See https://github.com/hartwigmedical/hmftools/tree/master/sv-linx#key-concepts-in-linx for the model we use.
Thanks for this! Appreciate the caveat explanations. I share your concerns re: ambiguity in "depth" info. I have run GRIDSS-PURPLE-LINX but I don't think LINX provides clonality information of the breakends? Unless I'm missing something.
I don't think LINX provides clonality information of the breakends?
It does purity-adjusted VAF and infers total SV copy number but no, it doesn't explicitly estimate clonality. Anything with low inferred SV copy number is clearly subclonal but it won't tell you a CN=1 SV is clonal or a CN=2 at 50% clonality.
Hi there,
I am trying to run palimpsest https://github.com/FunGeST/Palimpsest. Their SV requirement is
I can use your handy simpleEventType() function to convert BND types. However, and I know this has been asked before in various forms in this forum, but it's better to ask directly from you rather than inferring it myself, how do I generate
Tumor_Varcount
,Tumor_Depth
andNormal_Depth
?Many thanks Al