Closed stsergbg closed 2 years ago
GRIDSS performs joint assembly and variant calling across all samples. The INFO fields
correspond to the aggregate across all the inputs. The majority of these fields have matching FORMAT
fields which provide the per-sample breakdown. The only ones that don't are fields for which a per-sample breakdown doesn't make sense.
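To make the INFO/FORMAT relationship concrete, here is a minimal Python sketch that pulls one field out of a single VCF data line and returns both the variant-level INFO value and the per-sample FORMAT values. The record below is synthetic, the sample names are hypothetical, and treating `SR` and `VF` as fields that appear in both INFO and FORMAT is an assumption you should check against the header of your own VCF.

```python
def info_and_format(vcf_line, sample_names, field):
    """Return (INFO value, {sample: FORMAT value}) for one field of a VCF data line."""
    cols = vcf_line.rstrip("\n").split("\t")
    info_col, format_col, sample_cols = cols[7], cols[8], cols[9:]

    # INFO is a ';'-separated list of KEY=VALUE pairs (flag entries have no '=').
    info = dict(kv.split("=", 1) for kv in info_col.split(";") if "=" in kv)

    # FORMAT names the ':'-separated keys used in every sample column.
    keys = format_col.split(":")
    per_sample = {}
    for name, col in zip(sample_names, sample_cols):
        values = dict(zip(keys, col.split(":")))
        per_sample[name] = values.get(field)  # None if no per-sample breakdown
    return info.get(field), per_sample


# Synthetic record for illustration only; all values are made up.
record = "\t".join([
    "chr1", "1000", "bnd_1o", "N", "N[chr1:2000[", "500", "PASS",
    "SVTYPE=BND;CIPOS=-2,2;SR=12;VF=20",
    "GT:SR:VF",
    ".:0:0", ".:5:8", ".:7:12",
])
samples = ["normal", "tumour1", "tumour2"]  # hypothetical sample names

info_vf, vf_by_sample = info_and_format(record, samples, "VF")
info_cipos, cipos_by_sample = info_and_format(record, samples, "CIPOS")
print(info_vf, vf_by_sample)
print(info_cipos, cipos_by_sample)
```

In this made-up record, `VF` has a per-sample value for every sample, while `CIPOS` exists only at the variant level, so its per-sample lookup comes back empty.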
For example, there's no per-sample breakdown of supporting assembly counts. Since the assembly is a joint assembly, there was never a per-sample assembly to break down. That said, GRIDSS does provide a per-sample breakdown of the supporting reads that make up each of these assemblies. In the case of complex rearrangements where an assembly spans multiple SVs (e.g. a germline indel flanking a somatic breakpoint), GRIDSS will even pro-rate the assembly support across the length of the contig to ensure that the somatic breakpoint isn't called as germline due to the germline reads that support the germline indel part of the assembly.
> Especially interesting are fields like CIPOS
Specification-defined fields like CIPOS describe the variant itself. Whether the variant is present in the germline and/or the tumour depends on the level of support for that variant in each sample (which you can get from the corresponding FORMAT field).
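One way to act on this is to look at the per-sample support counts and ask which samples carry the variant. The sketch below is only an illustration: the `min_support` threshold, the function names, and the "germline"/"somatic"/"unsupported" labels are hypothetical choices of mine and not GRIDSS's own filtering logic. The counts are assumed to come from a per-sample FORMAT field of your choosing.

```python
def samples_with_support(support_by_sample, min_support=1):
    """Names of samples whose support count reaches the (hypothetical) threshold."""
    return [name for name, count in support_by_sample.items() if count >= min_support]


def classify(support_by_sample, normal_name, min_support=1):
    """Label a variant from per-sample support; labels and threshold are illustrative."""
    carriers = samples_with_support(support_by_sample, min_support)
    if normal_name in carriers:
        return "germline"      # the matched normal also supports the variant
    if carriers:
        return "somatic"       # only tumour samples support the variant
    return "unsupported"       # no sample reaches the threshold


# Made-up per-sample support counts for three different variants.
somatic_call = classify({"normal": 0, "tumour1": 8, "tumour2": 12}, "normal")
germline_call = classify({"normal": 6, "tumour1": 8, "tumour2": 0}, "normal")
weak_call = classify({"normal": 0, "tumour1": 1, "tumour2": 0}, "normal", min_support=3)
print(somatic_call, germline_call, weak_call)
```

A real analysis would likely use a stricter, depth-aware criterion. The point here is only that the decision is made per sample from FORMAT values, while CIPOS and similar INFO fields stay the same for every sample.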
GRIDSS also reports additional variant-level fields not (yet) defined in the VCF specifications (e.g. IHOMPOS).
Hi!
I am carrying out joint GRIDSS calling of a normal sample and several tumor samples. What I do not quite understand yet is what the INFO fields of variants represent: some average/median/min/max statistic? If so, are only the tumor samples considered in the calculation? Fields like CIPOS are especially interesting.
Thank you in advance, Sergey