Closed wangjiawen2013 closed 3 years ago
Yes, Lancet always reports somatic and germline variants relative to the reference genome. The variant in your example would be reported by Lancet as two separate events (on separate lines in the VCF):
A->C (normal)
A->G (tumor)
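To make the "two separate events" idea concrete, here is a minimal Python sketch. It is not Lancet's code or its actual VCF output; the function name `reference_relative_events` and the labels `normal`, `tumor`, and `shared` are hypothetical and chosen only for illustration. It shows how one site with different normal and tumor alleles breaks down into events that are each described against the reference.

```python
def reference_relative_events(ref, normal_allele, tumor_allele):
    """Describe each non-reference allele at one site relative to the reference.

    Illustrative only: returns (ref, alt, label) tuples, where label is
    'normal', 'tumor', or 'shared' (hypothetical labels, not Lancet fields).
    """
    carriers = {}  # alt allele -> list of samples carrying it
    for sample, allele in (("normal", normal_allele), ("tumor", tumor_allele)):
        if allele != ref:
            carriers.setdefault(allele, []).append(sample)
    return [
        (ref, alt, "shared" if len(samples) == 2 else samples[0])
        for alt, samples in carriers.items()
    ]


# The case from this thread: reference 'A', normal 'C', tumor 'G'.
events = reference_relative_events("A", "C", "G")
for ref, alt, label in events:
    print(f"{ref}->{alt} ({label})")
```

With the alleles from this thread, the sketch yields one event for the normal sample and one for the tumor sample, both expressed against the reference 'A'.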
The SnpEff documentation page that you shared has instructions on how to properly annotate the VCF.
Can Lancet be used to call variants from case-control pairs or mutated-rescued pairs (where the mutation is rescued by CRISPR/Cas9 gene editing)? They are very similar to normal-tumor pairs.
If the methodological approach to calling such variants is like the one for somatic variants, then I guess it could work. But we have not tested Lancet in such a scenario. If you do, please let us know whether it works well.
I don't know much about variant discovery. Are there any special considerations when calling variants from normal-tumor pairs? I think we could call variants from the normal sample and the tumor sample separately using a common germline variant caller, then subtract the normal variants from the tumor variants. I know this is probably wrong, but it's my naive thought.
It's not wrong, but it is suboptimal and more error-prone. The original Lancet paper explains the reasons.
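For illustration, the naive "call separately, then subtract" approach discussed above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `naive_somatic` is hypothetical, calls are represented as plain `(chrom, pos, ref, alt)` tuples, and the coordinates are made up. It also shows one commonly cited failure mode: a germline variant that the caller missed in the normal sample (for example, because of low coverage) survives the subtraction and looks somatic.

```python
def naive_somatic(tumor_calls, normal_calls):
    """Return tumor calls that are absent from the normal call set.

    Each call is a (chrom, pos, ref, alt) tuple. Hypothetical helper,
    not part of Lancet or any other caller.
    """
    return sorted(set(tumor_calls) - set(normal_calls))


# Made-up example calls (assumed data, not real results).
normal = [
    ("chr1", 69091, "A", "C"),   # germline variant, called in the normal
    # ("chr1", 69511, "A", "G") is also germline, but suppose the caller
    # missed it in the normal sample.
]
tumor = [
    ("chr1", 69091, "A", "C"),   # germline, also seen in the tumor
    ("chr1", 69270, "T", "G"),   # truly tumor-specific
    ("chr1", 69511, "A", "G"),   # germline, missed in the normal
]

candidates = naive_somatic(tumor, normal)
for call in candidates:
    print(call)
```

Both remaining calls are reported as tumor-specific, although only one of them is. Because the two samples are analyzed independently, the subtraction has no way to tell "absent from the normal" apart from "not called in the normal".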
Hi, Lancet generates in output the list of variants in VCF format (v4.1). All variants (SNVs and indels either shared, specific to the tumor, or specific to the normal) are exported in output. Following VCF conventions, high quality variants are flagged as PASS in the FILTER column. Does this mean that somatic variants are defined as the variants specific to the tumor? How would the following case be reported: the reference genome is 'A', germline is 'C', and somatic is 'G'?
I am using SnpEff to annotate the VCF. Its documentation says: "It is common practice, to have all samples in a single "multi-sample VCF file" (having two or more separate VCF files is highly discouraged). This is also the "gold standard" in cancer analysis standard, so all samples (both somatic and germline) should be in one VCF file. In a typical cancer sequencing experiment, we want to measure and annotate differences between germline (healthy) and somatic (cancer) tissue samples from the same patient. The complication is that germline is not always the same as the reference genome, so a typical annotation does not work." This can be found here: https://pcingola.github.io/SnpEff/se_cansersamples/
For instance, let's assume that at a given genomic position (e.g. chr1:69091), reference genome is 'A', germline is 'C' and somatic is 'G'. This should be represented in a VCF file as:
So, will the germline always be the same as the reference genome in Lancet's output?