uclahs-cds / pipeline-call-sSNV

A Nextflow pipeline to identify the somatic single nucleotide variants (sSNVs) by comparing a pair of tumor/normal samples.
https://uclahs-cds.github.io/pipeline-call-sSNV/
GNU General Public License v2.0

add panel of normals #293

zhuchcn closed 5 months ago

zhuchcn commented 5 months ago

Description

Add panel_of_normals_vcf to MuTect2.

The GATK best-practices resource bundle includes a PON (see here for more information) that was created from 1000 Genomes WGS data. CCLE's initial analysis used MuTect1 together with a PON that they created from ~8000 TCGA normals and filtered using their own algorithm (not confirmed, but my guess is that MuTect1 doesn't support a PON). In their latest release (23Q4), however, CCLE switched to MuTect2 and to this same 1000 Genomes PON from GATK. With the PON added, 3000 to 7000 SNVs were filtered out, and the results of our pipeline aligned well with CCLE's latest release.
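For reviewers less familiar with how a PON reaches MuTect2, here is a minimal sketch of the idea, not the pipeline's actual Nextflow code. It assumes the `panel_of_normals_vcf` parameter is optional and, when set, is forwarded to GATK's `--panel-of-normals` argument. The helper name `build_mutect2_command` and all file names are hypothetical and only illustrate the wiring.

```python
from typing import List, Optional


def build_mutect2_command(
    reference_fasta: str,
    tumor_bam: str,
    normal_bam: str,
    normal_sample_name: str,
    output_vcf: str,
    panel_of_normals_vcf: Optional[str] = None,
) -> List[str]:
    """Assemble a GATK Mutect2 argument list (hypothetical helper).

    The PON argument is appended only when a path is supplied, so runs
    without a PON keep the same command line as before.
    """
    command = [
        "gatk", "Mutect2",
        "-R", reference_fasta,
        "-I", tumor_bam,
        "-I", normal_bam,
        "-normal", normal_sample_name,
        "-O", output_vcf,
    ]
    if panel_of_normals_vcf:
        # Assumption: the pipeline parameter maps directly onto this flag.
        command += ["--panel-of-normals", panel_of_normals_vcf]
    return command


# Placeholder file names, for illustration only.
with_pon = build_mutect2_command(
    "ref.fa", "tumor.bam", "normal.bam", "NORMAL", "out.vcf.gz",
    panel_of_normals_vcf="pon.vcf.gz",
)
without_pon = build_mutect2_command(
    "ref.fa", "tumor.bam", "normal.bam", "NORMAL", "out.vcf.gz",
)
```

Keeping the parameter optional means existing configurations that do not set `panel_of_normals_vcf` would run unchanged.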

Closes #...

Testing Results

Checklist

zhuchcn commented 5 months ago

Yeah, the pipeline output for the 6 cell lines run with the PON is here: /hot/project/method/AlgorithmDevelopment/ALGO-000074-moPepGen/CCLE/processed/WXS/PON_1KG/metapipeline-submit-pipeline-0.1.0/main_workflow/call-sSNV-8.0.0

For CCLE's SNV:

tyamaguchi-ucla commented 5 months ago

Anything else to add @yashpatel6 ?