uclahs-cds / project-method-AlgorithmEvaluation-BNCH-000082-SRCRNDSeed

GNU General Public License v2.0

Understand R script for parsing PhyloWGS output + Interpret PhyloWGS output #93

Closed philsteinberg closed 1 year ago

philsteinberg commented 1 year ago

output path : /hot/project/method/AlgorithmEvaluation/BNCH-000082-SRCRNDSeed/pipeline-call-src/run-mutect2-battenberg-phylowgs/output/call-SRC-1.0.0-rc.1/ILHNLNEV000001-T001-P01-F/PhyloWGS-2205be1/output/

output files (seed 366306, sample ILHNLNEV000001-T001-P01-F):

philsteinberg commented 1 year ago

Paper notes:

"For each patient we considered what fraction of their biomarker risk-score was derived from CNAs present in the trunk of the tumor and what fraction was derived from subclonal CNAs present in the most aggressive clone"

Reconstruct the tumor's subclonal profile by identifying somatic SNVs and CNAs and then integrating these on evolutionary trees.

Trunk mutations distinguish all cancerous cells from the normal population, while branch mutations develop after clonal establishment. Most somatic point mutations occur in a tumor's trunk. CNAs are equally split between the trunk and its branches. CCF (cancer cell fraction) measures the proportion of the tumor cell population that harbors the mutation.
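To make the CCF definition concrete, here is a minimal sketch. It assumes the usual relationship between a mutation's cellular prevalence (fraction of all cells in the sample, normal cells included) and tumor purity; the function name, the cap at 1.0, and the example numbers are my own illustration, not something taken from the paper or the R script.

```python
def cancer_cell_fraction(cellular_prevalence: float, purity: float) -> float:
    """Convert cellular prevalence to CCF.

    Assumption: cellular_prevalence is the fraction of ALL cells in the
    sample (tumor + normal) carrying the mutation, and purity is the
    fraction of cells in the sample that are tumor cells.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if not 0.0 <= cellular_prevalence <= 1.0:
        raise ValueError("cellular_prevalence must be in [0, 1]")
    # Noise can push the ratio slightly above 1; capping is a design choice.
    return min(cellular_prevalence / purity, 1.0)


# Illustrative numbers only: a mutation in 40% of all cells, 80% purity.
ccf_branch = cancer_cell_fraction(0.4, 0.8)
# A trunk mutation is in every tumor cell, so its prevalence equals purity.
ccf_trunk = cancer_cell_fraction(0.8, 0.8)
```

Under these assumptions a trunk mutation has a CCF of 1, and any branch mutation has a CCF below 1.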

The JSON results are parsed to determine the best consensus tree, given by the largest log likelihood value, and the SNVs and CNAs associated with each subclone in the best predicted structure, ordered by subclone cellular prevalence. Select the tree with the largest log likelihood (equivalently, the smallest negative normalized log likelihood).
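A rough sketch of that selection step, to check my understanding. The dictionary below is a hand-written stand-in for a parsed summary JSON: the keys (`trees`, `llh`, `populations`, `cellular_prevalence`, `num_ssms`, `num_cnvs`) and all values are assumptions for illustration and should be checked against the real output files. I also assume one sample per run, so `cellular_prevalence` is a one-element list.

```python
# Hypothetical stand-in for a parsed PhyloWGS summary JSON (toy values).
summary = {
    "trees": {
        "0": {"llh": -1520.4, "populations": {
            "0": {"cellular_prevalence": [1.0], "num_ssms": 0, "num_cnvs": 0},
            "1": {"cellular_prevalence": [0.80], "num_ssms": 40, "num_cnvs": 6},
        }},
        "1": {"llh": -1498.7, "populations": {
            "0": {"cellular_prevalence": [1.0], "num_ssms": 0, "num_cnvs": 0},
            "1": {"cellular_prevalence": [0.75], "num_ssms": 35, "num_cnvs": 5},
            "2": {"cellular_prevalence": [0.20], "num_ssms": 8, "num_cnvs": 2},
            "3": {"cellular_prevalence": [0.42], "num_ssms": 12, "num_cnvs": 5},
        }},
        "2": {"llh": -1611.0, "populations": {
            "0": {"cellular_prevalence": [1.0], "num_ssms": 0, "num_cnvs": 0},
        }},
    }
}


def best_tree(summary):
    """Return (tree_id, tree) for the tree with the largest log likelihood."""
    return max(summary["trees"].items(), key=lambda item: item[1]["llh"])


def populations_by_prevalence(tree, sample_index=0):
    """Return population ids sorted by descending cellular prevalence."""
    pops = tree["populations"]
    return sorted(
        pops,
        key=lambda pid: pops[pid]["cellular_prevalence"][sample_index],
        reverse=True,
    )


best_id, best = best_tree(summary)
ordered_ids = populations_by_prevalence(best)
```

With the toy values above, the tree with `llh` of -1498.7 is selected, and node 0 (the normal root) comes first because its prevalence is 1.0.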

With the root node (normal population) labeled as 0, each consensus tree was transformed according to two rules:

Clustering - Each node had at least 5 SNVs or 5 CNAs. Nodes that did not satisfy the criteria were: