cgplab / PAMES

Tool to estimate purity of tumor samples exploiting DNA Methylation data
GNU General Public License v3.0

Normal samples required? #10

Open MjelleLab opened 9 months ago

MjelleLab commented 9 months ago

When running PAMES::get_purity(beta), it seems to require beta values from normal samples and the AUC file. Is it possible to run it using only beta values from the tumor samples?

Best,

romagnolid commented 9 months ago

Hi, you can run get_purity using a pre-generated set of informative CpG sites. You can find sets for 14 tumor types at https://github.com/cgplab/PAMESdata.

Otherwise, normal samples are required to generate the AUC file and use find_informative_sites.

MjelleLab commented 9 months ago

Thanks @romagnolid. Unfortunately, I need brain (LGG) from TCGA. I have access to TCGA myself and found the *level3betas.txt files. How would you go about creating a normal profile based on these files? Are you using blood-normal or tissue-normal?

romagnolid commented 9 months ago

I checked, and LGG has very few control samples available on TCGA, but you can retrieve some data from GEO or from EWAS Datahub.

Next, it's just three steps:

library(PAMES)

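N <- 20  # number of CPU cores to use
# 1. compute, for each CpG site, the AUC separating tumor from control samples
auc_vector <- get_AUC(tumor_data, control_data, cores=N)
# 2. select the informative CpG sites
info_sites <- find_informative_sites(tumor_data,
                                     control_data,
                                     auc_vector,
                                     illumina450k_hg38, cores=N)
# 3. estimate the purity of each tumor sample
purity_data <- get_purity(tumor_data, info_sites)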
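
Since these steps compare tumor and control values site by site, the two matrices presumably need to cover the same probes in the same row order. A minimal base-R sketch of such a pre-check follows. `align_probes` is a hypothetical helper, not part of PAMES, and it assumes both matrices carry probe IDs as row names. Check the PAMES documentation for the exact input requirements (for example, whether rows must also match the reference table) before relying on it.

```r
# Hypothetical helper (not part of PAMES): keep only the probes shared by
# both beta-value matrices and put them in the same row order.
# Assumes probe IDs are stored as row names.
align_probes <- function(tumor, control) {
  shared <- intersect(rownames(tumor), rownames(control))
  list(
    tumor   = tumor[shared, , drop = FALSE],
    control = control[shared, , drop = FALSE]
  )
}

# Usage sketch:
# aligned      <- align_probes(tumor_data, control_data)
# tumor_data   <- aligned$tumor
# control_data <- aligned$control
```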

romagnolid commented 9 months ago

Tissue-normal samples are required as controls. You can either download all the samples and create a matrix of beta values (bind each level3betas.txt column-wise) or use a package such as TCGAbiolinks.
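
The column-wise binding could be sketched in base R as below. This assumes each level3betas.txt file is tab-separated with two columns and no header (probe ID, beta value), so check one of your files first. `read_beta_matrix` is a hypothetical helper name, not a PAMES or TCGAbiolinks function.

```r
# Hypothetical helper: read per-sample beta files and bind them column-wise.
# Assumption: each file is tab-separated, has no header, and contains two
# columns: probe ID and beta value.
read_beta_matrix <- function(files) {
  columns <- lapply(files, function(f) {
    tab <- read.delim(f, header = FALSE, stringsAsFactors = FALSE,
                      col.names = c("probe", "beta"))
    setNames(tab$beta, tab$probe)
  })
  # Keep only probes present in every file, in the order of the first file
  probes <- Reduce(intersect, lapply(columns, names))
  beta <- do.call(cbind, lapply(columns, function(x) x[probes]))
  colnames(beta) <- basename(files)  # one column per sample file
  beta
}

# Usage sketch (the directory name is a placeholder):
# files <- list.files("path/to/normal_samples",
#                     pattern = "level3betas\\.txt$", full.names = TRUE)
# control_data <- read_beta_matrix(files)
```

Matching on probe IDs rather than on row position guards against files that list probes in a different order.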