CityUHK-CompBio / DeepCC

DeepCC: a novel deep learning-based framework for cancer molecular subtype classification
https://CityUHK-CompBio.github.io/DeepCC/
MIT License

Transformation of expression data to functional spectra #15

Closed kate-simonova closed 1 year ago

kate-simonova commented 2 years ago

I would like to ask what type of expression data should be given to the getFunctionalSpectra function. Is it normalized counts, TPM/RPKM, or log2-transformed counts?

My other question: as far as I understand, GSEA is performed on differentially expressed genes. If I pass gene expression data to the getFunctionalSpectra function, how exactly are the expression data transformed into functional spectra without knowledge of the LFC, padj, and the groups being compared?

Do you have any advice on preprocessing RNA-seq data (normalization etc.)? Some of the publicly available data sets on the DeepCC website are raw Affymetrix CEL files.

kate-simonova commented 2 years ago

I found the answer to how the functional spectrum is generated in the DeepCC paper. However, I have a different question: do you think the DeepCC model would work if I fed it Nanostring data (only 750 genes)? You mention in the paper that it does not matter whether microarray or RNA-seq data are used, but can it handle Nanostring data?

gaofeng21cn commented 2 years ago

Hi Kate,

Nanostring will work, since DeepCC is platform-independent. The problem is that with only 750 genes, performance will drop to ~80% (I guess). We will soon release GrandCC, our newly developed Graph Attention Network-based tool, which performs better when genes are missing.
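
To get a rough feel for how much a small panel loses, one option is to check what fraction of each gene set is actually measured by the panel. The sketch below is only an illustration and is not part of DeepCC: the function name `gene_set_coverage`, the gene-set names, and the panel contents are all hypothetical, and it assumes the panel and the gene sets use the same identifier scheme.

```python
def gene_set_coverage(panel_genes, gene_sets):
    """Return, for each gene set, the fraction of its genes present in the panel.

    Hypothetical helper for illustration only; not a DeepCC function.
    """
    panel = set(panel_genes)
    coverage = {}
    for name, members in gene_sets.items():
        members = set(members)
        # An empty gene set gets a coverage of 0.0 rather than a division error.
        coverage[name] = len(members & panel) / len(members) if members else 0.0
    return coverage


# Toy data: a three-gene "panel" and two made-up gene sets.
panel = ["TP53", "MYC", "EGFR"]
gene_sets = {
    "HYPOTHETICAL_SET_A": ["TP53", "MYC", "KRAS", "BRAF"],
    "HYPOTHETICAL_SET_B": ["GAPDH", "ACTB"],
}
coverage = gene_set_coverage(panel, gene_sets)
```

Gene sets with very low coverage contribute little reliable signal, which is one plausible reason a 750-gene panel would score worse than a whole-transcriptome profile.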

kate-simonova commented 2 years ago

Thanks for the response. Would you be so kind as to answer my questions about preprocessing the raw data before passing them to the getFunctionalSpectra function?

Kind regards,

Kate

kate-simonova commented 2 years ago

I am sorry for asking so many questions. If I understand correctly, can I use the getFunctionalSpectra function for indications other than those mentioned in the paper (CRC, breast, gastric, ovarian), since the gene sets are just predefined sets associated with particular functions?

zero19970 commented 1 year ago

Hi Kate,

getFunctionalSpectra can be used on any type of mRNA expression data, provided as log2-transformed TPM values. The functional spectra are calculated from the gene sets collected in MSigDB v7.
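
For readers starting from raw counts, here is a minimal sketch of the usual counts-to-TPM conversion followed by a log2 transform. It is not taken from DeepCC. The function name `counts_to_log2_tpm`, the toy counts and gene lengths, and the pseudocount of 1 are assumptions; the thread does not say which pseudocount, if any, is expected.

```python
import math


def counts_to_log2_tpm(counts, lengths_kb, pseudocount=1.0):
    """Convert one sample's raw counts to log2(TPM + pseudocount).

    Hypothetical helper for illustration; the pseudocount is an assumption.
    """
    # Reads per kilobase: divide each gene's count by its length in kb.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("sample has no counts")
    # Rescale so that the TPM values of the sample sum to one million.
    tpm = {gene: value / total * 1e6 for gene, value in rpk.items()}
    return {gene: math.log2(value + pseudocount) for gene, value in tpm.items()}


# Toy sample: counts and gene lengths (in kb) are made up.
sample_counts = {"TP53": 200, "MYC": 100, "GAPDH": 700}
gene_lengths_kb = {"TP53": 2.0, "MYC": 1.0, "GAPDH": 1.0}
log2_tpm = counts_to_log2_tpm(sample_counts, gene_lengths_kb)
```

In this toy sample, TP53 has twice the count of MYC but is also twice as long, so both end up with the same log2 TPM value. The resulting per-sample values would then be assembled into an expression matrix before being passed to getFunctionalSpectra in R.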