saezlab / cosmosR

COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.
https://saezlab.github.io/cosmosR/
GNU General Public License v3.0

Running COSMOS with transcriptomics and phosphoproteomics #18

Closed Eirinits closed 2 years ago

Eirinits commented 2 years ago

Hi,

I am interested in running COSMOS with transcriptomics and phosphoproteomics data, and I was wondering about the following:

  1. In the original use case of COSMOS, the PKN required a specific format for the metabolomics data. Are there similar requirements for other types of data as well?
  2. My data come from time-course experiments. I could run COSMOS separately for each time point and then compare the networks, but what if I would like to use COSMOS to connect data types coming from different time points? For example, if I want to connect an earlier time point of phosphoproteomics to a later time point of transcriptomics, would it be possible to set a constraint that considers only the effect of one omics level on the other? In the example I gave, we would essentially omit the influence of transcriptomics on the phosphoproteomics level.
  3. How does COSMOS deal with multiplicity in phosphoproteomics data? Is it similar to how PHONEMeS merges the observations?

Thank you in advance for your help!

gabora commented 2 years ago

Hi @Eirinits, sorry for the delayed answer.

  1. The names in the data inputs should match the names in the prior knowledge network, so it depends on which prior knowledge network you use. In the built-in PKN, (phospho)proteomics entries are identified by HGNC symbols.
  2. Usually, when we have two different types of omics data at the same time point, we run COSMOS "forward" and "backward", to connect one layer to the other and vice versa. In your case, I would run it only from phosphoproteomics (kinase activity) to transcriptomics (TF activity).
  3. COSMOS does not work directly with phospho data. The problem with phosphoproteomics is that in many cases we don't know the effect of a phosphorylation (activation, inhibition, or maybe no function). So from the phosphorylation of a protein we don't know whether that protein is activated or not. However, we can learn the activity of the kinase that targets that protein (check our tool Kinact). The principle is that the more targets of a kinase are phosphorylated, the more likely it is that the kinase is active. We actually have a small tutorial on this issue, check out this link: https://saezlab.github.io/kinase_tf_mini_tuto/
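
The principle in point 3 can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the code of Kinact or of the linked tutorial. It assumes a KSEA-style z-score (mean change of a kinase's measured target sites against the mean of all sites, scaled by the number of targets), and the site identifiers, fold changes, kinase-substrate sets and the `min_targets` cutoff are all made-up assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def kinase_activity_scores(site_log2fc, kinase_targets, min_targets=2):
    """Score each kinase from the phosphorylation changes of its target sites.

    Assumed scoring (KSEA-style z-score, not taken from the thread):
        z = (mean of target sites - mean of all sites) * sqrt(m) / sd of all sites
    where m is the number of measured target sites of the kinase.
    """
    all_values = list(site_log2fc.values())
    background_mean = mean(all_values)
    background_sd = pstdev(all_values)
    scores = {}
    for kinase, targets in kinase_targets.items():
        # Keep only the target sites that were actually measured.
        measured = [site_log2fc[s] for s in targets if s in site_log2fc]
        # Skip kinases with too few measured targets or a flat background.
        if len(measured) < min_targets or background_sd == 0:
            continue
        scores[kinase] = (
            (mean(measured) - background_mean) * sqrt(len(measured)) / background_sd
        )
    return scores


# Hypothetical log2 fold changes per phosphosite (illustrative values only).
site_log2fc = {
    "MAPK1_T185": 1.8, "MAPK1_Y187": 2.1, "RPS6KA1_S380": 1.5,
    "GSK3B_S9": -0.2, "AKT1S1_T246": -1.4, "FOXO3_S253": -1.1, "BAD_S99": 0.1,
}
# Hypothetical kinase-substrate sets (illustrative, not a curated resource).
kinase_targets = {
    "MAP2K1": ["MAPK1_T185", "MAPK1_Y187"],
    "AKT1": ["GSK3B_S9", "AKT1S1_T246", "FOXO3_S253", "BAD_S99"],
    "CDK1": ["LMNA_S22"],  # no measured target, so no score is produced
}

scores = kinase_activity_scores(site_log2fc, kinase_targets)
for kinase, z in sorted(scores.items()):
    print(kinase, round(z, 2))
```

The resulting per-kinase scores are keyed by gene symbol rather than by phosphosite, which is the kind of input that can be matched against a PKN whose nodes are HGNC symbols.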

best, Attila

joonan30 commented 2 years ago

Thanks for the nice tool. I have a similar issue and tried COSMOS with transcriptome and phosphorylation datasets.

I passed the phosphorylation data as metabolic_data in preprocess_COSMOS_signaling_to_metabolism, and meta_network is the combination of dorothea_hs, PHONEMeS::phonemesPKN and PHONEMeS::phonemesKSN.

However, preprocess_COSMOS_signaling_to_metabolism gives an error: it removed all phosphorylation sites from the source and target columns. I think it comes from this function (none of its conditions match phosphorylation sites): https://rdrr.io/github/saezlab/COSMOS/src/R/filter_pkn_expressed_genes.R

Would you be able to update the function to accept phosphorylation data? Or do you think using phosphoproteomics with COSMOS is not valid? I read your comment above and also tried PHONEMeS, which works really well, but I just want to see the relationship between TFs and the consequences of kinases on phosphorylation.
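
A quick way to see this kind of mismatch is to compare the input identifiers with the node names of the network before preprocessing. The sketch below is a Python illustration of the name-matching requirement mentioned earlier in the thread. It is not the logic of filter_pkn_expressed_genes, and the edges and identifiers are hypothetical.

```python
def match_report(data_ids, pkn_edges):
    """Split input identifiers into those found / not found among PKN nodes."""
    # Collect every node that appears as a source or a target of an edge.
    nodes = {node for edge in pkn_edges for node in edge}
    matched = sorted(set(data_ids) & nodes)
    unmatched = sorted(set(data_ids) - nodes)
    return matched, unmatched


# Hypothetical PKN edges (source, target) using gene symbols as node names.
pkn_edges = [("MAP2K1", "MAPK1"), ("MAPK1", "MYC"), ("AKT1", "FOXO3")]

# Site-level identifiers do not appear as nodes in this toy network...
phosphosite_ids = ["MAPK1_T185", "FOXO3_S253"]
# ...whereas kinase-level identifiers (gene symbols) do.
kinase_ids = ["MAP2K1", "AKT1"]

print(match_report(phosphosite_ids, pkn_edges))
print(match_report(kinase_ids, pkn_edges))
```

If most identifiers end up in the unmatched list, the naming schemes of the data and the network differ, and anything keyed by those identifiers cannot be mapped onto the network.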

Thanks in advance,

adugourd commented 2 years ago

Hi Joonan,

Actually, COSMOS is meant to connect TF activities with kinase activities (estimated from phosphoproteomic data, using for example https://cran.r-project.org/web/packages/KSEAapp/vignettes/Overview.html), not to use phosphoproteomic data directly.

Cheers,

Aurelien