Closed: LonnekeNouwen closed this issue 3 years ago
Dear Lonneke, sorry for the late reply. We recently improved the function documentation and also updated the vignette/tutorial (we are preparing a Bioconductor submission).
Please note that the previous tutorial version used CPLEX optimization. CPLEX has to be downloaded from IBM (https://www.ibm.com/academic/technology/data-science, free for academics). The current tutorial runs with 'lpsolve', but its capacity to solve this type of optimization problem is limited. We also found a bug in a dependency that further limits lpsolve; it will be fixed next week.
Regarding your questions:
Best, Attila
Hi Attila, thanks for your response! As said before, I would like to use a metabolomics and a phosphoproteomics dataset with this method. For the metabolomics dataset, I understand from all the information that I need to turn the data into a named vector with PubChem IDs as names. I don't understand, however, how I can use the phosphoproteomics data (because I cannot just use the names of the proteins; I also need to include the site of phosphorylation somehow). I read somewhere that I need to use the phosphoproteomics data to get signaling information. Is this correct and, if so, how do I do this? I was also wondering how to include fluxomics data. The metabolomics dataset that I want to use is actually also a fluxomics dataset. But, as with the phosphoproteomics dataset, I don't know what format this data needs to be in or how to get there. Thanks for your help!
Best, Lonneke
Hi Lonneke,
@adugourd will jump in to answer this :)
I will come back to you shortly, preparing a short tutorial
Hey Lonneke,
Sorry for the delayed answer. This seems to be a common point that people struggle with, so I made a mini script to show in parallel how to estimate TF activity from transcriptomic data (which I think you know how to do now) and kinase activity from phosphoproteomic data. You can find all of this here: https://github.com/saezlab/kinase_tf_mini_tuto
Please let me know if that already helps you with your problem.
Cheers,
Aurelien
For the fluxomic data, you can actually use it in place of metabolomic data without much trouble. The only thing that may be a bit complicated is that you will have to map your fluxes to their corresponding metabolic enzymes, and then map those enzymes to their corresponding identifiers in the prior knowledge network.
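The two mapping steps described above can be sketched in base R. This is only an illustration under stated assumptions: the reaction names, the enzyme names, the `pkn_id` strings, and the flux values are all made up, and the real prior knowledge network uses its own identifier scheme that you would need to look up. Averaging when one enzyme is linked to several reactions is also an assumption, not something prescribed by the package.

```r
# Hypothetical flux measurements, named by reaction (made-up values).
fluxes <- c(R_HEX1 = 1.8, R_PFK = -0.6, R_PYK = 0.9, R_UNMAPPED = 0.4)

# Hypothetical mapping table: reaction -> enzyme -> network identifier.
# The pkn_id values are placeholders, not real network identifiers.
flux_map <- data.frame(
  reaction = c("R_HEX1", "R_HEX1", "R_PFK", "R_PYK"),
  enzyme   = c("HK1", "HK2", "PFKM", "PKM"),
  pkn_id   = c("node_HK1", "node_HK2", "node_PFKM", "node_PKM"),
  stringsAsFactors = FALSE
)

# Step 1: attach each flux value to the enzymes mapped to that reaction.
# Reactions absent from the table (here R_UNMAPPED) are simply not carried over.
flux_map$value <- unname(fluxes[flux_map$reaction])

# Step 2: collapse to one value per network identifier.
# Assumption: if an identifier appears for several reactions, take the mean.
agg <- tapply(flux_map$value, flux_map$pkn_id, mean)

# Convert to a plain named numeric vector, the same shape as the
# metabolomic input it would replace.
enzyme_input <- setNames(as.numeric(agg), names(agg))
print(enzyme_input)
```

Keeping the mapping in an explicit table makes it easy to inspect which fluxes were dropped because no enzyme or network identifier could be found for them.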
Hi all,
Thanks for developing this package! I'm a molecular biologist with basic skills in bioinformatics and R and, like Lonneke, I still don't fully understand how to properly use your package. In my case, I'm most interested in which data type exactly you mean by "cellular/genetic perturbations". Do you integrate, e.g., drug treatment data? Would it be enough to run these together with transcriptomic data (as you say, two out of these five) to get your causal networks?
Greets, marefei
Hi,
thanks for your interest!
Basically, in the preprocess_COSMOS_signaling_to_metabolism function, you can give known perturbations as input to the signaling_data parameter.
If you check the tutorial, you can see that the signaling argument of the function is used with a named vector of TFs and kinases with their corresponding activities. Instead, you can pass a named vector with the names of the perturbed nodes and 1 or -1 depending on whether they are up- or down-regulated. For example, if you have data with a MAPK1 KO, then you pass it a named vector with MAPK1 as name and -1 as value.
Then you can pass TF activities estimated from transcriptomics through the "metabolic_data" parameter, in place of metabolomic data. The name obviously doesn't fit, but that's just because it was originally used with metabolomic data. We will change that in a future version.
I would also recommend checking the objects that are used in the tutorial; it might help you better understand how to pass the relevant data as inputs to the functions.
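As a minimal sketch of the two inputs described above, here is how the named vectors could be built in base R. The TF names and activity scores are made up for illustration, and the assumption is that the names match the node identifiers used in the prior knowledge network. The vectors would then be supplied as the signaling_data and metabolic_data parameters mentioned above; the call itself is left out because the remaining arguments depend on the tutorial objects.

```r
# Known perturbation: MAPK1 knock-out, encoded as -1 (down-regulated).
# A node that is up-regulated or activated would get +1 instead.
signaling_input <- c(MAPK1 = -1)

# Hypothetical TF activity scores estimated from transcriptomics
# (made-up numbers, for illustration only).
tf_activities <- c(STAT1 = 2.1, MYC = -1.4, TP53 = 0.3)

# Basic sanity checks before handing the vectors to the package:
# perturbations must be named and restricted to -1 / 1,
# TF activities must be numeric with unique names.
stopifnot(!is.null(names(signaling_input)))
stopifnot(all(signaling_input %in% c(-1, 1)))
stopifnot(is.numeric(tf_activities))
stopifnot(!anyDuplicated(names(tf_activities)))

str(signaling_input)
str(tf_activities)
```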
Hope that helps !
Cheers,
Aurelien
Dear sir/madame
Recently, I came across your paper describing the COSMOS method and I would like to try to use this method for my own data. Being trained as a biomedical scientist and not a bioinformatician, there were some questions that I could not answer myself, hence this post. My first question is whether it is necessary to filter the datasets by, for instance, significance or another threshold, or whether it is also possible to use the complete datasets (metabolomics data is sometimes analysed more on the basis of trends than of significance). I also wondered whether it is possible to use this method with different timepoints/groups (since we have different groups in our datasets). And lastly, I could not figure out how to use this method with only two datasets. I tried to remove one dataset from the short COSMOS tutorial on GitHub as a test, but the code did not work anymore. Therefore, my last question is how to use COSMOS with only a phosphoproteomics and a metabolomics dataset. I would like to thank you in advance for your time and I hope to hear from you soon.
Kind regards, Lonneke Nouwen