saezlab / cosmosR

COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.
https://saezlab.github.io/cosmosR/
GNU General Public License v3.0

Inquiry on Gene ID Handling and Recognition in COSMOS Analysis by PKN #36

Closed transform66 closed 2 months ago

transform66 commented 3 months ago

Dear Aurelien Dugourd,

I hope this message finds you well. I am writing to seek your guidance on a few issues I've encountered while attempting to analyze my data using the COSMOS framework. As a fellow researcher utilizing this powerful tool, I'm eager to ensure that my analysis is conducted accurately and efficiently.

Issue 1: Necessity of NCI60 Tutorial Preprocessing

I am unsure whether I am obliged to follow the preprocessing steps outlined in the NCI60 tutorial (https://saezlab.github.io/cosmosR/articles/NCI60_tutorial.html) when preparing my data for COSMOS analysis. My concern stems from the specific nature of my proteomic dataset and whether it aligns with the tutorial's guidelines. Could you please clarify whether adhering strictly to the NCI60 tutorial preprocessing is a prerequisite, or whether the workflow can be adapted to suit my dataset?

Issue 2: Acquisition of cosmos_inputs.RData

Assuming that the NCI60 tutorial preprocessing is indeed necessary, I've come across a reference to cosmos_inputs.RData in the documentation, which seems to be a crucial input file. However, I'm unable to locate this file within the provided resources or the COSMOS GitHub repository. Could you kindly direct me to where I can access it, or tell me whether there is an alternative method to generate a similar input file tailored to my dataset?

Issue 3: Gene ID Recognition in PKN

If I opt not to follow the NCI60 tutorial, I face a significant challenge: a substantial portion of the gene IDs in my proteomic dataset are not recognized by the PKN, which prevents COSMOS from running smoothly. Could you suggest any strategies or tools for converting or annotating my gene IDs so that more of them are recognized by the PKN? Alternatively, are there any known limitations or compatibility issues with specific types of gene IDs that I should be aware of?

Your insights and recommendations would be invaluable in overcoming these obstacles and ensuring the success of my COSMOS analysis. Please let me know if there's any additional information I can provide to facilitate your assistance. Thank you very much for your time and expertise. I look forward to your response.

cosmos

adugourd commented 3 months ago

Hi Guangyuan Liu,

Thank you very much for your interest in our tool.

I clarified the markdown of the cosmosR GitHub to make it easier to find what you are looking for.

For 1) and 2), you can find everything you are looking for here: https://github.com/saezlab/COSMOS_basic (input data and scripts that detail the pre-processing and where to obtain the raw source material).

For 3, the PKN is annotated for the signalling part using gene symbols. As long as you use HUGO gene symbols, there should be no mapping problem. Of course there will always be genes that are not part of our PKN, since it’s not exhaustive, but a good proportion should be there.
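To see how many identifiers in a dataset would map onto the signalling part of the PKN, a quick overlap check can help. The sketch below is only an illustration: the toy `pkn` data frame, its `source` and `target` column names, and the `my_ids` vector are assumptions made up for this example, not objects taken from cosmosR.

```r
# Toy stand-in for the PKN: a data frame of directed interactions.
# The column names (source, target) are assumptions for illustration.
pkn <- data.frame(
  source = c("EGFR", "MAPK1", "TP53"),
  target = c("MAPK1", "MYC", "CDKN1A"),
  stringsAsFactors = FALSE
)

# Hypothetical identifiers from a proteomics table: gene symbols,
# one UniProt accession, and one symbol written in the wrong case.
my_ids <- c("EGFR", "MYC", "P04637", "tp53", "GAPDH")

# All distinct node names that appear anywhere in the network.
pkn_nodes <- unique(c(pkn$source, pkn$target))

# Split the identifiers into those found in the network and those not found.
recognised   <- my_ids[my_ids %in% pkn_nodes]
unrecognised <- setdiff(my_ids, pkn_nodes)
coverage     <- length(recognised) / length(my_ids)

print(recognised)
print(unrecognised)
print(coverage)
```

Inspecting the unrecognised set usually shows whether the problem is the identifier type (accessions instead of symbols), the letter case, or simply genes that are absent from the network.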

Cheers,

Aurelien

transform66 commented 2 months ago

Dear Dr. Dugourd,

 

I hope this message finds you well.

 

I am writing to express our keen interest in utilizing the COSMOS framework for our multi-omics data analysis. We have carefully studied your previous suggestions and have made several attempts to implement them, yet we have encountered a few persistent issues that we have not been able to resolve.

 

We would greatly appreciate your guidance in addressing the following two main concerns:

 

  1. Data Import Format and Code Clarification: We are uncertain about the specific format required for data import and the corresponding code to be used. Could you please provide clarity on this?

  2. Complete Analysis Workflow Code: We are also unsure about the full analysis process code. Is there a sequential relationship between COSMOS_basic and cosmosR, or is additional code required?

 

We believe that COSMOS_basic serves as a script for data preprocessing, and the "cosmos_inputs.Rdata" file is a merged file of multiple omics data, rather than a universal database. Should we preprocess our data to create a similar merged file (like "cosmos_inputs.Rdata") and then import it into this script? If that is the case, could you please specify what code we should use to create this merged file and to conduct the subsequent complete analysis?

 

Could you please provide us with precise advice regarding the correct file input format and the complete code for constructing the network analysis using the meta_network?

 

We are eager to successfully apply COSMOS to our research and any assistance you could provide would be invaluable. We are looking forward to your prompt response.

 

Thank you very much for your time and consideration.

 

Best regards,

 

Guangyuan Liu Shijiazhuang, Hebei Province, China

Hebei Medical University


adugourd commented 2 months ago

Hi,

All of this information can be accessed by simply loading the input objects provided in https://github.com/saezlab/COSMOS_basic and seeing for yourself :)
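For readers unfamiliar with inspecting an .RData file, a minimal sketch follows. It writes a small stand-in file first so that it runs on its own; the object name `toy_inputs` and its contents are invented for the example and are not the actual contents of the COSMOS_basic inputs. In practice `path` would point at the downloaded file.

```r
# Build a small stand-in file so the example is self-contained.
# In practice, set `path` to the .RData file obtained from COSMOS_basic.
path <- tempfile(fileext = ".RData")
toy_inputs <- list(
  RNA = c(TP53 = 1.2, MYC = -0.8),
  metabolomic = c(HMDB0000122 = 0.5)
)
save(toy_inputs, file = path)
rm(toy_inputs)

# Load into a separate environment so nothing in the workspace is overwritten.
# load() returns the names of the objects it restored.
inputs_env <- new.env()
loaded_names <- load(path, envir = inputs_env)

# List what the file contained, then look at the structure of each object.
print(loaded_names)
for (nm in loaded_names) {
  str(get(nm, envir = inputs_env))
}
```

The output of `str()` shows the type, length, and names of each object, which is usually enough to work out the format a custom dataset needs to match.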

Cheers,

Aurelien

transform66 commented 2 months ago

Subject: COSMOS Application for Mouse Brain Proteomics and DIA-NN Data

Dear Dr. Dugourd,

I've noticed that the COSMOS example data is specific to the NCI60 cell line. I'm reaching out to ask if COSMOS can be used to analyze proteomics data from mouse brain tissue. Additionally, is it possible to import DIA-NN search results into COSMOS for analysis?

Thank you for your help.

Best, Guangyuan Liu

 


adugourd commented 2 months ago

You would need to format it accordingly, but yes it is possible.
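As a rough idea of what such formatting could involve, the sketch below turns a per-gene results table into a named numeric vector. It assumes that the inputs are named numeric vectors keyed by gene symbol, which should be checked against the example inputs in COSMOS_basic. The `prot` table and its `gene` and `log2fc` columns are hypothetical placeholders, not DIA-NN column names.

```r
# Hypothetical per-gene summary derived from a search-engine report.
# Column names are placeholders chosen for this example.
prot <- data.frame(
  gene = c("EGFR", "MAPK1", "", "MAPK1"),
  log2fc = c(1.5, -0.7, 0.3, -0.9),
  stringsAsFactors = FALSE
)

# Drop rows that have no gene symbol.
prot <- prot[!is.na(prot$gene) & prot$gene != "", ]

# Collapse duplicated symbols by averaging their values.
agg <- aggregate(log2fc ~ gene, data = prot, FUN = mean)

# Named numeric vector: names are gene symbols, values are the measurements.
prot_input <- setNames(agg$log2fc, agg$gene)
print(prot_input)
```

Averaging duplicates is only one possible choice; keeping the entry with the largest absolute value is another common option.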

transform66 commented 1 month ago

Dear Aurelien Dugourd,

I hope this message finds you well. I'm reaching out for some guidance on a data formatting task I'm working on.

I have a list of metabolites with their corresponding HMDB IDs, and I need to create a unique ID for each metabolite in a format such as Metab_HMDB****_c. I'm unsure how to efficiently match and integrate this information.

Questions:

Unique ID Generation: Can you suggest a quick way to generate these unique IDs in R?

Tissue Localization Matching: How can I efficiently match the metabolite IDs from my main dataset with the tissue localization data? Would a merge or join operation in R be the best approach?

I'm looking for high-level advice on these tasks, as I'm confident I can implement the solutions with some guidance.

Thank you very much for your time and help. I appreciate any insights you can offer.

Best regards,


adugourd commented 1 month ago

Hi,

See this line in the tutorial (https://saezlab.github.io/cosmosR/articles/NCI60_tutorial.html):

# Choose which compartment to assign to the metabolic measurements

metab_input <- prepare_metab_inputs(metab_input, c("c","m"))
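To make the idea concrete, here is a hand-rolled base R sketch of the same kind of renaming, plus a join against a localization table. It is not the cosmosR implementation: `add_compartments`, the `"Metab_"` prefix (copied from the pattern quoted earlier in this thread), and the `localization` table are all assumptions for illustration, and the exact ID format should be checked against the node names in the PKN.

```r
# Hypothetical measurements keyed by HMDB ID.
metab <- c(HMDB0000122 = 0.5, HMDB0000190 = -1.1)

# Hand-rolled helper (not cosmosR code): duplicate the vector once per
# compartment and rename each entry as <prefix><HMDB ID>_<compartment>.
add_compartments <- function(x, compartments, prefix = "Metab_") {
  out <- lapply(compartments, function(comp) {
    setNames(x, paste0(prefix, names(x), "_", comp))
  })
  unlist(out)
}

metab_ids <- add_compartments(metab, c("c", "m"))
print(names(metab_ids))

# Hypothetical tissue localization table keyed by the same HMDB IDs.
localization <- data.frame(
  hmdb = c("HMDB0000122", "HMDB0000999"),
  tissue = c("brain", "liver"),
  stringsAsFactors = FALSE
)
measured <- data.frame(
  hmdb = names(metab),
  value = unname(metab),
  stringsAsFactors = FALSE
)

# Left join: keep every measured metabolite, with NA where no
# localization entry exists.
annotated <- merge(measured, localization, by = "hmdb", all.x = TRUE)
print(annotated)
```

Using `all.x = TRUE` keeps unmatched metabolites visible as `NA` rather than silently dropping them, which makes gaps in the annotation easy to spot.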