Longo-Lab / de_dashboards

https://longo-lab.github.io/

Add TMT-Resilience proteomics set to the dashboard #13

Closed rbutleriii closed 10 months ago

rbutleriii commented 11 months ago

We need to generate the overlap enrichment plot for resilience associated proteomics modules in this paper:

https://doi.org/10.1016/j.mcpro.2023.100542

cyouh95 commented 11 months ago

"A total of 39 co-expression modules (M1-M39) were defined, ranging in size from 36 members (M39) as the smallest and 473 members (M1) as the largest (Fig. 2A and supplemental Tables S3 and S4)"

"Four modules were identified as significantly enriched with proteins conferring greater cognitive resilience: M22 Synapse, M5 Synapse, M36 Exocytosis, and M30 Mitochondria/ER (Fig. 3A and supplemental Table S10). In addition, four modules were found to be significantly enriched for proteins conferring less cognitive resilience: M11 Proteosome, M15 MAPK signaling, M32 GPCR signaling, and M16 Gluconeogenesis (Supplemental Table S11)."

Next steps:

  1. Using tables S3 and S4, map protein ID to gene ID
    • Is there a conversion file to use? (e.g., use mappings from reference/biodomains/TMT-LP.Mouse.Genes.txt?)
  2. Generate overlap plot showing all 39 modules, highlighting the 4 greater cognitive resilience and 4 less cognitive resilience modules
    • Separate plots for BA6 and BA37 regions if different?
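
For reference, a minimal sketch of one way an overlap enrichment value could be computed for a single module against a single gene set. The thread does not say which statistic the dashboard uses; the one-sided hypergeometric test here is an assumption, and `overlap_pvalue` and the toy gene names are hypothetical. Python is used only for illustration (the repo's scripts are R).

```python
from math import comb

def overlap_pvalue(module, gene_set, universe):
    """Return (overlap size, P(overlap >= observed)) under a hypergeometric null.

    Assumption: genes are drawn without replacement from `universe`;
    anything outside the universe is ignored.
    """
    universe = set(universe)
    module = set(module) & universe
    gene_set = set(gene_set) & universe
    k = len(module & gene_set)            # observed overlap
    N, K, n = len(universe), len(gene_set), len(module)
    total = comb(N, n)                    # all ways to pick a module of size n
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return k, tail / total

# Toy data: 10 background genes, a 3-gene module, a 4-gene set of interest.
universe = [f"g{i}" for i in range(10)]
k, p = overlap_pvalue(["g0", "g1", "g5"], ["g0", "g1", "g2", "g3"], universe)
print(k, round(p, 4))
```

Repeating this for all 39 modules would give one value per module to plot, with the 4 + 4 resilience modules highlighted.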
rbutleriii commented 11 months ago

We don't need a conversion table; we can use biomaRt. If you check the plasma_proteomics folder, I mapped the SomaLogic data to genes there, I believe using gene/protein symbols, but UniProt IDs can also work. Read through the scripts in the baseline folder to get a general idea of how that proteomics analysis goes. I think it is script 03 that does the conversion and fixing of proteins. Note that in some cases for SomaLogic, a measure is of a protein complex with multiple subunits that are each a separate gene (GeneA|GeneB). That may also be true for Seyfried's TMT data.

Unlike orthologs, protein subunits should all be present, and each should be counted (the equivalent of duplicate measures for gene orthologs). However, we don't know the absolute ratios beyond the previously defined structure of the complex (e.g., a tetramer with two copies of each subunit), so we just have to ascribe the measured value to each subunit, knowing one might be a little higher in quantity by mass because it's bigger, etc.
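
The subunit handling described above can be sketched as follows. This is an illustration only, not the code in script 03: `expand_subunits`, the `|` separator for complexes, and the toy values are assumptions, and Python is used just to show the logic (the repo's scripts are R).

```python
def expand_subunits(measurements, sep="|"):
    """Give every subunit gene of a complex the complex's measured value.

    measurements: list of (label, value), where a complex is written
    as "GeneA|GeneB" (assumed format). Returns (gene, value, source_label).
    """
    rows = []
    for label, value in measurements:
        for gene in label.split(sep):
            gene = gene.strip()
            if gene:  # skip empty pieces such as a trailing separator
                rows.append((gene, value, label))
    return rows

# Toy data: one two-subunit complex and one single-gene protein.
measurements = [("GeneA|GeneB", 2.5), ("GeneC", 1.0)]
for row in expand_subunits(measurements):
    print(row)
```

Keeping the source label on each row makes it possible to tell later which values were copied from a complex rather than measured per gene.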

rbutleriii commented 11 months ago

For 2, we could do separate plots. BA6 tends to be the more commonly examined, but if we have both plus TMT-AD, we might try to split them out into a proteomics tab versus a transcriptomics tab if there get to be too many buttons across the top. Technically TREAT-AD is multiomics, but we are shifting that to the GSEA plot.

cyouh95 commented 10 months ago
  1. Looks like each row corresponds to only 1 gene/protein (i.e., GeneSymbol|ProteinSymbol). Used biomaRt to add Ensembl gene ID (see reference/biodomains/TMT/save_TMT_data.R).
    • All proteins are unique, but some genes may correspond to multiple proteins (i.e., GeneA|ProteinA and GeneA|ProteinB) and, as a result, to multiple modules
    • From biomaRt, each gene symbol may also map to multiple Ensembl gene IDs (all mappings are kept)
    • Thus, there may be duplicate gene ID-to-module mappings, as noticed previously for the TMT data; these are eliminated here
    • Looks like the module mappings are the same for BA6 and BA37
  2. BA6 & BA37 plots are added here, but they are the same (see above)
    • Separated into a Transcriptomics enrichment tab (Treat-AD, Mostafavi, Milind, Wan) and a Proteomics enrichment tab (Tmt-AD, BA6, BA37)
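
The mapping and de-duplication described in item 1 can be sketched roughly like this. It is not the code in save_TMT_data.R: `gene_module_pairs` is a hypothetical helper, the lookup dictionary stands in for a biomaRt query result, and the Ensembl-style IDs are placeholders. Python is used only to show the logic.

```python
def gene_module_pairs(rows, symbol_to_ensembl):
    """Map 'GeneSymbol|ProteinSymbol' rows to unique (ensembl_id, module) pairs.

    All Ensembl IDs for a symbol are kept; identical (id, module) pairs
    produced by different proteins of the same gene are collapsed.
    """
    pairs = set()
    for label, module in rows:
        symbol = label.split("|", 1)[0]           # gene symbol before the "|"
        for ens_id in symbol_to_ensembl.get(symbol, []):
            pairs.add((ens_id, module))
    return sorted(pairs)

rows = [
    ("GeneA|ProteinA", "M1"),
    ("GeneA|ProteinB", "M1"),  # same gene and module: duplicate after mapping
    ("GeneA|ProteinC", "M5"),  # same gene, different module: kept
    ("GeneB|ProteinD", "M5"),
]
# Placeholder lookup; in practice this would come from the biomaRt query.
symbol_to_ensembl = {"GeneA": ["ENSG_A1", "ENSG_A2"], "GeneB": ["ENSG_B1"]}

for pair in gene_module_pairs(rows, symbol_to_ensembl):
    print(pair)
```

Symbols with no Ensembl match are silently dropped in this sketch; a real script would probably want to log them.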
rbutleriii commented 10 months ago

Go ahead and combine BA6/BA37 resilience into one button since they are the same.

cyouh95 commented 10 months ago

Combined into BA6/BA37 resilience tab

rbutleriii commented 10 months ago

Great Job!