broadinstitute / 2023_12_JUMP_data_only_vignettes

Collection of JUMP documentation and projects for internal and public consumption

Cluster GPR176, TSC22D1, DPAGT1, CHRM4: exploration for MorphMap paper (ORF+CRISPR) #15

Open AnneCarpenter opened 8 months ago

AnneCarpenter commented 8 months ago

This cluster was found in #7 as strong (+/-) correlation in both ORF and CRISPR but not (completely) strongly connected in the KG.

Looking across ORF and CRISPR plots there, there are a few other adjacent genes we should consider adding to this and then figure out a story for them (if needed, contacting a biologist who studies some subset of these genes).

Screenshot 2024-01-18 at 12 25 47 PM

tjetkaARD commented 8 months ago

Updating the above heatmaps as described in https://github.com/broadinstitute/2023_12_JUMP_data_only_vignettes/issues/7#issuecomment-1901252123

To include all the genes mentioned, the p-value replicability of CRISPR and the q-value replicability of ORF are shown below.

Interestingly, CDC42SE2 and CDC25C could also belong to the cluster, while MYT1 and MAP4K4 are consistently anti-correlated with it.

Heatmaps of CRISPR and ORF similarity image

AnneCarpenter commented 7 months ago

Waiting for chromosome-arm-corrected CRISPR data from @zahrahanifehlou before proceeding (probably it will not change the results but we don't want to waste time in case it does!)

(we are also waiting for tools from @tjetkaARD to make heatmaps that include KG+ and KG- genes to provide more context when we are zooming into a cluster with some KG- connections in it.)

That said, @jessica-ewald and @Zitong-Chen-16 could check: if these connections exist in the ORF-only data, we can proceed at least with that part. Still, it may be more efficient to wait for the CRISPR results because they may expand the gene list of interest.

niranjchandrasekaran commented 3 months ago

Notebook

The heatmap shows the percentile of the cosine similarities (1 → similar, 0 → anti-similar). The text is the maximum of the absolute KG scores (gene_mf__go, gene_bp_go, gene_pathway). I set a KG threshold of 0.4, as we had previously. If a connection has a score lower than this threshold, it is considered unknown. The KG scores were downloaded from Google Drive: ORF and CRISPR. The diagonal of the heatmap indicates whether a gene has a phenotype (False could also mean the gene is not present in the dataset).
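As a rough illustration of the quantities described above, the sketch below computes pairwise cosine similarities, converts them to percentiles, and flags a connection as known or unknown from its maximum absolute KG score. This is not the notebook's code: the function names, the toy inputs, and the choice to rank each pair against all off-diagonal pairs in the matrix are assumptions made for illustration.

```python
import numpy as np

KG_THRESHOLD = 0.4  # threshold quoted in the comment above


def cosine_similarity_matrix(profiles):
    """Pairwise cosine similarity between rows (one row per gene profile)."""
    profiles = np.asarray(profiles, dtype=float)
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / norms
    return unit @ unit.T


def to_percentile(sim):
    """Map each similarity to its percentile (0..1).

    Assumption: the reference distribution is every off-diagonal
    gene pair in the matrix.
    """
    n = sim.shape[0]
    off_diag = np.sort(sim[~np.eye(n, dtype=bool)])
    ranks = np.searchsorted(off_diag, sim, side="right")
    return ranks / off_diag.size


def max_abs_kg_score(scores):
    """Largest absolute value across the KG score types for one gene pair."""
    return max(abs(value) for value in scores.values())


def is_known(scores, threshold=KG_THRESHOLD):
    """A connection scoring below the threshold is treated as unknown."""
    return max_abs_kg_score(scores) >= threshold


if __name__ == "__main__":
    # Toy profiles, not real JUMP data.
    toy = [[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
    pct = to_percentile(cosine_similarity_matrix(toy))
    print(np.round(pct, 2))
    print(is_known({"gene_mf__go": 0.1, "gene_bp_go": -0.5, "gene_pathway": 0.0}))
```

Taking the absolute value before the maximum means a strongly negative KG score also counts as a known connection, which matches the "maximum of the absolute KG score" wording.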

Most of these connections are unknown, but there is strong evidence for them in both ORF and CRISPR.

ORF

ORF-connections-CDC25C-CDC42SE2-CHRM4-DPAT1-GPR176-MAP4K4-MYT1-TSC22D1

CRISPR

CRISPR-connections-CDC25C-CDC42SE2-CHRM4-DPAT1-GPR176-MAP4K4-MYT1-TSC22D1

tjetkaARD commented 3 months ago

One of the stories I see here (a bit speculative) is connected with the cell-cycle process, e.g. the oocyte maturation pathway:

image

Source: https://www.genome.jp/pathway/hsa04914+9088 or https://www.genome.jp/pathway/hsa04114+9088

AnneCarpenter commented 3 months ago

Agreed, let's pursue this. Let's be sure that we re-create the clusters based on the nearest neighbors of the genes involved, rather than including genes just because they were in the original clusters built from the old profiles.
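One possible reading of "nearest neighbors" here, sketched with hypothetical names and toy data: for each seed gene, rank every other gene by cosine similarity and keep the top k, then pool the results. Ranking by absolute similarity (so that strongly anti-correlated genes are also picked up) and the value of k are assumptions, not choices stated in this thread.

```python
import numpy as np


def nearest_neighbors(sim, genes, seed, k=2, use_abs=True):
    """Return the k genes most similar to `seed`, excluding the seed itself."""
    idx = genes.index(seed)
    row = np.asarray(sim, dtype=float)[idx].copy()
    scores = np.abs(row) if use_abs else row
    scores[idx] = -np.inf  # never return the seed as its own neighbor
    order = np.argsort(-scores, kind="stable")[:k]
    return [genes[i] for i in order]


def expand_cluster(sim, genes, seeds, k=2, use_abs=True):
    """Pool each seed's nearest neighbors into one cluster, seeds first."""
    cluster = list(seeds)
    for seed in seeds:
        for gene in nearest_neighbors(sim, genes, seed, k, use_abs):
            if gene not in cluster:
                cluster.append(gene)
    return cluster


if __name__ == "__main__":
    # Toy similarity matrix with placeholder gene names, not real JUMP data.
    genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
    sim = [
        [1.0, 0.9, -0.8, 0.1],
        [0.9, 1.0, -0.7, 0.2],
        [-0.8, -0.7, 1.0, 0.0],
        [0.1, 0.2, 0.0, 1.0],
    ]
    print(expand_cluster(sim, genes, ["GENE_A"], k=2))
```

Rebuilding the cluster this way depends only on the current profiles, so genes that were neighbors only under the old profiles drop out automatically.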

niranjchandrasekaran commented 1 month ago

Notebook

Here is the new cluster (ORFs):

ORF-connections-CHRM4-GPR176-LY6K-LZTS2-MYT1-SCAPER-SLC22A14-TSC22D1

niranjchandrasekaran commented 1 month ago

Notebook

This connection is not affected by plate layout.

ORF-plate-layout-LY6K-GPR176-SCAPER-CHRM4-SLC22A14-TSC22D1-LZTS2-MYT1

AnneCarpenter commented 1 month ago

@niranjchandrasekaran I'm confused why there's only one, final cluster plot - this was supposed to be strong in ORF+CRISPR. Even if not, we'd like to have both visible for writing up the paragraph.

As well, it's hard to write without knowing how we chose these genes... can you trace how we came up with the final ones shown in your plot? I think this stemmed from looking at lists of strongest connections in either ORF or CRISPR, perhaps?

niranjchandrasekaran commented 1 month ago

Notebook 1, Notebook 2

> I'm confused why there's only one, final cluster plot - this was supposed to be strong in ORF+CRISPR. Even if not, we'd like to have both visible for writing up the paragraph.

Looks like I forgot to plot the CRISPR connections. Here they are:

CRISPR-connections-CHRM4-GPR176-GRK2-MAPKAPK2-MYT1-PPME1-SQLE-TSC22D1

CRISPR-plate-layout-GRK2-GPR176-PPME1-CHRM4-SQLE-TSC22D1-MAPKAPK2-MYT1

> As well, it's hard to write without knowing how we chose these genes... can you trace how we came up with the final ones shown in your plot? I think this stemmed from looking at lists of strongest connections in either ORF or CRISPR, perhaps?

Tomasz had found that these were the genes that showed a strong connection in both ORF and CRISPR and were unknown: https://github.com/broadinstitute/2023_12_JUMP_data_only_vignettes/issues/7#issuecomment-1881054358. But this was using the old profiles. For the new profiles, I just expanded the cluster and removed the connections that are no longer present.

AnneCarpenter commented 1 month ago

Ok! Alán and I are finalizing the story. Could we have versions with just these 5 genes? We are dropping all the extras that don't appear in both ORF and CRISPR: GPR176, CHRM4, TSC22D1, MYT1, LZTS2

(we might drop LZTS2 depending on how they look, so if it's quick to make both the 4-gene and 5-gene versions, that's fine too)

niranjchandrasekaran commented 1 month ago

Notebook 1

LZTS2 is not present in the CRISPR dataset. Here are the remaining genes:

ORF

ORF-connections-GPR176-CHRM4-TSC22D1-MYT1-LZTS2

CRISPR

CRISPR-connections-GPR176-CHRM4-TSC22D1-MYT1

jgaetz-plex commented 6 days ago

@AnneCarpenter Knock-down of Musashi-1 (MSI1) in SU_MB002 medulloblastoma cells has been reported to lead to down-regulation of GPR176, CHRM4, and TSC22D1 and up-regulation of MYT1 by RNA-seq analysis, mirroring our observed correlations (Kameda-Smith et al.). MSI1 may play an essential role in nervous system development (Kameda-Smith et al.). Additionally, GPR176, CHRM4, and TSC22D1 are reported to be down-regulated in Alzheimer's Disease (AD) associated TREM2 variants (Liu et al., J Exp Med 2020).

jgaetz-plex commented 6 days ago

Plex search with GPR176, CHRM4, TSC22D1, MYT1, LZTS2 (human+mouse+rat).

AnneCarpenter commented 6 days ago

Great, this paragraph has been finished and put in the main text. Presumably @niranjchandrasekaran needs to assemble the final figures into the main paper, so leaving it to him to close this issue when ready.

Here's the draft text:

Connection of TSC22D1 to genes with neural function

Across both ORF and CRISPR data, we identified three strongly correlated genes (GPR176, CHRM4, TSC22D1) and a fourth gene, MYT1, consistently negatively correlated with that group (Figure Xx). GPR176, CHRM4, and MYT1 are known to be involved in neural development or upregulated in neurons 57,58 and their relationship is already known, according to moderate to high scores in the knowledge graph, whereas TSC22D1 has little known connection to any of these and is instead annotated as involved in apoptosis, tumor suppression, and cellular stress response. The morphological similarity we observed indicates that TSC22D1 warrants further investigation for neural functions - indeed, it is most highly expressed in brain tissue according to the Human Protein Atlas (https://www.proteinatlas.org/ENSG00000102804-TSC22D1/tissue). Furthermore, TSC22D1 is strongly anti-correlated with LZTS2 in our ORF dataset (LZTS2 was not present in our CRISPR dataset), and the two genes have shown an inverse mRNA expression relationship 57,59. LZTS2's role in the Wnt pathway 60 provides another tie to neuronal function 61. Using public datasets in the Plex web application, we found that knock-down of Musashi-1 (MSI1), a gene that may play an essential role in nervous system development, in SU_MB002 medulloblastoma cells has been reported to lead to down-regulation of GPR176, CHRM4, and TSC22D1 and up-regulation of MYT1 by RNA-seq analysis, mirroring our observed correlations 62. Additionally, GPR176, CHRM4, and TSC22D1 are reported to be down-regulated in TREM2 variants associated with Alzheimer's Disease (AD) 63. Together, these data indicate that investigating TSC22D1 in brain function would be worthwhile.