broadinstitute / 2023_12_JUMP_data_only_vignettes

Collection of JUMP documentation and projects for internal and public consumption

ZBTB16 & SLC39A1: exploration for MorphMap paper #12

Open AnneCarpenter opened 9 months ago

AnneCarpenter commented 9 months ago

From #7 we see that ZBTB16 & SLC39A1 are strongly anti-correlated in ORF data and CRISPR data, and it's novel (not seen in Evotec KG).

ZBTB16: Zinc finger and BTB domain-containing protein 16
SLC: Zinc transporter

Tons of ppl study SLCs in general, but none came up for SLC39A1 specifically. So I went the ZBTB16 route instead. Here is a link to active NIH grants that have ZBTB16 in them if we need a backup: https://reporter.nih.gov/search/AAbEtq4H8Uyo8KnZoZOIoQ/projects?projects=Active

I emailed heechang@umich.edu today:

Hello, I found that you might study ZBTB16 based on your NIH grant "Regulation of metabolic pathways in NKT cells" (it came up in a search though ZBTB16 isn't in the abstract). I wonder if our recent results might spark a collaboration?

To summarize a lot of work, we knocked down and overexpressed genes one by one in U2OS human cells, and then clustered the genes based on having similar morphological impact (using the Cell Painting microscopy assay that labels major organelles).

We found ZBTB16 (Zinc finger transcription factor) has the opposite morphological impact as SLC39A1 (Zinc transporter), in both overexpression and CRISPR knockdown data. It seems this connection is novel?

If so, we would be delighted to work together if you'd like to design an experiment to follow up on/confirm the connection and add it to a paper we are beginning to write up about the large dataset. It always helps in such papers to make a new discovery that can be confirmed (even if it's not a dramatic finding).

All the best, please let me know if you would like to talk further!

Cheers, Anne

AnneCarpenter commented 9 months ago

Her reply: "Hi Anne,

Thank you for reaching out to me. Your findings are interesting and zinc seems to be the link. Unfortunately, we no longer work with PLZF and terminated PLZF Ko and Tg lines and I am not familiar with the zinc transporter. My brief search showed that SLC39A1 expression seems to correlate with cancer. So as PLZF. So I don't think I can be a help.

Having said that, I have questions for you. You worked with U2OS human cells (I found out it is an osteosarcoma cell line). Since PLZF has different functions depending on the cell type, I am wondering if your finding is tumor cell specific or more generic. We worked with CD4 T cells and NKT cells and PLZF shows distinct functions even in T cell lineages. For example, PLZF is required for NKT cell development but not for CD4 T cells. In addition, what is the morphological impact? If it is the change of the morphology of U2OS, do you see the same opposing effect in different cell types such as in non-tumor cells?

I guess the key question is (from an immunologist's perspective) whether the link between the two has functional consequences. Testing it using an in vivo model would be the way.

Best regards, Cheong-Hee"

AnneCarpenter commented 9 months ago

I started to look for alternatives: Albert Bendelac recently passed away at a ripe old age. Sent email to postdoc bmacnabb at Caltech next. Next options to try if needed (deprioritized focus on sperm/uterus/malaria):
https://reporter.nih.gov/search/AAbEtq4H8Uyo8KnZoZOIoQ/project-details/10752940
https://reporter.nih.gov/search/AAbEtq4H8Uyo8KnZoZOIoQ/project-details/10450153
https://reporter.nih.gov/search/AAbEtq4H8Uyo8KnZoZOIoQ/project-details/10579308

tjetkaARD commented 9 months ago

Unfortunately, the initial discovery comes from the file mentioned in: https://github.com/broadinstitute/2023_12_JUMP_data_only_vignettes/issues/7#issue-2044375145 The original Excel file is not filtered with respect to replicability of ORFs, hence the issue propagated downstream.

In summary:

Therefore, it is only a not-very-strong result from CRISPR, and it is not confirmed in ORF.

IMHO, should be discarded.

AnneCarpenter commented 9 months ago

Hi! Thanks for following this up and letting us know the ORF result is nonexistent (I think you're saying the genes don't "have a phenotype" in ORF data).

Still: overall we are very happy with vignettes with supporting data only in CRISPR or only in ORFs, because there are many valid biological reasons why one might get a result in one or the other but not both. So if the result is solid in CRISPR we can proceed unless I am missing something important.

tjetkaARD commented 9 months ago

To clarify, I added additional details.

In my opinion this specific story is not very interesting, since:

Hence my recommendation to discard it.

AnneCarpenter commented 9 months ago

ok! If the researcher emails me I will figure out if there's something simple to see if this connection is real (only confirming the strongest hits is less convincing than if we also confirm a weaker one!) but otherwise will let it drop. Will close this issue and re-open if they do get back to me.

niranjchandrasekaran commented 5 months ago

[Image: connections]

I am reopening this issue because

Given the weak cosine similarity, we can dismiss this connection, but I am keeping the issue open just in case we would like to report it in the paper.

niranjchandrasekaran commented 5 months ago

Notebook

The heatmap shows the percentile of the cosine similarities (1 → similar, 0 → anti-similar). The text in each cell is the maximum of the absolute KG scores (gene_mf__go, gene_bp_go, gene_pathway). I set a KG threshold (like we previously had) of 0.4. If a connection has a score lower than this threshold, it is considered to be unknown. The KG scores were downloaded from Google Drive: ORF and CRISPR. The diagonal of the heatmap indicates whether a gene has a phenotype (False could also mean the gene is not present in the dataset).

There is evidence in both ORF and CRISPR. This connection appears to be known.
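For readers who want to follow the logic of the heatmap described above, here is a minimal sketch. It is not the code from the linked notebook: the function names, the toy profiles, and the example KG values are all hypothetical, and only the 0.4 threshold and the three KG column names come from the comment. It assumes percentiles are computed over all off-diagonal gene pairs.

```python
import numpy as np

KG_THRESHOLD = 0.4  # threshold quoted in the comment above


def cosine_similarity_matrix(profiles):
    """Pairwise cosine similarity between the rows of `profiles`."""
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    return unit @ unit.T


def percentile_of_pairs(sim):
    """Rank every off-diagonal pair into [0, 1] (0 = most anti-similar)."""
    n = sim.shape[0]
    iu = np.triu_indices(n, k=1)          # each unordered pair once
    values = sim[iu]
    ranks = values.argsort().argsort()    # rank of each pair, 0-based
    pct_values = ranks / max(len(values) - 1, 1)
    pct = np.full_like(sim, np.nan, dtype=float)  # diagonal stays NaN
    pct[iu] = pct_values
    pct[(iu[1], iu[0])] = pct_values      # mirror to the lower triangle
    return pct


def is_known(kg_scores, threshold=KG_THRESHOLD):
    """Known if the maximum absolute KG score reaches the threshold."""
    return max(abs(score) for score in kg_scores.values()) >= threshold


# Toy profiles (hypothetical, not JUMP data): B opposes A, C resembles A.
profiles = np.array([[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1]])
pct = percentile_of_pairs(cosine_similarity_matrix(profiles))

# Hypothetical KG values for one gene pair, keyed by the columns named above.
example_kg = {"gene_mf__go": 0.1, "gene_bp_go": -0.6, "gene_pathway": 0.2}
known = is_known(example_kg)
```

Taking the absolute value before the maximum means a strongly negative KG score also counts as prior knowledge, which matches "maximum of the absolute KG score" in the description.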

ORF

[Image: ORF-connections-SLC39A1-ZBTB16]

CRISPR

[Image: CRISPR-connections-SLC39A1-ZBTB16]

niranjchandrasekaran commented 3 months ago

I checked the KG scores to find out why this connection is previously known. As far as I can see, these two genes share these four GO BP annotations.

GOBP_EMBRYO_DEVELOPMENT
GOBP_APPENDAGE_DEVELOPMENT
GOBP_EMBRYONIC_MORPHOGENESIS
GOBP_SKELETAL_SYSTEM_DEVELOPMENT

AnneCarpenter commented 3 months ago

that's interesting - I guess it's only a 'known' connection because both are annotated as being involved in the same process but no direct relationship? If so, that would be interesting and sort of semi-unknown :)
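A quick way to see what "annotated as being involved in the same process" means in practice is a set intersection over each gene's GO BP terms. This is only a sketch: the four shared terms are the ones listed in this thread, while the `HYPOTHETICAL_*` entries are made-up placeholders standing in for each gene's other annotations, and `shared_annotations` is an illustrative helper, not part of any existing pipeline.

```python
def shared_annotations(annotations, gene_a, gene_b):
    """Return the sorted terms that both genes are annotated with."""
    return sorted(annotations[gene_a] & annotations[gene_b])


# The four shared terms come from the thread; HYPOTHETICAL_* terms are placeholders.
annotations = {
    "SLC39A1": {
        "GOBP_EMBRYO_DEVELOPMENT",
        "GOBP_APPENDAGE_DEVELOPMENT",
        "GOBP_EMBRYONIC_MORPHOGENESIS",
        "GOBP_SKELETAL_SYSTEM_DEVELOPMENT",
        "HYPOTHETICAL_TERM_ONLY_ON_SLC39A1",
    },
    "ZBTB16": {
        "GOBP_EMBRYO_DEVELOPMENT",
        "GOBP_APPENDAGE_DEVELOPMENT",
        "GOBP_EMBRYONIC_MORPHOGENESIS",
        "GOBP_SKELETAL_SYSTEM_DEVELOPMENT",
        "HYPOTHETICAL_TERM_ONLY_ON_ZBTB16",
    },
}

shared = shared_annotations(annotations, "SLC39A1", "ZBTB16")
```

An overlap like this only says the two genes sit under the same broad process terms; it carries no information about a direct interaction between them.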

niranjchandrasekaran commented 2 months ago

Notebook

This connection is not affected by plate layout.

ORF

[Image: ORF-plate-layout-SLC39A1-ZBTB16]

CRISPR

[Image: CRISPR-plate-layout-SLC39A1-ZBTB16]

AnneCarpenter commented 1 month ago

Prompt for Plex, if we need it: "Carefully review all 20 top results across all categories. These results represent perturbations which result in modulation of similar processes to a perturbation of interest. Based on the biological processes that are most shared across all of these results what do you think are the most likely biological processes that are modulated by the perturbation of interest? Please be specific."

jgaetz-plex commented 1 month ago

A Plex search with the 52 most similar ORFs to this cluster does not give any strong hypotheses for a functional connection.

AnneCarpenter commented 1 month ago

Ok, thank you. This one is written into the paper as follows and we needn't do more research on it (although Jed, it's a bit odd your search didn't uncover what's shown below in your GO annotation - oh, I guess because you searched 52 genes rather than just the 2 genes!):

"Our analysis also uncovered a strong correlation in both ORF and CRISPR profiles between genes SLC39A1 and ZBTB16 (Figure 8 b and c). Notably, while this connection lacks direct evidence in literature, it is supported by orthogonal evidence in the form of common Gene Ontology annotations, including Embryo development, Appendage development, Embryonic morphogenesis and Skeletal system development."

jgaetz-plex commented 1 month ago

Right. With the 52 gene set there are some shared GO annotations, but nothing I would consider strongly linking this group of genes.

AnneCarpenter commented 1 month ago

Great. Is it silly to search Plex with just the two genes to see if there are more concrete links between the two? I was surprised the Knowledge graph score was so high for genes that only share GO terms, so I wonder if we missed some other data/literature out there.

jgaetz-plex commented 1 month ago

It's not silly; it could potentially identify some connection we aren't aware of. For the 2-gene search, we might be interested in results with 2 direct connections in any of the gene/protein set categories (Protein Domains, Motifs, Publications, etc). There are very few results where both genes are connected. There are a few Publications results, but they are all topically broad and include large numbers of genes, making the co-occurrence of these two genes not notable.

AnneCarpenter commented 1 month ago

ok, great! We can leave the text as-is in the main paper gdoc, then.