monarch-initiative / monarch-legacy

Monarch web application and API
BSD 3-Clause "New" or "Revised" License

3d data viz exploration #1443

Open jmcmurry opened 7 years ago

jmcmurry commented 7 years ago

Hollis (@wrighth) is working on 3d visualization tooling for networks. He is looking for potential datasets, so Shannon McWeeney sent him our way. Attached is an example of the format he needs the data in. @kshefchek do you want to have a think about a subset that might be interesting? Hollis will send more info, but the example files and readme are here.

wrighth commented 7 years ago

Hello all, I wanted to follow up on Julie's post with a simple example of visualizing binding sites. It came up in the chat I had with her and Melissa about the tool, and I think it is a good illustration of what the tool can do. For this example I've pulled a number of ADHD-related SNPs from the GWAS table of the UCSC browser and laid them onto gene models and intergenic regions. The SNPs are color-coded by risk subtype: hyperactive in dark blue (at the "top" of the model), inattentive in green and combined type in magenta. I've also added tracks from the TFBS UCSC track for three factors with high-scoring sites near these regions: CEBPB in yellow, ZBTB7A in light blue and USF1 in pink, with height representing the UCSC "score" variable for the feature. I've also included a set of ChIA-PET Pol I DNA-DNA interactions pulled down from K562 cells, which are visualized as white pipes between the centers of the interacting regions.
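The encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record shapes, function names and coordinates below are hypothetical and do not come from the tool or its example files. Only the color assignments, the use of the UCSC score as height, and the pipes between region centers are taken from the description.

```python
from dataclasses import dataclass

# Color assignments as described in the comment; the dict layout is an assumption.
SNP_COLORS = {"hyperactive": "dark blue", "inattentive": "green", "combined": "magenta"}
TFBS_COLORS = {"CEBPB": "yellow", "ZBTB7A": "light blue", "USF1": "pink"}


@dataclass(frozen=True)
class Region:
    """A genomic interval (hypothetical record shape)."""
    chrom: str
    start: int
    end: int

    @property
    def center(self) -> float:
        # Midpoint of the interval, used as the attachment point for pipes.
        return (self.start + self.end) / 2


def snp_glyph(subtype: str, chrom: str, pos: int) -> dict:
    """One SNP marker, colored by ADHD risk subtype."""
    return {"color": SNP_COLORS[subtype], "chrom": chrom, "position": pos}


def tfbs_glyph(factor: str, region: Region, score: float) -> dict:
    """One binding site: color by factor, height taken from the UCSC score."""
    return {"color": TFBS_COLORS[factor], "position": region.center, "height": score}


def interaction_pipe(a: Region, b: Region) -> dict:
    """One ChIA-PET interaction: a white pipe between the two region centers."""
    return {"color": "white", "from": (a.chrom, a.center), "to": (b.chrom, b.center)}


# Toy coordinates, made up for illustration only.
pipe = interaction_pipe(Region("chr20", 1000, 2000), Region("chr20", 9000, 9500))
site = tfbs_glyph("CEBPB", Region("chr20", 0, 10), 800)
print(pipe)
print(site)
```

Keeping the color maps as plain dictionaries makes it easy to add further subtypes or factors without touching the glyph functions.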

The attached screenshots show a head-on "network view", a side/detail view of a potential enhancer-like region that interacts with the MAPRE gene, and a detail of the MAPRE-DNMT3B region, which has a number of ChIA-PET interactions. While I think it's clearer when you can play with the visualization live, it is hopefully obvious from the screenshots that:

a) Most of the SNPs near the ChIA-PET interactions are associated with hyperactive or combined types.

b) Few of the peaks for any of the TFBS regions are near inattentive-associated SNPs, but hyperactive/combined SNPs are often near CEBPB sites in particular.

c) Some hyperactive-associated SNPs are also combined-associated SNPs (as they appear on both tracks)
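Observations like (a) and (b) were read off the visualization, but a proximity count of the same kind could be sketched as follows. The function name, the 200 bp window and all positions are assumptions made up for illustration; they are not the data behind the screenshots, and the sketch assumes everything lies on a single chromosome.

```python
from collections import Counter


def snps_near_anchors(snps, anchor_centers, window):
    """Count SNPs per subtype lying within `window` bp of any anchor center."""
    counts = Counter()
    for subtype, pos in snps:
        if any(abs(pos - center) <= window for center in anchor_centers):
            counts[subtype] += 1
    return counts


# Toy (subtype, position) pairs and anchor centers, made up for illustration only.
toy_snps = [
    ("hyperactive", 1450),
    ("combined", 1600),
    ("inattentive", 50000),
    ("hyperactive", 9300),
]
toy_anchor_centers = [1500, 9250]

near = snps_near_anchors(toy_snps, toy_anchor_centers, window=200)
print(dict(near))
```

The same function works for TFBS peaks instead of interaction anchors by passing peak centers as `anchor_centers`.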

A lot of this information is represented in the zoomed-in detail of MAPRE-DNMT3B, for example.

I'm happy to answer any questions and I can send the raw data files around if those are of interest.

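Screenshots:

adhd_risk_snps_tfbs_binding_chia_pet_network_view: https://cloud.githubusercontent.com/assets/5316230/25405409/156124ae-29b8-11e7-8f27-31a22df70c97.png

adhd_risk_snps_tfbs_binding_chia_pet_enhancer_view: https://cloud.githubusercontent.com/assets/5316230/25405428/1a6b0a5a-29b8-11e7-8fc1-0a660a7fc985.png

adhd_risk_snps_tfbs_binding_chia_pet_mapre_dmnt3b_view: https://cloud.githubusercontent.com/assets/5316230/25405433/1d6cf95c-29b8-11e7-8ea1-d3d7ed66704f.png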

pnrobinson commented 7 years ago

I would be really interested in hearing more about this. We are currently writing code for the analysis of capture Hi-C data (not a part of Monarch, but potentially relevant), and are looking to correlate the patterns of reads with variants and looping. I think it would also be useful to integrate a lot more of this information. I do not really understand what the figure is showing, but is this available online anywhere?

-Peter

Peter Robinson

Professor of Computational Biology

The Jackson Laboratory for Genomic Medicine

10 Discovery Drive

Farmington, CT 06032

860.837.2095 t | 860.990.3130 m

peter.robinson@jax.org

www.jax.org

Robinson lab: https://robinsongroup.github.io/

The Jackson Laboratory: Leading the search for tomorrow's cures



wrighth commented 7 years ago

Chromatin looping in general is one of the major use cases I envisioned during the initial development, in fact! Right now the tool's not something I can distribute outside of OHSU, since it's being evaluated by the university's Tech Transfer department for licensing. But if you happen to have a sample of Hi-C interaction regions you'd be interested in visualizing, I'd be happy to run it through and see what it looks like. I can also talk with Tech Transfer and see if there's something we could work out.

kshefchek commented 7 years ago

@wrighth We're hoping to incorporate more data on regulatory regions (from the Ensembl regulatory db and JASPAR), and also more network and interaction data. I can't say for sure when this will be added, but I can add you as a watcher on some of these tickets if this would be of interest.

wrighth commented 7 years ago

@kshefchek Please do add me on those tickets, thanks!

pnrobinson commented 7 years ago

Thanks! But why don't we wait until you have figured out licensing issues. We are developing some new ideas that will be open source if they work out, and I am concerned about "polluting" our project with licensing issues at this point.

-Peter




jmcmurry commented 7 years ago

+1

wrighth commented 7 years ago

@pnrobinson Depending on how you're planning to license your project, what I've got might not be directly usable in any case, unfortunately. The underlying Unreal Engine I'm using has its own license, which is compatible with BSD-style licenses but not with GPL-style ones, so any of the "viral" licenses is out. That said, I don't think running data through the tool is itself problematic, so I'm still happy to do that.

pnrobinson commented 7 years ago

Didn’t mean to sound snotty, but at this point I am merely curious and would love to try out a web version of this visualization system whenever it becomes available!