blekhmanlab / compendium_website

Website for the Human Microbiome Compendium
http://microbiomap.org/

Network visualization #8

Closed vincerubinetti closed 1 year ago

vincerubinetti commented 1 year ago

Sean brought up the idea of having a network visualization if appropriate.

Examples/inspiration:
https://het.io/search/?source=20441&target=36436&metapaths=GeAeGpPW
https://adage.greenelab.com/genes?model=1&genes=290

We can have things like search boxes for filtering, sliders for thresholds, and clicking on nodes to expand/explode/get more info. The main limit to be cautious of is the number of nodes shown on the screen. We probably want to cap that at 100, both for browser performance and readability.
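As a rough sketch of the threshold slider and the 100-node cap described above: the function name `filter_network`, the `(source, target, weight)` edge tuples, and ranking nodes by total absolute edge weight are all assumptions made for illustration, not anything decided in this thread.

```python
from collections import defaultdict


def filter_network(edges, min_weight=0.0, max_nodes=100):
    """Keep edges at or above min_weight, then cap the number of nodes.

    edges: iterable of (source, target, weight) tuples (hypothetical shape).
    """
    # Threshold step: this is the value a slider would control.
    kept = [(a, b, w) for a, b, w in edges if abs(w) >= min_weight]

    # Rank nodes by total absolute edge weight (an assumed heuristic).
    strength = defaultdict(float)
    for a, b, w in kept:
        strength[a] += abs(w)
        strength[b] += abs(w)

    # Ties are broken by name so the result is deterministic.
    ranked = sorted(strength, key=lambda n: (-strength[n], n))
    top = set(ranked[:max_nodes])

    # Drop any edge that touches a node outside the cap.
    return [(a, b, w) for a, b, w in kept if a in top and b in top]
```

Any other ranking (for example, prevalence) could be swapped in for the weighted-degree heuristic; the point is only that the cap is applied after thresholding, so the slider and the limit do not fight each other.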

Sketches or screenshots, notated with ideas for interactive features, would be helpful for me to understand what we want to implement. A preview of the data structure you would provide me would also help.

rabdill commented 1 year ago

I would guess our network viz would be less complicated than the examples above, if only to avoid people doing actual data analysis on the website for now. @spgraham1 may have more details, but I believe the general idea is that each node would represent a single taxon, and each edge is defined by how many samples contain both taxa.

If we incorporate region, we could use the big map to filter the network. That data could look like this:

taxon1      taxon2         region                       samples
Prevotella  Phytobacter    Europe and Northern America  82123
Prevotella  Phytobacter    Central and Southern Asia    12
Prevotella  Lactobacillus  Europe and Northern America  12999
etc.
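
To illustrate how a region picked on the map could filter a table like the one above, here is a minimal sketch. It assumes the data arrives as tab-separated text with the four columns shown; the `edges_for_region` helper and the TSV format are hypothetical, and the rows are just the example rows from this comment.

```python
import csv
import io

# Example rows from the comment above; the tab delimiter is an assumption.
TSV = (
    "taxon1\ttaxon2\tregion\tsamples\n"
    "Prevotella\tPhytobacter\tEurope and Northern America\t82123\n"
    "Prevotella\tPhytobacter\tCentral and Southern Asia\t12\n"
    "Prevotella\tLactobacillus\tEurope and Northern America\t12999\n"
)


def edges_for_region(tsv_text, region):
    """Return (taxon1, taxon2, samples) tuples for a single region."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["taxon1"], row["taxon2"], int(row["samples"]))
        for row in reader
        if row["region"] == region
    ]


for edge in edges_for_region(TSV, "Europe and Northern America"):
    print(edge)
```

Keeping one row per taxon pair per region means the map selection is a plain row filter, with no aggregation needed in the browser.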

spgraham1 commented 1 year ago

This is pretty much what we'd have, but the edge weight would be defined by the correlation between the two taxa, rather than the number of samples that contain both taxa. I'd probably also include some info like relative abundance, prevalence, and family for each taxon that could be used to add more detail to the network. Here's an example of what we were originally thinking for the networks:

[Image: network_example]

In this version, each node is a taxon, and an edge indicates a significant correlation between two taxa. The edge weight corresponds to correlation strength. The size of the node indicates the prevalence of the taxon. The color of the node indicates the family of the taxon. I'll add some annotations of potential ways the user can interact, but this is what we were thinking for a static version.
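
One possible way to express that mapping (prevalence to node size, family to node color, correlation strength to edge width) is sketched below. The `Taxon` and `Correlation` records, the `style_network` function, the palette, and the size ranges are all hypothetical; the actual data structure was not settled in this thread.

```python
from dataclasses import dataclass


@dataclass
class Taxon:
    name: str
    family: str
    prevalence: float  # assumed: fraction of samples containing the taxon, 0..1


@dataclass
class Correlation:
    source: str
    target: str
    r: float  # assumed: correlation coefficient, -1..1


# Arbitrary placeholder palette; colors are reused if there are more families.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def style_network(taxa, correlations, min_radius=4.0, max_radius=20.0, max_width=6.0):
    """Turn taxa and correlations into drawable node and edge records."""
    # One color per family, assigned in alphabetical order.
    families = sorted({t.family for t in taxa})
    color = {fam: PALETTE[i % len(PALETTE)] for i, fam in enumerate(families)}

    # Node radius scales linearly with prevalence.
    nodes = [
        {
            "id": t.name,
            "radius": min_radius + t.prevalence * (max_radius - min_radius),
            "color": color[t.family],
        }
        for t in taxa
    ]

    # Edge width scales with the absolute correlation strength.
    edges = [
        {"source": c.source, "target": c.target, "width": abs(c.r) * max_width}
        for c in correlations
    ]
    return nodes, edges
```

The output is plain dictionaries so it could be serialized to JSON for whatever rendering library the website ends up using.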

cgreene commented 1 year ago

Defer from v1.0 per 8/1 meeting.

vincerubinetti commented 1 year ago

I think we can leave this open. It has an enhancement tag, and I could do it quite soon and quickly if someone gives me the data. Probably after the paper folks are done crunching for the paper release.

Unless people really don't want to do this on the website at all, ever. Maybe doing it in R (Sean's linked issue) is enough.

spgraham1 commented 1 year ago

I think we intentionally want to leave this off the website for now. My understanding (please correct if I'm wrong) is that this could be a big analysis further down the line, so we might not want to put it on the website for now.

vincerubinetti commented 1 year ago

We spoke about this again today. In line with #5, adding more features to the website than we have now might start to suggest to visitors that the website is a full and complete view of the data (when it is not), and thus keep them from downloading the full dataset/package.

I still like the idea of having a network viz because it is fun for me to implement (and perhaps a little harder to do in an aesthetically pleasing way with R?), so maybe we can still keep this idea on the table if we think of a way to make it very clear to the user that they should still download the R package for full analysis.

My gut tells me that we won't want to be adding this though, so closing for now.