Closed nlwashington closed 8 years ago
also, i think we should leave the door open for data overlays, so that (soon) when we want to add the "not" phenotypes onto the grid (as in, a matching disease scores well, but one of the phenotypes is definitively not shared), it could be indicated by, say, a bright red color!
Agreed. I will try to gather many of these design suggestions into a list of topics to consider for a 2.0 release.
Also, I think that the currently configured color ranges are actually giving the user false information. Because the scales have overlapping color ranges, it means that some scores that are identical have different colors, and some scores that have different values have the same color. This is really confusing when viewing all the data together. I think all the colors should be removed and only one scale should be used. Grayscale is sufficient, I think.
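To make the overlap problem above concrete, here is a minimal sketch. The species names, the hue ranges, and the `hue_for` helper are all assumptions made up for illustration; they are not phenogrid's actual configuration.

```python
# Hypothetical per-species hue ranges (degrees); chosen to overlap on purpose.
SCALES = {
    "human": (200.0, 260.0),
    "mouse": (230.0, 290.0),
}


def hue_for(score, hue_range):
    """Linearly map a score in [0, 1] onto a (low, high) hue range."""
    low, high = hue_range
    return low + score * (high - low)


# Identical scores end up with different colors...
human_mid = hue_for(0.5, SCALES["human"])   # hue 230
mouse_mid = hue_for(0.5, SCALES["mouse"])   # hue 260

# ...and different scores end up with the same color.
human_top = hue_for(1.0, SCALES["human"])   # hue 260
print(human_mid, mouse_mid, human_top)
```

With a single shared range for every species, equal scores would always map to equal colors, which is the behavior being asked for here.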
Ok. Are you against multiple disjoint color scales?
(Replying by email to Nicole Washington's comment of 26 Mar 2015, 3:19 PM.)
Harry Hochheiser University of Pittsburgh Department of Biomedical Informatics harryh@pitt.edu 412 648 9300
i am wishy-washy about disjoint color scales. if you really think disjoint color scales are the way to go, then perhaps only one color per species. but i think i am against it, because color doesn't convey enough extra information, and i think we should reserve it for conveying something else (i have ideas about this). not to mention that once we have > 7 species, it isn't tractable.
Ok. Let’s talk about color use next week, perhaps.
(Replying by email to Nicole Washington's comment of 26 Mar 2015, 4:54 PM.)
here is an example of issues with overlapping color scales across taxa giving false and/or confusing information.
Nicole, what else would we use color for if not for similarity?
here, i've mocked up an example so you can play with it in codepen. (not sure how stable it is since i don't have an acct.) here's screenshots:
in this example, genotypes/genes/diseases are columns and phenotypes are rows as in our standard phenogrid. rare phenotypes (high IC) that are closely related in each intersection point show up as big dark circles, whereas common phenotypes (low IC) that are not closely related are faint small circles. when you hover over the phenotype (as in the 2nd pic), you see the two numbers that go to populate the color and size (similarity/IC). the mockup has max=10 for each. there is one color scale.
you can play with the original code at this codepen
Nice! Not sure about showing the two numbers, maybe TMI to take in for an entire row? But overall I like the sizing and coloring
yeah, i don't think we should show both numbers to users... that was only here so you guys could see the combinations of values that went to produce the color and size. we can think about what might be the most interesting rollover value to show... it is kind of nice to see numbers of some kind, so long as we can explain them well.
This is definitely something that merits some discussion. I like the redundant use of both size and color for the IC/similarity content.
I have two concerns with the sizing.
I'll see if I can put something together this week to try the idea. Maybe it will work out...
the code doesn't have redundancy... the size and color saturation indicate different things:
color saturation = amount of similarity in the phenotypes between the row/column. for example, if the query and target are both annotated to the exact term "abnormality of the head", they would have a 100% similar phenotype, whereas if it was small eye (query) vs cloudy eye (target), their term in common might be "abnormal eye", and their similarity might be 70%.
size = IC of the term in common. sometimes the term in common is a very rare node (so its IC is ~ maxIC), but it could be a very commonly annotated term like "abnormality of the nervous system", which would be closer to minIC.
so, you could have a very commonly annotated term (like abnormality of the nervous system) with a very low IC be annotated exactly between the row/col... thus you would have a small (low IC) but darkly colored (highly similar) circle. on the other hand, you might have a very rare term in common (high IC) and originally annotated terms that are a fair distance apart (low similarity), and thus you would have a large lightly colored circle.
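A rough sketch of that encoding, following the definitions in this comment (saturation from similarity, size from IC). The `Circle` type, `encode_cell`, and the `max_radius` value are hypothetical names and numbers for illustration; only the max of 10 for each value comes from the mockup description.

```python
from dataclasses import dataclass

MAX_VALUE = 10.0  # the mockup uses max=10 for both similarity and IC


@dataclass
class Circle:
    radius: float      # encodes the IC of the term in common
    saturation: float  # encodes similarity: 0.0 (faint) to 1.0 (dark)


def encode_cell(similarity, ic, max_radius=12.0):
    """Linear encoding of one grid cell; max_radius is an assumed pixel size."""
    sim = min(max(similarity, 0.0), MAX_VALUE)
    info = min(max(ic, 0.0), MAX_VALUE)
    return Circle(radius=max_radius * info / MAX_VALUE,
                  saturation=sim / MAX_VALUE)


# Exact match on a very common term: small but dark.
common_exact = encode_cell(similarity=10.0, ic=2.0)
# Rare term in common, but the original terms are far apart: large but faint.
rare_distant = encode_cell(similarity=3.0, ic=9.0)
print(common_exact, rare_distant)
```

Because the two channels are independent, a cell can be large and faint or small and dark, so size and color are not redundant in this scheme.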
i would agree, though, that for this to work we'd have to increase the size of the cells/grid. i don't think that's such a bad option, particularly if there's some fancier scrolling and/or zooming that's also implemented. i would not be disappointed if we pitched the fixed-size grid.
i would also agree, that we'd have to test out different scaling methods... i only tried linear for this quick & dirty example, but i think we should try out others.
@nlwashington, thanks. I didn't read that closely regarding the coding. I think it might be hard for users to understand the similarity/IC distinction, but we can try
understood also about the size of the cells/grid. I think it's always good to start with as much data as possible, but we could consider zoom-in. It's also possible to start with fixed size and to vary as the amount of data gets smaller.
alternative scaling is relatively easy to implement...
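As a sketch of what trying other scaling methods could look like, the snippet below swaps the function that turns a value into a radius. The function names, the max of 10, and the `max_radius` value are assumptions for illustration, not code from the mockup.

```python
import math


def linear(value, max_value):
    return value / max_value


def sqrt_scale(value, max_value):
    # Radius grows with the square root, so circle *area* is proportional to value.
    return math.sqrt(value / max_value)


def log_scale(value, max_value):
    # log1p keeps a value of 0 mapped to 0 and compresses the high end.
    return math.log1p(value) / math.log1p(max_value)


SCALERS = {"linear": linear, "sqrt": sqrt_scale, "log": log_scale}


def radius(value, method="linear", max_value=10.0, max_radius=12.0):
    """Map a value in [0, max_value] to a radius using the chosen scaling."""
    clamped = min(max(value, 0.0), max_value)
    return max_radius * SCALERS[method](clamped, max_value)


for name in SCALERS:
    print(name, [round(radius(v, name), 2) for v in (0.0, 2.5, 5.0, 10.0)])
```

All three agree at the endpoints and differ only in how they spread out the middle values, so switching between them is a one-argument change.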
Noticed the Github color scheme.
can we please move to a single color scale?
+1 for a single color (e.g., shades of blue) or for standard MATLAB heat map colors, as they're well understood.
I still think we need multiple colors for the multi-organism overview.
i do not think we need different colors for each species. for the reasons above, i think one color is sufficient. the species are visually separated by the thick line and labeled in the header, so color does not provide additional information. it is pretty, but i don't think the gee-whiz factor overrides accuracy and functionality in the presentation. i am okay with leaving this as a configurable option for any grid installation (as in don't get rid of the generic code), but for the monarch usage of it on our pages, i think it should not be used. i'm sure that @mellybelly and @cmungall also have opinions on this.
Other than the overlap, what is the objection to the multiple colors? If we use three disjoint scales, would that address the problem? @yuanzhou, @davism84, can you look into an alternative range for the third scale?
I'm color illiterate but bearing that in mind:
There are more than 3 species out there; and other ways of grouping result sets (e.g human canonical patients/diseases vs actual patients). We will start adding more soon. I'm not sure what the algorithm is for n species/groupings.
Sometimes I get confused as to how similar phenogrid thinks the match is to the query. I think having a consistent color that is the same across all species would help with this.
But +1 to configurability
Agreed on configurability, @yuanzhou, and @midavis, please note as a needed configuration.
So, as I understand it, the concern is comparing the degree of similarity across species? That’s a fair concern. I hate to lose the appealing color, but if we are confusing people, that’s not good.
@yuanzhou, @midavis, let’s discuss on Monday.
(Replying by email to Chris Mungall's comment of Jul 25, 2015, 12:21 AM: https://github.com/monarch-initiative/phenogrid/issues/23#issuecomment-124795205)
@yuanzhou, can we consider going to one color scheme for all organisms and then using a subtle colored background for the organisms in the multi-organism view?
Updated to one color scale.
Will need to do more research to decide the final number of data classes and the actual color scheme.
Now the phenogrid looks like this:
Updated to the blue scale based on the feedback from the meeting; also got rid of the lightest color to improve readability.
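For reference, a single binned blue scale along these lines could be sketched as follows. The hex values, the 0 to 100 score range, and the `color_for` helper are illustrative assumptions, not the colors or class count actually committed.

```python
# Illustrative five-step blue ramp, lightest to darkest (assumed values).
BLUES = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]

# Drop the lightest class so every cell stays readable on a white background.
PALETTE = BLUES[1:]


def color_for(score, max_score=100.0, palette=PALETTE):
    """Bin a score into one of len(palette) equal-width classes."""
    fraction = min(max(score / max_score, 0.0), 1.0)
    index = min(int(fraction * len(palette)), len(palette) - 1)
    return palette[index]


# The same score gets the same color regardless of species.
for score in (0, 30, 50, 80, 100):
    print(score, color_for(score))
```

Changing the number of data classes then only means changing the length of the palette list.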
New screenshots for quick look:
having multiple colors, one for each species, is not a tractable solution, and the color in and of itself doesn't really convey any meaning. although initially pretty in the overview, it seems a little gimmicky to me.
maybe we should consider moving to a simple greyscale, or perhaps consider using color to indicate category? (somewhat related to https://github.com/monarch-initiative/monarch-app/issues/449). the difficulty might be when terms fall into multiple categories, but perhaps a split-colored cell can be used for those. (i feel like that's an issue we can take up down the line, maybe the user could eventually be allowed to choose the color for those cells/choose the category with which to group a term.)