Coordinate and tweak colors for likelihood ratios

justaddcoffee commented 4 years ago

A few suggested changes to the color scheme. Generally I've changed things so that colors in sparkline graphs and graphs for each disease match, and so that supporting evidence is green and contradicting evidence is red.

Specifically:

- changed all LRs supporting diagnosis to green, and all LRs contradicting diagnosis to red (these previously were blue and red respectively)
- removed border around the rect for LRs of phenotypes but preserved border for genotypes, to provide a visual cue that these are different LRs
- changed color of diamond icon for excluded items that neither support nor contradict diagnosis to grey
- changed color of overall disease posterior probability to green (from orange)

Some screenshots with these changes highlighted below. @pnrobinson @jmcmurry What do you guys think?

[Screenshot: Screen Shot 2019-11-13 at 12 11 41 PM]

jmcmurry commented 4 years ago

This is an improvement overall. I think it would be best to offset the genotype LR into its own column for the sparklines, perhaps adjacent to the gene name with composite log column last? Either way there should be adequate horizontal real estate.

Also, I'm confused as to why there are not more datapoints in the sparklines below; implies that the sample patient only provided a single phenotype for the profile? (The second bar is for the gene.) OK if that is the case, but just making sure there isn't more to it.

justaddcoffee commented 4 years ago

Thanks @jmcmurry!

> Also, I'm confused as to why there are not more datapoints in the sparklines below; implies that the sample patient only provided a single phenotype for the profile?

Yep, that's correct - this is a fake phenopacket containing only 1 HPO term. (I am just using this for testing.)

jmcmurry commented 4 years ago

I think we need to have some feedback in this case where the phenotype profile is scant. The posttest probability is very high for lots and lots of diseases. Can we have some kind of confidence bars on that or something, @pnrobinson? Barring that (snicker), let's at least have some message like "Your profile matched X diseases; to better discriminate between these, consider adding one or more of the following phenotypes..."

pnrobinson commented 4 years ago

@jmcmurry clearly the posttest probability is only an estimate, and there are many cases where the estimate is wrong. The idea of adding or removing phenotypes is a clear way of improving things but beyond the scope of this first publication (it is a research project on its own, e.g., the improved annotation sufficiency score). The statistical framework used does not provide a way to calculate confidence intervals. I think the best we can do is to provide documentation about the use and limitations of the algorithm (I want to take a first stab at this by next week actually). In any case, for many of the actual phenopackets, the results actually are what one would want. Sometimes the correct diagnosis is up there but gets a low posttest probability. I think there are additional algorithms to compensate for this, but again, that would be too much for the initial presentation of the algorithm, and we are already at a state-of-the-art level!
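
For anyone reading along, here is a minimal sketch of how a posttest probability falls out of a generic likelihood ratio framework: the pretest odds are multiplied by each LR and the result is converted back to a probability. The function name, the pretest probability, and the LR values are hypothetical and chosen only for illustration; this is not LIRICAL's actual code.

```python
import math


def posttest_probability(pretest_prob, likelihood_ratios):
    """Combine a pretest probability with a list of likelihood ratios.

    Hypothetical helper for illustration only (not taken from LIRICAL).
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    # Convert the probability to odds, multiply by every LR, convert back.
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * math.prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Assumed numbers: a pretest probability of 0.001 and a single
# phenotype with LR = 1000 already gives a posttest probability near 0.5.
single_term = posttest_probability(0.001, [1000.0])
```

With only one term in the profile, every disease annotated to that term can receive a similarly large LR, which is one way to see why many diseases end up with a high estimate.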

pnrobinson commented 4 years ago

@justaddcoffee @jmcmurry -- I like the new color scheme but one reason I went for blue instead of red was the issue about red-green color blindness. We should probably pick a color scheme that is as accessible as possible. Ideas?

justaddcoffee commented 4 years ago

Hadn't considered colorblindness. I think the red bars fail gracefully though. See for example here for how the page looks: https://www.color-blindness.com/coblis-color-blindness-simulator/

Worst case, for red-blind people, the red and green both just appear as yellow, and the signal we are trying to convey (i.e. the length and direction of the bars) is still conveyed.

[Screenshot: Screen Shot 2019-11-13 at 3 29 14 PM]

jmcmurry commented 4 years ago

Responding to two threads here:

- Color: Overall I'm not too worried about the colors, since the up/down directions of the bars convey the same info. That said I do prefer the aesthetic of the original blue v red. What it doesn't convey as well as red/green is positive / negative.
- Confidence intervals: Peter, no prob; it is fine if we are not ready for sophisticated refinement. However, I still think it worthwhile to include some kind of actionable message like "Add more phenotypes here to improve the differential diagnosis". Thoughts?

pnrobinson commented 4 years ago

"Add more phenotypes here to improve the differential diagnosis". -> The thing is, there may be cases when the diagnosis is correct even with one phenotype, and so it really requires pretty sophisticated treatment. I think that LIRICAL would lend itself to a dynamic desktop app or more sophisticated web app that would allow users to work with the data in the way you are saying, but again that would be a few years of dev time plus would require algorithmic work... I think there is not much more we can do now for the first publication....

pnrobinson commented 4 years ago

As to color: I definitely think that the grey diamonds are better. I am on the fence about red/blue vs red/green. I did choose the colors from a "cool" palette, and maybe we can go for these red/greens instead of the "pure" colors (https://nanx.me/ggsci/reference/pal_npg.html)

justaddcoffee commented 4 years ago

> That said I do prefer the aesthetic of the original blue v red. What it doesn't convey as well as red/green is positive / negative.

That was my thinking - green/red provides easily digestible cues about supporting/contradicting evidence, whereas the meaning of blue/red is a little opaque. But, I defer to your design wisdom @jmcmurry!

jmcmurry commented 4 years ago

> I think there is not much more we can do now for the first publication....

Agreed it would take some work. While it would be great to detect this basement scenario, we don't even have to be that sophisticated for now. I would still recommend some standard message like:

"If there are several well-matched candidates, consider redoing the analysis with a larger or more specific phenotype profile"

... or something to that effect. Would that be OK?

justaddcoffee commented 4 years ago

> I did choose the colors from a "cool" palette, and maybe we can go for these red/greens instead of the "pure" colors

Here is a plot with a "cool" red ("#e64b35ff") and green ("#00a087ff") from that site. Thoughts?
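
In case it helps the discussion, here is a minimal sketch of the mapping as I understand it: the sign of the log10 LR picks the bar color, using the two hex values above for supporting and contradicting evidence and grey for items that do neither. The function and constant names are hypothetical, and this is illustrative Python rather than the code that actually renders the page.

```python
import math

# Hex values quoted in this thread; "grey" for neutral items is an assumed CSS name.
SUPPORT_COLOR = "#00a087ff"     # "cool" green: LR supports the diagnosis
CONTRADICT_COLOR = "#e64b35ff"  # "cool" red: LR contradicts the diagnosis
NEUTRAL_COLOR = "grey"          # neither supports nor contradicts


def lr_color(likelihood_ratio: float) -> str:
    """Pick a display color from the sign of log10(LR) (hypothetical helper)."""
    if likelihood_ratio <= 0:
        raise ValueError("a likelihood ratio must be positive")
    log_lr = math.log10(likelihood_ratio)
    if log_lr > 0:
        return SUPPORT_COLOR
    if log_lr < 0:
        return CONTRADICT_COLOR
    return NEUTRAL_COLOR
```

Because the bar direction follows the same sign, the color stays redundant with the bar direction, which is the property that matters for colorblind readers.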

[Screenshot: Screen Shot 2019-11-13 at 4 12 57 PM]

pnrobinson commented 4 years ago

I am happy but would defer to Julie!

pnrobinson commented 4 years ago

I am going to accept this PR now, and we can change the color in the future as needed.

jmcmurry commented 4 years ago

Great compromise on colors, all. Intuitive but not as garish. I still would recommend we stick the gene LR in its own col though.

justaddcoffee commented 4 years ago

> Great compromise on colors, all. Intuitive but not as garish. I still would recommend we stick the gene LR in its own col though.

Here's a tiny PR for this: https://github.com/TheJacksonLaboratory/LIRICAL/pull/478

jmcmurry commented 4 years ago

@pnrobinson can you comment on this part pls?

> "If there are several well-matched candidates, consider redoing the analysis with a larger or more specific phenotype profile"
>
> ... or something to that effect. Would that be OK?

pnrobinson commented 4 years ago

@jmcmurry I think that belongs in the tutorial. There will be cases when there is nothing more that can be done, e.g., a phenotype-only analysis of, say, Fanconi anemia, where there are lots of diseases that are clinically identical. Therefore, it is not good to hard-bake this advice into the HTML.

TheJacksonLaboratory / LIRICAL

Coordinate and tweak colors for likelihood ratios #472