Closed justaddcoffee closed 4 years ago
This is an improvement overall. I think it would be best to offset the genotype LR into its own column for the sparklines, perhaps adjacent to the gene name with composite log column last? Either way there should be adequate horizontal real estate.
Also, I'm confused as to why there are not more datapoints in the sparklines below; implies that the sample patient only provided a single phenotype for the profile? (the second bar is for the gene). Ok if that is the case, but just making sure there isn't more to it.
Thanks @jmcmurry!
Also, I'm confused as to why there are not more datapoints in the sparklines below; implies that the sample patient only provided a single phenotype for the profile?
Yep, that's correct - this is a fake phenopacket containing only 1 HPO term. (I am just using this for testing.)
I think we need to have some feedback in this case where the phenotype profile is scant. The posttest probability is very high for lots and lots of diseases. Can we have some kind of confidence bars on that or something, @pnrobinson? Barring that (snicker), let's at least have some message like 'your profile matched X diseases; to better discriminate between these, consider adding one or more of the following phenotypes...'
@jmcmurry clearly the posttest probability is only an estimate, and there are many cases where the estimate is wrong. The idea of adding or removing phenotypes is a clear way of improving things but is beyond the scope of this first publication (it is a research project on its own, e.g., the improved annotation sufficiency score). The statistical framework used does not provide a way to calculate confidence intervals. I think the best we can do is to provide documentation about the use and limitations of the algorithm (I want to take a first stab at this by next week, actually). In any case, for many of the actual phenopackets, the results are what one would want. Sometimes the correct diagnosis is up there but gets a low posttest probability. I think there are additional algorithms to compensate for this, but again, that would be too much for the initial presentation of the algorithm, and we are already at a state-of-the-art level!
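For readers following the thread, here is a minimal sketch of how a posttest probability can be derived from a pretest probability and a set of likelihood ratios. It uses the textbook odds form (posttest odds = pretest odds × product of the LRs) and assumes the individual LRs can simply be multiplied. The function name and parameters are hypothetical; this is not taken from the LIRICAL code base and may differ from what LIRICAL actually does.

```python
import math


def posttest_probability(pretest_probability, likelihood_ratios):
    """Textbook LR arithmetic (hypothetical helper, not LIRICAL's API).

    Converts the pretest probability to odds, multiplies by the
    composite likelihood ratio, and converts back to a probability.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    # Assumption: the individual LRs are combined by simple multiplication.
    composite_lr = math.prod(likelihood_ratios)
    posttest_odds = pretest_odds * composite_lr
    return posttest_odds / (1.0 + posttest_odds)
```

The result is a point estimate only; as noted above, this kind of calculation does not by itself yield a confidence interval.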
@justaddcoffee @jmcmurry -- I like the new color scheme but one reason I went for blue instead of red was the issue about red-green color blindness. We should probably pick a color scheme that is as accessible as possible. Ideas?
Hadn't considered colorblindness. I think the red bars fail gracefully though. See for example here for how the page looks: https://www.color-blindness.com/coblis-color-blindness-simulator/
Worst case, for red-blind people, the red and green both just appear as yellow, and the signal we are trying to convey (i.e. the length and direction of the bars) is still conveyed.
Responding to two threads here:
Color: Overall I'm not too worried about the colors, since the up/down direction of the bars conveys the same info. That said I do prefer the aesthetic of the original blue v red. What it doesn't convey as well as red/green is positive / negative.
Confidence intervals: Peter, no prob; it is fine if we are not ready for sophisticated refinement. However, I still think it worthwhile to include some kind of actionable message like "Add more phenotypes here to improve the differential diagnosis". Thoughts?
"Add more phenotypes here to improve the differential diagnosis". -> The thing is, there may be cases when the diagnosis is correct even with one phenotype and so it really requires pretty sophisticate treatment. I think that LIRICAL would lend itself to a dynamic desktop app or more sophisticated web app that would allow users to work with the data in the way you are saying, but again that would be a few years of dev time plus would require algorithmic work...I think there is not much more we can do now for the first publication....
As to color: I definitely think that the grey diamonds are better. I am on the fence about red/blue vs red/green. I did choose the colors from a "cool" palette, and maybe we can go for these red/greens instead of the "pure" colors (https://nanx.me/ggsci/reference/pal_npg.html)
That said I do prefer the aesthetic of the original blue v red. What it doesn't convey as well as red/green is positive / negative.
That was my thinking - green/red provides easily digestible cues about supporting/contradicting evidence, whereas the meaning of blue/red is a little opaque. But, I defer to your design wisdom @jmcmurry!
I think there is not much more we can do now for the first publication....
Agreed it would take some work. While it would be great to detect this basement scenario, we don't even have to be that sophisticated for now. I would still recommend some standard message like:
"If there are several well-matched candidates, consider redoing the analysis with larger or more specific phenotype profile"
... or something to that effect. Would that be OK?
I did choose the colors from a "cool" palette, and maybe we can go for these red/greens instead of the "pure" colors
Here is a plot with a "cool" red ("#e64b35ff") and green ("#00a087ff") from that site. Thoughts?
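As a rough illustration of the convention being discussed (green for supporting evidence, red for contradicting evidence), here is a minimal sketch that picks a bar color from the sign of the log10 likelihood ratio. The two hex values are the ones quoted above; the function name, the constant names, and the neutral grey for LR = 1 are assumptions made for this example and are not taken from the LIRICAL code.

```python
import math

SUPPORT_COLOR = "#00a087ff"     # "cool" green quoted in this thread
CONTRADICT_COLOR = "#e64b35ff"  # "cool" red quoted in this thread
NEUTRAL_COLOR = "#808080ff"     # assumption: grey when LR == 1


def bar_color(likelihood_ratio):
    """Hypothetical helper: choose a sparkline bar color for one LR."""
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    log_lr = math.log10(likelihood_ratio)
    if log_lr > 0:
        return SUPPORT_COLOR     # bar points up: evidence supports the diagnosis
    if log_lr < 0:
        return CONTRADICT_COLOR  # bar points down: evidence argues against it
    return NEUTRAL_COLOR
```

Because the bar direction already encodes the sign, the color is redundant with it, which is why the scheme still works for red-green colorblind readers.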
I am happy but would defer to Julie!
I am going to accept this pr now, and we can change the color in the future as needed
Great compromise on colors, all. Intuitive but not as garish. I still would recommend we stick the gene LR in its own col though.
Great compromise on colors, all. Intuitive but not as garish. I still would recommend we stick the gene LR in its own col though.
Here's a tiny PR for this https://github.com/TheJacksonLaboratory/LIRICAL/pull/478
@pnrobinson can you comment on this part pls?
"If there are several well-matched candidates, consider redoing the analysis with larger or more specific phenotype profile"
... or something to that effect. Would that be OK?
@jmcmurry I think that belongs in the tutorial. There will be cases when there is nothing more that can be done, e.g., a phenotype-only analysis of, say, Fanconi anemia, where there are lots of diseases that are clinically identical. Therefore, it is not good to hard-bake this advice into the HTML.
A few suggested changes to the color scheme. Generally I've changed things so that colors in sparkline graphs and graphs for each disease match, and so that supporting evidence is green and contradicting evidence is red.
Specifically:
Some screenshots with these changes highlighted below. @pnrobinson @jmcmurry What do you guys think?