Open adamabeshouse opened 3 years ago
A couple of product considerations @cBioPortal/product:
1) What is the best way to plot a sample attribute vs a patient attribute? Would it make sense to plot values by sample and "distribute" a patient value to all of its samples? This is how we do it for clinical tracks in the oncoprint.
2) For scatter plots, we have simplified by using density. For box plots, this is a little trickier, because each "box" is really just 1-dimensional, not 2-dimensional. The second dimension is just random jitter added to make it look nicer. What's the best way to handle this when simplifying into a heatmap?
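To make point 1 concrete, here is a minimal sketch of "distributing" a patient-level value onto each of that patient's samples. The `Sample` and `SamplePoint` shapes and the `distributePatientValue` helper are hypothetical names for illustration only, not the actual cBioPortal data model or oncoprint code.

```typescript
// Hypothetical shapes for illustration; not the real cBioPortal model.
interface Sample {
    sampleId: string;
    patientId: string;
}

interface SamplePoint {
    sampleId: string;
    sampleValue: number;
    patientValue: string;
}

// Build one plot point per sample by copying the patient-level value
// onto every sample that belongs to that patient.
function distributePatientValue(
    samples: Sample[],
    sampleValues: Map<string, number>, // keyed by sampleId
    patientValues: Map<string, string> // keyed by patientId
): SamplePoint[] {
    const points: SamplePoint[] = [];
    for (const s of samples) {
        const sampleValue = sampleValues.get(s.sampleId);
        const patientValue = patientValues.get(s.patientId);
        // Samples missing either value cannot be plotted, so skip them.
        if (sampleValue === undefined || patientValue === undefined) {
            continue;
        }
        points.push({ sampleId: s.sampleId, sampleValue, patientValue });
    }
    return points;
}
```

One consequence of this choice is that a patient with several samples contributes several points carrying the same patient value, so counts in the plot are per sample, not per patient.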
Hi @adamabeshouse, I agree with your solution for 1. It is straightforward and always works. I am not sure I understand your second question. We are trying to bring the plots tab into the study view, right? So why are we not just displaying a box plot when a user selects a categorical and a numerical variable? What is different here compared to the plots tab?
@Sjoerd-van-Hagen for the FGA vs mutation count plot, we decided to use a density plot (aka 2D histogram) because the chart is so small that full-size points become impossible to read and interact with for large studies. So that's my question: how do we handle a box plot given how small the chart is?
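For readers unfamiliar with the density approach mentioned here, a 2D histogram just counts points per grid cell and colors each cell by its count. The sketch below is an assumption about how such binning could look; `histogram2d` and its parameters are hypothetical and do not describe the existing chart's implementation.

```typescript
// Count points into a grid of yBins rows by xBins columns.
// counts[row][col] is the number of points whose y falls in `row`
// and whose x falls in `col`.
function histogram2d(
    points: { x: number; y: number }[],
    xRange: [number, number],
    yRange: [number, number],
    xBins: number,
    yBins: number
): number[][] {
    const counts: number[][] = [];
    for (let r = 0; r < yBins; r++) {
        const row: number[] = [];
        for (let c = 0; c < xBins; c++) {
            row.push(0);
        }
        counts.push(row);
    }

    const binIndex = (v: number, range: [number, number], n: number): number => {
        const lo = range[0];
        const hi = range[1];
        if (hi === lo) {
            return 0; // degenerate range: everything goes in the first bin
        }
        const i = Math.floor(((v - lo) / (hi - lo)) * n);
        // Clamp so that v === hi lands in the last bin; values outside
        // the range are also pushed into the nearest edge bin.
        return Math.min(n - 1, Math.max(0, i));
    };

    for (const p of points) {
        counts[binIndex(p.y, yRange, yBins)][binIndex(p.x, xRange, xBins)] += 1;
    }
    return counts;
}
```

Because the output size depends only on the bin counts, rendering cost stays fixed no matter how many samples the study has, which is what makes this workable in a small chart.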
@adamabeshouse would a smaller box plot not work? We do not have to draw the individual points if it is too small. When the user scales it up to a certain point we can start showing them. Does that make sense?
@Sjoerd-van-Hagen this is exactly my question: how do we not draw the individual points in a box plot? Maybe a violin plot or something like that? Or a 1-dimensional heatmap that is thickened for easier visibility?
I think I would just draw the box plot without the individual points in the initial version, but perhaps the rest of @cBioPortal/product has a stronger preference for another solution?
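As a rough illustration of what "a box plot without the points" needs per category, the sketch below computes only the summary statistics. It assumes linear-interpolation quartiles and the common 1.5 × IQR whisker rule; `BoxStats`, `quantile`, and `boxStats` are hypothetical names, and the plots tab may use a different convention.

```typescript
interface BoxStats {
    q1: number;
    median: number;
    q3: number;
    whiskerLow: number;  // smallest value inside the lower fence
    whiskerHigh: number; // largest value inside the upper fence
    outlierCount: number; // values outside the fences, not drawn one by one
}

// Quantile by linear interpolation on an ascending-sorted, non-empty array.
function quantile(sorted: number[], p: number): number {
    const pos = (sorted.length - 1) * p;
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

function boxStats(values: number[]): BoxStats | undefined {
    if (values.length === 0) {
        return undefined;
    }
    // Copy before sorting so the caller's array is not mutated.
    const sorted = values.slice().sort((a, b) => a - b);
    const q1 = quantile(sorted, 0.25);
    const median = quantile(sorted, 0.5);
    const q3 = quantile(sorted, 0.75);
    const iqr = q3 - q1;
    const lowFence = q1 - 1.5 * iqr;
    const highFence = q3 + 1.5 * iqr;
    const inside = sorted.filter(v => v >= lowFence && v <= highFence);
    return {
        q1,
        median,
        q3,
        whiskerLow: inside[0],
        whiskerHigh: inside[inside.length - 1],
        outlierCount: sorted.length - inside.length,
    };
}
```

Drawing from these six numbers keeps the chart cheap for large studies; an option to show individual points could be layered on later once the chart is large enough.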
That's a good point, I hadn't thought of that.
Good idea. But it would be nice if there was an option to enable the individual points...
Stemming from the discussion: users will be able to select any 2 data sources and plot them against each other. These plots will be interactive in the same way as other study view charts, so they can be used to filter the cohort.
[x] The first step is to expand Mutation Count vs FGA to arbitrary numerical clinical vs numerical clinical attributes. https://github.com/cbioportal/cbioportal/issues/8867
[ ] From there, we'll build out to categorical clinical vs numerical clinical (violin plot table). https://github.com/cBioPortal/icebox/issues/307
[ ] From there, we can do categorical clinical vs categorical clinical (stacked bar).
Finally, from there we can start to build out into other data types. (TBD)
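The staged plan above boils down to picking a chart kind from the datatypes of the two selected attributes. A minimal sketch follows, assuming clinical attributes carry a datatype of either `"NUMBER"` or `"STRING"`; the `ChartKind` names and `chooseChartKind` are hypothetical, and other data types (still TBD) are not covered.

```typescript
// Assumed attribute datatypes: numerical or categorical.
type AttrDatatype = "NUMBER" | "STRING";

// Hypothetical chart kinds matching the three steps in the checklist.
type ChartKind = "DENSITY_SCATTER" | "VIOLIN_PLOT_TABLE" | "STACKED_BAR";

function chooseChartKind(a: AttrDatatype, b: AttrDatatype): ChartKind {
    if (a === "NUMBER" && b === "NUMBER") {
        return "DENSITY_SCATTER"; // numerical vs numerical
    }
    if (a === "STRING" && b === "STRING") {
        return "STACKED_BAR"; // categorical vs categorical
    }
    return "VIOLIN_PLOT_TABLE"; // one categorical, one numerical
}
```

The function is symmetric in its arguments, so the order in which the user picks the two attributes does not change the chart kind.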