Closed George-Zipperlen closed 7 years ago
For reference, see the related PR https://github.com/cognoma/machine-learning/pull/91.
It is not clear to me what the TOTAL column means after the 3rd line does the divide operation. Is it now some kind of average?
The Total column indicates what percentage of the samples for a given disease have a mutation in any of the displayed genes. In other words, the total column is always guaranteed to be the max frequency for a given disease.
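To make that concrete, here is a minimal sketch of how such a column could be computed with pandas. The notebook's actual code is not reproduced in this thread, so the data, the gene names (`GENE_A`, `GENE_B`), and the variable names below are all hypothetical; only the interpretation (share of samples per disease with a mutation in at least one gene) comes from the description above.

```python
import pandas as pd

# Hypothetical sample-by-gene mutation matrix (1 = mutated, 0 = not mutated).
mutations = pd.DataFrame(
    {
        "disease": ["BRCA", "BRCA", "BRCA", "BRCA", "LUAD", "LUAD"],
        "GENE_A": [1, 0, 1, 0, 1, 0],
        "GENE_B": [1, 1, 0, 0, 0, 0],
    }
)
genes = ["GENE_A", "GENE_B"]

# A sample counts toward TOTAL if it has a mutation in at least one gene.
mutations["TOTAL"] = mutations[genes].max(axis=1)

# Count mutated samples per disease, then divide by the number of samples
# in that disease to get a percentage.
counts = mutations.groupby("disease")[genes + ["TOTAL"]].sum()
n_samples = mutations.groupby("disease").size()
percent = counts.div(n_samples, axis=0) * 100

print(percent)
```

In this toy data, TOTAL for BRCA is neither the sum nor the average of the per-gene percentages, because one sample carries mutations in both genes and is counted only once. It is, however, never smaller than any single gene's percentage in the same row.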
Thanks @dhimmel
Hi, @dhimmel and @gwaygenomics
I hope that this is the right place to ask a question about the code in 3.TCGA-MLexample_Pathway.ipynb
I'm working on converting the first heatmap, "percentage of different mutations across different cancer types", from seaborn to Altair/vega-lite, continuing from the fine work of @superkostya.
I have figured out how to use different color maps in vega-lite, e.g. the viridis color map.
The 'TOTAL' column is not really a gene, and because it is a sum, its values are much larger than the gene expression values, causing the differences between other values to be less apparent in the display.
I can move this column to the right of the chart with some slicing and dicing, but I'm not sure it really belongs.
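For illustration, the "slicing and dicing" might look roughly like the sketch below. The frame `heatmap_df` and its values are made up for the example and are not taken from the notebook; the second half shows one possible alternative, which is to keep 'TOTAL' out of the data that drives the color scale.

```python
import pandas as pd

# Hypothetical disease-by-gene percentages, with TOTAL as the first column.
heatmap_df = pd.DataFrame(
    {"TOTAL": [75.0, 50.0], "GENE_A": [50.0, 50.0], "GENE_B": [50.0, 0.0]},
    index=["BRCA", "LUAD"],
)

# Option 1: move TOTAL to the right-hand edge of the chart.
gene_columns = [col for col in heatmap_df.columns if col != "TOTAL"]
reordered = heatmap_df[gene_columns + ["TOTAL"]]

# Option 2: plot only the genes, so the color scale is not stretched by TOTAL.
genes_only = heatmap_df[gene_columns]
color_max = genes_only.to_numpy().max()

print(reordered)
print("upper bound for the gene color scale:", color_max)
```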
Here are the relevant lines from cognoma/machine-learning/3.TCGA-MLexample_Pathway.ipynb which create the 'TOTAL' column:
It is not clear to me what the TOTAL column means after the 3rd line does the divide operation. Is it now some kind of average?
Thanks for any clarification.