Open skanwal opened 1 month ago
I'm afraid it's more complex than that. Both the plots and the tables present median values of Z-scores calculated for individual groups/patients. The key functions to look at are:
exprGroupsStats_geneWise.R ( https://github.com/umccr/RNAsum/blob/main/R/exprGroupsStats_geneWise.R ):
exprTable.R ( https://github.com/umccr/RNAsum/blob/main/R/exprTable.R ):
cdfPlot.R ( https://github.com/umccr/RNAsum/blob/main/R/cdfPlot.R ):
I feel that the plots need to present values in the context of the entire cohort, while the tables provide stats (median values) for the group/patient.
Re the table legend, it could mention that the values refer to the MEDIAN Z-score (or percentile) in the reference cohort and the patient, e.g. for a BRCA case in the Z-score tab it could look like this (changes/additions in italics font):
In the BRCA (TCGA), Patient and Diff columns, the RED colour range indicates relatively high expression (median Z-score) values and the BLUE colour range indicates relatively low expression (median Z-score) values in the individual sample group. The BLANK cells with missing values indicate genes with no/low expression. The Diff (Patient vs BRCA (TCGA)) column illustrates the difference between median Z-scores in the patient sample and the reference cancer cohort for each mutated gene...
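To make the Diff column in the proposed legend concrete, here is a minimal Python sketch of how such a value could be derived for one gene. It is not the RNAsum implementation (see the R functions linked above for that): the helper `gene_z_scores`, the expression values, and the choice of a gene-wise Z-score computed over the pooled cohort and patient samples with the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev


def gene_z_scores(values):
    """Hypothetical helper: Z-scores for one gene across all samples.

    Assumes the values are not all identical (standard deviation > 0).
    """
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


# Made-up expression values for a single gene.
cohort = [2.0, 4.0, 4.0, 6.0]   # reference cohort samples
patient = [9.0]                 # a single patient sample

# Assumption: Z-scores are computed gene-wise over cohort and patient together.
z = gene_z_scores(cohort + patient)
cohort_z, patient_z = z[:len(cohort)], z[len(cohort):]

# Each table column would hold the median Z-score of its group,
# and Diff would be the patient median minus the cohort median.
cohort_median = median(cohort_z)
patient_median = median(patient_z)
diff = patient_median - cohort_median

print(f"cohort median Z:  {cohort_median:.3f}")
print(f"patient median Z: {patient_median:.3f}")
print(f"Diff:             {diff:.3f}")
```

With a single patient sample the patient "median" is just that sample's Z-score, so under these assumptions Diff shows how far the patient sits from the middle of the cohort in Z-score units.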
Expand the legend to clarify that the plots use median values, because box plots describe medians. This is as opposed to the tables, which use the mean.
The mean will differ from the median for genes that have low expression across the cohort.
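As a quick illustration of why the two statistics can diverge, the Python sketch below compares the mean and median of a made-up set of Z-scores for a single gene. The numbers are hypothetical and only mimic a gene that is low in most samples with a couple of high outliers; they are not taken from RNAsum or TCGA.

```python
from statistics import mean, median

# Hypothetical Z-scores for one gene across a cohort: most samples sit
# slightly below zero, while two outlier samples are far above it.
z_scores = [-0.4, -0.4, -0.3, -0.3, -0.3, -0.2, -0.2, 3.1, 4.0]

cohort_mean = mean(z_scores)      # pulled upwards by the two outliers
cohort_median = median(z_scores)  # stays with the bulk of the samples

print(f"mean   = {cohort_mean:.3f}")
print(f"median = {cohort_median:.3f}")
```

In this toy case the mean is positive while the median is negative, so a table built on one statistic and a plot built on the other could appear to disagree about the same gene.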