BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

Diagnostic plots #36

Closed Nicolai-vKuegelgen closed 6 years ago

Nicolai-vKuegelgen commented 6 years ago

After some discussion with people from our group, we found some issues (or possible improvements) with the way the diagnostic plots in the DESeq report are currently displayed:

1) PCA analysis: The PCA plot doesn't show the percentage of variance explained by the plotted PCs; the axes are only labeled PC1 / PC2. To interpret the plot properly, it's necessary to know the relative importance of each PC, so the axis labels should include the percentage of variance explained by that PC, e.g. PC1 [80%] / PC2 [15%].

2) MA plot: The y-axis of the MA plot is cut at ±2, which, depending on the dataset, can leave quite a number of data points piled up at that value. I guess it may be necessary to set a cutoff for the axis to prevent outliers from having a strong influence, but maybe the threshold could be a bit more dynamic (e.g. chosen to capture at least 80-90% of the data).

3) Correlation plots: In most RNA-seq experiments we expect the replicates to be very similar, but even case vs. control in the same cell type will have a relatively strong correlation. Therefore a color axis from -1 to 1 uses only a small part of its range, and color differences are barely visible. To make the similarities or differences between the included samples more visible, it might be good to change the colour scale or include the R² / correlation value for each pair in the plot.
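As a rough sketch of point 1), the axis labels could be built from the per-component variances (in R, for example, the squared `sdev` values returned by `prcomp`). This is Python for illustration only, not the pipeline's code; the function name `pc_axis_labels` and the example numbers are made up.

```python
def pc_axis_labels(variances, n_axes=2):
    """Return axis labels such as 'PC1 [80%]' for the first n_axes components.

    `variances` must hold the variance of every principal component
    (not only the plotted ones), ordered from PC1 onwards.
    """
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [
        f"PC{i + 1} [{100 * v / total:.0f}%]"
        for i, v in enumerate(variances[:n_axes])
    ]


# Made-up variances for three components, for illustration only.
print(pc_axis_labels([8.0, 1.5, 0.5]))  # ['PC1 [80%]', 'PC2 [15%]']
```

The percentages are relative to the total variance over all components, so the plotted axes will usually not add up to 100%.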
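For point 2), one possible way to make the cutoff dynamic is to pick the smallest symmetric limit that still contains a chosen fraction of the log2 fold changes. The sketch below is only an assumption about how this could be done (Python for illustration; `symmetric_ylim` and its `coverage` parameter are hypothetical names, and the values are invented).

```python
import math


def symmetric_ylim(log2_fold_changes, coverage=0.9):
    """Smallest limit L such that at least `coverage` of the finite
    values satisfy |value| <= L. The plot would then use (-L, +L)."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    # Ignore missing / non-finite fold changes (e.g. genes with no counts).
    magnitudes = sorted(
        abs(x) for x in log2_fold_changes
        if x is not None and math.isfinite(x)
    )
    if not magnitudes:
        raise ValueError("no finite fold changes given")
    # Number of points that must fall inside the limits.
    k = max(1, math.ceil(coverage * len(magnitudes)))
    return magnitudes[k - 1]


# Invented fold changes with one extreme outlier.
lfc = [0.1, -0.2, 0.3, -0.5, 1.0, -1.5, 2.5, -3.0, 4.0, 10.0]
limit = symmetric_ylim(lfc, coverage=0.9)
print((-limit, limit))  # (-4.0, 4.0)
```

With this choice the single outlier at 10 no longer stretches the axis, while far fewer points end up drawn at the border than with a fixed ±2.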
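For point 3), both suggestions can be sketched in a few lines: take the colour limits from the observed off-diagonal correlations instead of the fixed -1 to 1 range, and prepare a text label with the correlation value for each cell. Again this is illustrative Python under made-up names (`offdiag_color_limits`, `cell_labels`) and made-up numbers, not what the report does.

```python
def offdiag_color_limits(corr):
    """Return (min, max) over the off-diagonal entries of a square
    correlation matrix given as a list of rows. The diagonal is skipped
    because it is always 1 and would otherwise fix the upper limit."""
    n = len(corr)
    values = [corr[i][j] for i in range(n) for j in range(n) if i != j]
    if not values:
        raise ValueError("need at least two samples")
    return min(values), max(values)


def cell_labels(corr, digits=2):
    """Format every correlation value as text to print inside its cell."""
    return [[f"{v:.{digits}f}" for v in row] for row in corr]


# Invented correlations between three samples.
corr = [
    [1.00, 0.98, 0.91],
    [0.98, 1.00, 0.93],
    [0.91, 0.93, 1.00],
]
print(offdiag_color_limits(corr))  # (0.91, 0.98)
print(cell_labels(corr)[0])        # ['1.00', '0.98', '0.91']
```

Stretching the colour scale over 0.91 to 0.98 makes the replicate structure visible, but it also exaggerates small differences, so printing the values alongside the colours helps keep the plot honest.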

Nicolai-vKuegelgen commented 6 years ago

Updated this issue from only the PCA point to include more feedback I've gotten from the lab.

borauyar commented 6 years ago

Improve presentation of diagnostic plots: PCA, MA, corrplot