kevinblighe / EnhancedVolcano

Publication-ready volcano plots with enhanced colouring and labeling

Error when overriding shape scheme with custom key-value pairs + Problem overriding color of data points #79

Closed: chainorato closed this issue 3 years ago

chainorato commented 3 years ago

Hi,

Thank you for developing a useful package.

I get the following error when I try to override the shape of two data points while the points are colored by a gradient:

Error: Discrete value supplied to continuous scale

My plotting code is:

keyvals.shape <- ifelse(rownames(res)=="A", 17, ifelse(rownames(res)=="B", 18, 16))
keyvals.shape[is.na(keyvals.shape)] <- 16
names(keyvals.shape)[keyvals.shape==16] <- 16
names(keyvals.shape)[keyvals.shape==17] <- 17
names(keyvals.shape)[keyvals.shape==18] <- 18
EnhancedVolcano(res, lab=lab_italics, x='log2FoldChange', 
               y='padj', title='XXX', pCutoff=5e-2, FCcutoff=0,
               labSize=5, shapeCustom=keyvals.shape, colCustom=NULL,
               pointSize=4, colGradient=c('red3','royalblue'), colAlpha=0.5,
               selectLab=select_label, drawConnectors=TRUE, widthConnectors=0.5,
               parseLabels=TRUE, boxedLabels=FALSE, arrowheads=TRUE,
               legendPosition='right', maxoverlapsConnectors=Inf) + coord_flip()

I think the error happens within the following code of EnhancedVolcano:

} else if (is.null(colCustom) & !is.null(shapeCustom)) {
  if (is.null(colGradient)) {
    plot <- ggplot(toptable, aes(x = xvals, y = -log10(yvals))) + th +
      guides(
        colour = guide_legend(order = 1, override.aes = list(size = legendIconSize)),
        shape = guide_legend(order = 2, override.aes = list(size = legendIconSize))) +
      geom_point(
        aes(color = Sig, shape = factor(names(shapeCustom))),
        alpha = colAlpha, size = pointSize, na.rm = TRUE) +
      scale_color_manual(
        values = c(NS = col[1], FC = col[2], P = col[3], FC_P = col[4]),
        labels = c(NS = legendLabels[1], FC = legendLabels[2],
                   P = legendLabels[3], FC_P = legendLabels[4]),
        drop = legendDropLevels) +
      scale_shape_manual(values = shapeCustom)
  } else {
    plot <- ggplot(toptable, aes(x = xvals, y = -log10(yvals))) + th +
      guides(
        shape = guide_legend(order = 2, override.aes = list(size = legendIconSize))) +
      geom_point(
        aes(color = Sig, shape = factor(names(shapeCustom))),
        alpha = colAlpha, size = pointSize, na.rm = TRUE) +
      scale_colour_gradient(
        low = colGradient[1], high = colGradient[2],
        limits = colGradientLimits, breaks = colGradientBreaks,
        labels = colGradientLabels)
    scale_shape_manual(values = shapeCustom)
  }

The variable factor(names(shapeCustom)) is not on a continuous scale (is.numeric(keyvals.shape) returned FALSE).
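For context, ggplot2 raises this error whenever a discrete variable is mapped to an aesthetic that has been given a continuous scale. The sketch below reproduces the message without EnhancedVolcano, using a hypothetical toy data frame (`df`, `group` are made-up names). It may be relevant to the quoted `else` branch, where `color = Sig` is paired with `scale_colour_gradient`, but that reading is an assumption and has not been checked against the package source.

```r
library(ggplot2)

# Hypothetical toy data; 'group' is a factor (discrete), like a significance class
df <- data.frame(
  x = c(1, 2, 3),
  y = c(3, 1, 2),
  group = factor(c("NS", "FC", "P"))
)

# Map the factor to colour while forcing a continuous gradient scale
p <- ggplot(df, aes(x = x, y = y, colour = group)) +
  geom_point() +
  scale_colour_gradient(low = "red3", high = "royalblue")

# The error is raised when the plot is built, not when it is defined
msg <- tryCatch(
  {
    ggplot_build(p)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The exact wording of the message differs slightly between ggplot2 versions, so the sketch prints whatever is raised rather than assuming a fixed string.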

Here are the code and the plot from when I did not override the shapes of the data points. The black marks show where the two genes I would like to highlight are located.

EnhancedVolcano(res, lab=lab_italics, x='log2FoldChange',
               y='padj', title='XXX', pCutoff=5e-2, FCcutoff=0,
               labSize=5, colCustom=NULL,
               pointSize=4, colGradient=c('red3','royalblue'), colAlpha=0.5,
               selectLab=select_label, drawConnectors=TRUE, widthConnectors=0.5,
               parseLabels=TRUE, boxedLabels=FALSE, arrowheads=TRUE,
               legendPosition='right', maxoverlapsConnectors=Inf) + coord_flip()

[Image: Screen Shot 2021-05-20 at 18 55 12]

Instead of overriding just the shapes of those two data points, I have also thought about changing their colors, because the plot is dense and they will probably still be invisible even if the shapes are changed. However, I think colCustom would override the color gradient. Shading and encircling would look odd because the two genes are unrelated. I also found that arrows were not generated for either data point.

Is there a way to change both the shapes and the colors of those points and to put them in front of all other data points? Any suggestions would be appreciated. Thank you in advance.

kevinblighe commented 3 years ago

Hi, does this problem disappear when you don't activate colGradient=c('red3','royalblue')? The color gradient functionality needs extensive testing, and I am aware that not many people are using it. There are many possible combinations of shape, color, etc.

If you want to ensure that certain genes are visible, then you should re-arrange your input data and put these genes at the end (the last rows) of the input data-frame. This means that they will be plotted last.
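A minimal sketch of that re-arrangement in base R, assuming the genes of interest are identified by row name. The toy `res` data frame and the names `highlight`, `is_highlight`, and `res_ordered` are hypothetical stand-ins, not part of EnhancedVolcano.

```r
# Hypothetical stand-in for the results table 'res' from the question
res <- data.frame(
  log2FoldChange = c(1.2, -0.4, 2.1, 0.3),
  padj = c(0.01, 0.5, 0.001, 0.2),
  row.names = c("A", "geneX", "B", "geneY")
)

# Genes that should be drawn on top of the others
highlight <- c("A", "B")

# FALSE sorts before TRUE, so highlighted rows move to the end;
# rows within each group keep their original relative order
is_highlight <- rownames(res) %in% highlight
res_ordered <- res[order(is_highlight), , drop = FALSE]

print(rownames(res_ordered))
```

Any per-row vectors built from the original row order, such as the labels passed to lab or the keyvals.shape vector, would need to be rebuilt from res_ordered so that they still line up with the rows.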

There is also the encircle functionality, but I am not 100% content with it so far: https://github.com/kevinblighe/EnhancedVolcano#encircle--highlight-certain-variables