FloWuenne / scFunctions

Functions for single cell data analysis

Volcanoplot from DE_Seurat #15

Closed by levinhein 3 years ago

levinhein commented 3 years ago

Hello. I'm interested in making the volcano plot from https://rdrr.io/github/FloWuenne/scFunctions/src/R/DE_Seurat.R. In the script, it is commented out as follows:

  # ## Plot volcano plot
  # volcano_plot <- ggplot(this_cluster_de_genes,aes(logFC,-log(PValue))) +
  #   geom_point(size=2) +
  #   geom_point(data = subset(res_table,(logFC < neg_log_FC_thresh) & (PValue < q_value_thresh)),col="red") +
  #   geom_point(data = subset(res_table,(logFC > pos_log_FC_thresh ) & (PValue < q_value_thresh) ),col="green") +
  #   geom_text_repel(
  #     data = subset(res_table, (logFC > pos_log_FC_thresh | logFC < neg_log_FC_thresh) & (PValue < q_value_thresh)),
  #     aes(label = subset(res_table, (logFC > pos_log_FC_thresh | logFC < neg_log_FC_thresh) & (PValue < q_value_thresh))$gene),
  #     size = 5,
  #     box.padding = unit(0.35, "lines"),
  #     point.padding = unit(0.3, "lines")
  #   ) +
  #   ggtitle(res$comparison)+
  #   theme_light()
  #
  # volcano_plot
  # ggsave(volcano_plot,file=paste("../DE_edgeR/cluster-",this_cluster,"_volcano_plot.svg",sep=""))

I followed the whole script and, at the volcano plot part, removed the comment markers (#) to run it. It then failed with errors about missing objects such as "res_table", "logFC" and "neg_log_FC_thresh"; these appear only in this part of the script and are not defined anywhere else. Seurat FindMarkers() outputs DEGs where the rows are genes and the columns are "p_val", "avg_log2FC", "pct.1", "pct.2", "p_val_adj".

May I ask where the needed variables should come from so that I can create the volcano plot of the DEGs? Thank you!

FloWuenne commented 3 years ago

Hi @levinhein,

I haven't used this function in a while, so it would be good to have a quick look at your data to be able to help you. The fact that the code is commented out also means that I likely didn't update the variable names.

From reading my code, I think you just need to change res_table to this_cluster_de_genes and it might work. But I believe the column names returned by the Seurat FindMarkers function might have changed over time as well...

Could you show the first few lines (head()) of your table you are using for this_cluster_de_genes please? That way I should be able to help you get a volcano plot for your data!

Thanks!
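A quick way to check whether a table has the columns the plot needs, before adapting the code. This is only a sketch: `de_table` is a made-up stand-in for a FindMarkers() result, `required_cols` is a hypothetical helper variable, and the column names are the ones reported in this thread (older Seurat releases may use a different name for the fold-change column, so adjust as needed).

```r
# Toy stand-in for a FindMarkers() result; all values are made up
de_table <- data.frame(
  p_val      = c(1e-12, 0.2),
  avg_log2FC = c(1.5, -0.1),
  pct.1      = c(0.8, 0.3),
  pct.2      = c(0.2, 0.3),
  p_val_adj  = c(1e-8, 1),
  row.names  = c("GeneA", "GeneB")
)

# Columns the volcano plot relies on (assumed names, taken from this thread)
required_cols <- c("avg_log2FC", "p_val_adj")

# Stop early with a readable message if any of them is absent
missing_cols <- setdiff(required_cols, colnames(de_table))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Inspect the first rows, as requested above
head(de_table)
```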

levinhein commented 3 years ago

Hello, thank you for your reply!

Here's the head(KD.CD4T) of the DEGs (that I use for this_cluster_de_genes) obtained from Seurat FindMarkers(). Thank you!

[Image: screenshot of head(KD.CD4T)]

FloWuenne commented 3 years ago

OK, so if this is your table and it is called KD.CD4T, something like the code below should hopefully work. You would have to set q_value_thresh, neg_log_FC_thresh and pos_log_FC_thresh before plotting!

library(ggplot2)
library(ggrepel)

# FindMarkers() typically keeps gene names as row names; copy them into a
# 'gene' column for labelling unless the table already has one
if (!"gene" %in% colnames(KD.CD4T)) KD.CD4T$gene <- rownames(KD.CD4T)

# Genes passing both the fold-change and the adjusted p-value cutoffs
sig_genes <- subset(KD.CD4T, (avg_log2FC > pos_log_FC_thresh | avg_log2FC < neg_log_FC_thresh) & (p_val_adj < q_value_thresh))

volcano_plot <- ggplot(KD.CD4T, aes(avg_log2FC, -log(p_val_adj))) +
  geom_point(size = 2) +
  # Significant down-regulated genes in red, up-regulated genes in green
  geom_point(data = subset(sig_genes, avg_log2FC < neg_log_FC_thresh), col = "red") +
  geom_point(data = subset(sig_genes, avg_log2FC > pos_log_FC_thresh), col = "green") +
  geom_text_repel(
    data = sig_genes,
    aes(label = gene),
    size = 5,
    box.padding = unit(0.35, "lines"),
    point.padding = unit(0.3, "lines")
  ) +
  theme_light()

volcano_plot
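The three cutoffs used in the plotting code are not defined anywhere in this thread. A minimal sketch of setting them follows, assuming illustrative values (an adjusted p-value cutoff of 0.05 and an absolute log2 fold change of 0.5, neither of which comes from the original script) and a made-up table `toy_de` standing in for FindMarkers() output. It classifies genes with the same conditions the plot uses for colouring.

```r
# Illustrative cutoffs; these values are assumptions, not taken from scFunctions
q_value_thresh    <- 0.05
pos_log_FC_thresh <- 0.5
neg_log_FC_thresh <- -0.5

# Toy stand-in for a FindMarkers() result (gene names as row names); values are made up
toy_de <- data.frame(
  p_val_adj  = c(1e-10, 0.8, 1e-4, 0.03),
  avg_log2FC = c(1.2, 0.1, -0.9, 0.2),
  row.names  = c("GeneA", "GeneB", "GeneC", "GeneD")
)

# Label each gene using the same conditions as the red/green points in the plot
toy_de$direction <- "ns"
toy_de$direction[toy_de$avg_log2FC > pos_log_FC_thresh & toy_de$p_val_adj < q_value_thresh] <- "up"
toy_de$direction[toy_de$avg_log2FC < neg_log_FC_thresh & toy_de$p_val_adj < q_value_thresh] <- "down"

# Count genes per category
table(toy_de$direction)
```

Tabulating the categories like this before plotting is a cheap sanity check that the chosen cutoffs select a sensible number of genes to label.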

Hope this works! Let me know if there are still any issues!

levinhein commented 3 years ago

Got it now. Thank you for the help!

FloWuenne commented 3 years ago

Glad to hear you have it working now @levinhein :)!