Closed npokorzynski closed 4 months ago
Yes, since release 4.10 (I believe) clusterProfiler adds additional information that is present in the KEGG database to the output of gseKEGG. This includes the category and subcategory of the pathway, and also the species name (although I am not sure whether this is actually due to changes made by KEGG themselves).
Anyway, to remove these from the results and subsequent plots I simply do this (quick and dirty) using gsub (on your Mg10k object; so before computing the pairwise similarity):
Mg10k@result$Description <- gsub(pattern = " - Salmonella enterica subsp. enterica serovar Typhimurium 14028S",
                                 replacement = "",
                                 Mg10k@result$Description,
                                 fixed = TRUE)
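If you want something slightly more defensive than the gsub call above, here is a minimal sketch that removes the species suffix only when it sits at the very end of a description. It runs on a mock character vector rather than a real gseKEGG result, and strip_suffix is a hypothetical helper written for this example, not a clusterProfiler function. Assuming your object looks like the one in this thread, the same call could be applied to Mg10k@result$Description.

```r
# Mock descriptions standing in for Mg10k@result$Description (illustrative data only)
descriptions <- c(
  "Ribosome - Salmonella enterica subsp. enterica serovar Typhimurium 14028S",
  "Drug metabolism - cytochrome P450 - Salmonella enterica subsp. enterica serovar Typhimurium 14028S",
  "Flagellar assembly"
)

# The exact text KEGG appends after the pathway name
species_suffix <- " - Salmonella enterica subsp. enterica serovar Typhimurium 14028S"

# Hypothetical helper: remove `suffix` only from entries that end with it,
# leaving any " - " inside the pathway name itself untouched
strip_suffix <- function(x, suffix) {
  has_suffix <- endsWith(x, suffix)
  x[has_suffix] <- substr(x[has_suffix], 1, nchar(x[has_suffix]) - nchar(suffix))
  x
}

cleaned <- strip_suffix(descriptions, species_suffix)
print(cleaned)
```

Matching the literal suffix, rather than a pattern such as everything after the first " - ", avoids truncating pathway names that contain " - " themselves.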
That is very helpful, thanks! One related question - I was getting around this by manually adding y-axis labels [e.g., scale_y_discrete(labels = c(...))] and in that context it provides the labels as single lines of text, rather than the default which is to wrap the text. I find that the wrapping makes plot formatting very awkward because the word crowding is difficult to read. Is there a way to stop the text wrapping?
AFAIK it is not possible to stop the text wrapping. Yet, by setting the argument label_format (default value = 30) you can set the number of characters after which text wrapping should occur.
A simple way of not having text wrapping could be something like:
n.char <- max(nchar(as.data.frame(pMg10k)$Description))
emapplot(pMg10k, showCategory = 10, label_format = n.char)
Works perfectly, thank you!
Hi,
I'm trying to run an old chunk of code for a clusterProfiler analysis, but with the new package version, the names of differentially expressed pathways include the entire species and strain name for my organism from the KEGG database. For an example, see the attached emap plot. Is there a way to either indicate to gseKEGG to only include the pathway name, or, alternatively, edit the titles in the gseKEGG object so that the plots do not include all of this additional text?
My code is below:
Rplot01.pdf