Open gwangjinkim opened 6 years ago
Dear Guangchuang,
Thank you for clusterProfiler! It is great work!
To my issue above: I could help myself out with a dirty hack:
####################################################
# to make possible `simplify` on gseaResult object
# I just inserted following code into the script:
# And then, `simplify()` can be applied on `gseaResult` objects of GO GSEA analyses (`gseGO()`).
library(magrittr) # provides %<>%, a pipe (%>%) that assigns the result back with `<-`
setMethod("simplify", signature(x="gseaResult"),
          function(x, cutoff=0.7, by="p.adjust", select_fun=min, measure="Wang", semData=NULL) {
              # gseaResult stores the ontology in the `setType` slot
              if (!x@setType %in% c("BP", "MF", "CC"))
                  stop("simplify only applied to output from gseGO and enrichGO...")
              x@result %<>% simplify_internal(., cutoff, by, select_fun,
                                              measure, x@setType, semData)
              return(x)
          }
          )
# from:
# https://github.com/GuangchuangYu/clusterProfiler/blob/master/R/simplify.R
# I added `packagename::` in front of some function names, since this code is outside the package
# and does not "see" some of the packages/package functions imported
# to the package environment
# but basically this is the unchanged `simplify_internal` function definition.
simplify_internal <- function(res, cutoff=0.7, by="p.adjust", select_fun=min, measure="Rel", ontology, semData) {
    if (missing(semData) || is.null(semData)) {
        if (measure == "Wang") {
            semData <- GOSemSim::godata(ont = ontology)
        } else {
            stop("godata should be provided for IC-based methods...")
        }
    } else {
        if (ontology != semData@ont) {
            msg <- paste("semData is for", semData@ont, "ontology, while enrichment result is for", ontology)
            stop(msg)
        }
    }
    sim <- GOSemSim::mgoSim(res$ID, res$ID,
                            semData = semData,
                            measure=measure,
                            combine=NULL)
    ## to satisfy codetools for calling gather
    go1 <- go2 <- similarity <- NULL
    sim.df <- as.data.frame(sim)
    sim.df$go1 <- row.names(sim.df)
    sim.df <- tidyr::gather(sim.df, go2, similarity, -go1)
    sim.df <- sim.df[!is.na(sim.df$similarity),]
    ## feature 'by' is attached to 'go1'
    sim.df <- merge(sim.df, res[, c("ID", by)], by.x="go1", by.y="ID")
    sim.df$go2 <- as.character(sim.df$go2)
    ID <- res$ID
    GO_to_remove <- character()
    for (i in seq_along(ID)) {
        ii <- which(sim.df$go2 == ID[i] & sim.df$similarity > cutoff)
        ## if length(ii) == 1, then go1 == go2
        if (length(ii) < 2)
            next
        sim_subset <- sim.df[ii,]
        jj <- which(sim_subset[, by] == select_fun(sim_subset[, by]))
        ## sim.df <- sim.df[-ii[-jj]]
        GO_to_remove <- c(GO_to_remove, sim_subset$go1[-jj]) %>% unique
    }
    res[!res$ID %in% GO_to_remove, ]
}
########################################
After that, it is possible to call simplify(gseGO.result, cutoff = 0.7, by = "p.adjust", select_fun = min).
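For readers who want to see what the pruning loop in simplify_internal does without installing GOSemSim, here is a minimal base-R sketch. It assumes a ready-made, symmetric similarity matrix instead of calling mgoSim(); the function name prune_similar, the term IDs T1 to T4 and all numbers are made up for illustration. As in the loop above, every group of terms whose mutual similarity exceeds the cutoff keeps only the member selected by select_fun on the `by` column.

```r
# Toy version of the redundancy filter; prune_similar is a hypothetical helper,
# not a clusterProfiler function.
prune_similar <- function(res, sim, cutoff = 0.7, by = "p.adjust", select_fun = min) {
  ids <- res$ID
  to_remove <- character()
  for (id in ids) {
    # Terms whose similarity to `id` exceeds the cutoff (always includes `id` itself).
    near <- ids[sim[ids, id] > cutoff]
    if (length(near) < 2) next
    # Look up the ranking column for those terms and keep the selected one(s).
    scores <- res[match(near, res$ID), by]
    keep <- near[scores == select_fun(scores)]
    to_remove <- union(to_remove, setdiff(near, keep))
  }
  res[!res$ID %in% to_remove, , drop = FALSE]
}

# Made-up enrichment table: four hypothetical terms with adjusted p-values.
res <- data.frame(
  ID = c("T1", "T2", "T3", "T4"),
  p.adjust = c(0.001, 0.020, 0.030, 0.004),
  stringsAsFactors = FALSE
)

# Made-up similarity matrix: T1/T2 are similar, and T3/T4 are similar.
sim <- matrix(
  c(1.0, 0.9, 0.1, 0.2,
    0.9, 1.0, 0.1, 0.2,
    0.1, 0.1, 1.0, 0.8,
    0.2, 0.2, 0.8, 1.0),
  nrow = 4, byrow = TRUE, dimnames = list(res$ID, res$ID)
)

kept <- prune_similar(res, sim)
print(kept$ID)
```

With these numbers T2 loses to T1 and T3 loses to T4, because the default select_fun = min keeps the smaller adjusted p-value in each similar pair.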
thanks for your effort. Will look into it.
Welcome!
And thanks for your effort to create clusterProfiler! And ChIPseeker and so many other super-useful repositories! Amazing!!
I still have a lot to learn in R and in bioinformatics in general (I switched from the wet lab to bioinformatics just 3 years ago). What kind of projects are you working on nowadays? Could I contribute to something? I have realized that I need a mentor.
Best, Gwang-Jin
thanks @gwangjinkim.
You are welcome to contribute to my github repos.
Thank you! Sure!
A question about the simplify() function in clusterProfiler:
I found a way to use GO.db to check whether a GO term is terminal or not:
https://support.bioconductor.org/p/35789/
I see that the core of simplify() is the mgoSim() function:
https://github.com/GuangchuangYu/clusterProfiler/blob/master/R/simplify.R
However, I see that it defaults to organism="human".
If I enter mouse GO IDs, will it simplify for "human"?
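The terminal-term check mentioned above can be sketched in base R. This is only an illustration: it assumes a child map shaped like the list that as.list(GO.db::GOBPCHILDREN) returns (one entry per term, NA for a term without children), and the IDs GO:A to GO:D as well as the helper is_terminal are hypothetical.

```r
# Hypothetical child map in the assumed shape of as.list(GO.db::GOBPCHILDREN):
# one entry per term, NA when the term has no children.
children <- list(
  "GO:A" = c("GO:B", "GO:C"),
  "GO:B" = "GO:D",
  "GO:C" = NA,
  "GO:D" = NA
)

# A term counts as terminal (a leaf) when its entry holds no child IDs.
is_terminal <- function(ids, children) {
  # Refuse unknown IDs instead of silently reporting them as leaves.
  stopifnot(all(ids %in% names(children)))
  vapply(children[ids], function(kids) all(is.na(kids)), logical(1))
}

leaf_flags <- is_terminal(c("GO:A", "GO:C", "GO:D"), children)
print(leaf_flags)
```

With the real package one would presumably build `children` from the map for the ontology in question (BP, MF or CC) rather than typing it by hand.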
Sorry for my ignorance, but are mouse GO IDs basically/thematically the same as human ones, so that the same GO ID "means" the same GO-annotated function in mouse and in human? Or does the same number refer to different terms in different species? I am trying to find an answer on Google, but it takes time ...
Is there any progress on this issue? I tried out the 'hack', but it no longer seems to work with the newest version :(
Sigh.
> class(x)
"enrichResult"
> x <- pairwise_termsim(x)
> simplify(x)
Error in .local(x, ...) :
simplify only applied to output from gsegO and enrichGO...
Currently, simplify() is not implemented for gseaResult objects (the output of gseGO()). Is there a good theoretical reason not to allow simplification of this result? Otherwise I would love to see it implemented for gseaResult objects ...
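To illustrate what "not implemented for a class" means in S4 terms, here is a small self-contained sketch. The classes toyEnrichResult and toyGseaResult and the generic toy_simplify are hypothetical stand-ins, not the real clusterProfiler definitions; the sketch only shows that a call fails until a method is registered for the second class, and works afterwards.

```r
library(methods)

# Hypothetical stand-ins for the two result classes.
setClass("toyEnrichResult", representation(result = "data.frame", ontology = "character"))
setClass("toyGseaResult", representation(result = "data.frame", setType = "character"))

setGeneric("toy_simplify", function(x, ...) standardGeneric("toy_simplify"))

# At first, only the enrichment class has a method.
setMethod("toy_simplify", "toyEnrichResult", function(x, ...) {
  if (!x@ontology %in% c("BP", "MF", "CC")) stop("unsupported ontology: ", x@ontology)
  x
})

gsea <- new("toyGseaResult", result = data.frame(ID = "T1"), setType = "BP")

# Dispatch fails because no method is registered for toyGseaResult.
before <- tryCatch(toy_simplify(gsea), error = function(e) "no method")

# Registering a method for the second class makes the same call succeed.
setMethod("toy_simplify", "toyGseaResult", function(x, ...) {
  if (!x@setType %in% c("BP", "MF", "CC")) stop("unsupported setType: ", x@setType)
  x
})
after <- toy_simplify(gsea)
```

This mirrors the shape of the workaround posted earlier in the thread: the generic stays untouched and a second method is added for the other result class.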