waldronlab / GSEABenchmarkeR

Reproducible GSEA Benchmarking

Naming user-defined functions in console/file output #11

Closed lgeistlinger closed 4 years ago

lgeistlinger commented 4 years ago

The return value of runEA lists all the other methods as $padog, $ora, and so on, but for my method it uses $'function(se,gs){ […] ))}', i.e. my complete literal function. Is there some way to specify what my method should be called, including in the data that is written to file?

lgeistlinger commented 4 years ago

Hi @AnnikaGable

Technically, mixing character scalars (such as "ora") and user-defined functions in the methods argument of runEA causes some difficulties.

I therefore assumed the input to be either a set of pre-defined methods or a single user-defined function, with readResults used afterwards to merge the results from different runs.

(Note that these considerations also apply here: https://github.com/waldronlab/GSEABenchmarkeR/issues/9)

That said, I understand that this is not necessarily the most convenient setup from a user perspective. I have thus extended and formalized the methods argument in the latest version of GSEABenchmarkeR (v1.7.4), which can now take one of the three forms listed as options further down.

Let's consider this example as input:

# preparing two datasets from the GEO2KEGG compendium
geo2kegg <- loadEData("geo2kegg", nr.datasets = 2, cache = FALSE)
geo2kegg <- maPreproc(geo2kegg)
geo2kegg <- runDE(geo2kegg)

# getting human KEGG gene sets
kegg.gs <- EnrichmentBrowser::getGenesets(org = "hsa", db = "kegg")

First option: a character vector of predefined methods (as before):

# applying two methods to two datasets
res <- runEA(geo2kegg, methods = c("ora", "camera"), gs = kegg.gs,
             perm = 0, save2file = TRUE, out.dir = "~/test")

Second option: a user-defined function (as before)

# applying a user-defined enrichment method:
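dummySBEA <- function(se, gs)
{
    # draw 5 "significant" p-values and length(gs) - 5 non-significant ones
    sig.ps <- sample(seq(0, 0.05, length.out = 1000), 5)
    nsig.ps <- sample(seq(0.1, 1, length.out = 1000), length(gs) - 5)
    # shuffle them and assign one p-value to each gene set
    ps <- sample(c(sig.ps, nsig.ps), length(gs))
    names(ps) <- names(gs)
    return(ps)
}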

res <- runEA(geo2kegg, methods = dummySBEA, gs = kegg.gs,
             save2file = TRUE, out.dir = "~/test")

Third option: a named list containing pre-defined and/or user-defined enrichment methods (new):

# applying a mix of pre-defined and user-defined methods
methods <- list(camera = "camera", dummySBEA = dummySBEA)
res <- runEA(geo2kegg, methods, gs = kegg.gs,
             save2file = TRUE, out.dir = "~/test")

Note that the names of the list determine the name of the method in the result object as well as the result files. Let me know whether this is what you had in mind. Thanks!
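To make the naming rule concrete, here is a minimal base-R sketch of how the three input forms could be reduced to one named list whose names serve as output labels. The helper asNamedMethods is hypothetical and is not part of GSEABenchmarkeR; the fallback label "method" for a bare function is an assumption taken from the discussion in this thread.

```r
# Hypothetical helper (NOT part of GSEABenchmarkeR): reduces the three
# accepted forms of `methods` to a named list; the names are the labels
# that would appear in the result object and the result files.
asNamedMethods <- function(methods)
{
    # a single bare function: no name available, use a generic label
    if (is.function(methods)) return(list(method = methods))

    # a character vector of pre-defined methods: each name labels itself
    if (is.character(methods)) return(setNames(as.list(methods), methods))

    # a list: it must be fully named, and the names are used as-is
    stopifnot(is.list(methods),
              !is.null(names(methods)),
              all(nzchar(names(methods))))
    methods
}

# placeholder user-defined method, for illustration only
myMethod <- function(se, gs) NULL

names(asNamedMethods(c("ora", "camera")))
names(asNamedMethods(myMethod))
names(asNamedMethods(list(camera = "camera", dummySBEA = myMethod)))
```

The three calls yield the labels "ora" and "camera", then "method", then "camera" and "dummySBEA". A list without names is rejected outright, since a function on its own carries no label.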

annikagable commented 4 years ago

Yes, that would be a solution! Do I understand correctly that in the second option, the user-defined function would still be called "method" in the console/file output? In this case, option 2 could be omitted/merged with option 3.

lgeistlinger commented 4 years ago

Option 2 is mainly there for backward compatibility. You are right that option 3 largely subsumes option 2, since it also lets you provide just a single user-defined function in the same way:

runEA(exp.list, methods = list(my.method.name = myMethod), ...)

However, for a single function it can be convenient to provide the function directly, without wrapping it in a list. The method is then generically named method in the output, as there is no reliable way to obtain the method name from a function itself. This is unambiguous anyway, since only one method was provided.
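The point about not being able to recover a name from a function can be illustrated in base R. The sketch below uses hypothetical helpers (guessName and wrapper are not GSEABenchmarkeR functions) and the common deparse(substitute(...)) idiom. That idiom only sees the expression written at the immediate call site, which is presumably why the literal function text appeared as the label in the original report.

```r
# Hypothetical helper: tries to recover a label from the expression
# that was passed as `f` at the call site.
guessName <- function(f) paste(deparse(substitute(f)), collapse = " ")

# placeholder user-defined method, for illustration only
myMethod <- function(se, gs) NULL

# works when a named function is passed directly
guessName(myMethod)                # "myMethod"

# an anonymous function yields its literal source text as the "name"
guessName(function(se, gs) NULL)

# one level of indirection loses the name: the label becomes the name
# of the wrapper's own argument
wrapper <- function(methods) guessName(methods)
wrapper(myMethod)                  # "methods"
```

A named list avoids all three cases, because the label is stated explicitly instead of being inferred.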