waldronlab / GSEABenchmarkeR

Reproducible GSEA Benchmarking

Passing on additional arguments to user-defined methods #9

Closed lgeistlinger closed 4 years ago

lgeistlinger commented 4 years ago

The perm parameter and any other parameters that my method may require are not being passed from runEA to my method, so at the moment I have to make a wrapper function like so:

my_method <- function(se, gs){
  alpha <- 1
  perm <- 1000
  number_of_threads <- 1
  return(my_actual_method(se, gs, alpha=alpha, perm=perm, number_of_threads = number_of_threads))
}

lgeistlinger commented 4 years ago

I think that's a good workaround, but note that you can pass on additional arguments via the ... argument of runEA - which in turn passes arguments on to your method via the ... argument of EnrichmentBrowser::sbea (the underlying workhorse function of runEA).

However, this requires that the arguments of your function do not conflict with the arguments of runEA or sbea (which includes alpha and perm).

That means if you define your function via:

my_method <- function(se, gs, malpha, mperm, nr.threads)
{
# do something
}

you can plug it in via:

runEA(exp.list, my_method, malpha = 0.05, mperm = 100, nr.threads = 10)
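
To illustrate the forwarding described above, here is a minimal sketch in plain R. The functions toy_runEA and toy_sbea are hypothetical stand-ins, not the real runEA or sbea; they only mimic the idea of two layers that each pass their ... on. The last call shows what can go wrong when a method's argument name collides with one the inner layer already owns (here perm): in this toy setup the inner layer swallows it and the method never sees it.

```r
# Toy stand-ins (hypothetical), NOT the real runEA / sbea.
my_method <- function(se, gs, malpha, mperm, nr.threads) {
  # Return the received arguments so the forwarding can be inspected.
  list(malpha = malpha, mperm = mperm, nr.threads = nr.threads)
}

# Inner layer: owns 'alpha' and 'perm', forwards everything else via '...'.
toy_sbea <- function(method, se, gs, alpha = 0.05, perm = 1000, ...) {
  method(se, gs, ...)
}

# Outer layer: iterates over datasets and forwards its '...' unchanged.
toy_runEA <- function(exp.list, method, gs = NULL, ...) {
  lapply(exp.list, function(se) toy_sbea(method, se, gs, ...))
}

res <- toy_runEA(list(d1 = 1, d2 = 2), my_method,
                 malpha = 0.05, mperm = 100, nr.threads = 10)
str(res[[1]])

# Name clash: 'perm' is consumed by toy_sbea and never reaches the method.
conflicting <- function(se, gs, perm) perm
out <- tryCatch(
  toy_runEA(list(d1 = 1), conflicting, perm = 100),
  error = function(e) conditionMessage(e)
)
print(out)
```

Prefixing method-specific arguments (malpha, mperm) as suggested above avoids this kind of clash.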

annikagable commented 4 years ago

Ok, thanks for the explanation! However, relying on additional arguments means that I can never run my method together with the already implemented methods, which I would otherwise like to do for easy parallelization.

The reason is that I would get an error for passing additional arguments:

res <- runEA(GEO2KEGG[1:3], methods = c("ora", my_method, "gsea"), gs = kegg.gs, 
              save2file = F,  perm = c(0, 0, 1000),  mperm = 1000)
ora could not be evaluated on GSE1297
Error in .ora(1, se, cmat, perm, alpha, padj.method, ...) : 
  unused argument (mperm = 1000)
...

Also, perm has to have one item corresponding to each method, so I have to pass a placeholder permutation value (in this case, the second item of perm) which will not do anything, plus the mperm parameter, like I did above. This behavior is a little convoluted and not obvious from the documentation so far.
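
The error above can be reproduced in isolation. The following is a minimal sketch with hypothetical toy functions (toy_ora, toy_user, forward_all), not the package's internals: an extra argument forwarded through ... works for a function that declares it and fails for one with a fixed signature.

```r
# Hypothetical stand-ins: a method with a fixed signature and a
# user method that expects one extra argument.
toy_ora  <- function(se, gs, perm, alpha) "ora ranking"
toy_user <- function(se, gs, mperm) paste("user ranking, mperm =", mperm)

# Forwards every extra argument to whichever method it is given.
forward_all <- function(method, se, gs, ...) method(se, gs, ...)

# Works: toy_user declares 'mperm'.
ok <- forward_all(toy_user, se = 1, gs = 2, mperm = 1000)

# Fails: toy_ora has no 'mperm' and no '...', so R rejects the call.
err <- tryCatch(
  forward_all(toy_ora, se = 1, gs = 2, perm = 0, alpha = 0.05, mperm = 1000),
  error = function(e) conditionMessage(e)
)
print(ok)
print(err)
```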

lgeistlinger commented 4 years ago

As explained on a related issue (https://github.com/waldronlab/GSEABenchmarkeR/issues/11#issuecomment-582075482), runEA did not support such mixing of pre-defined and user-defined methods until now.

In light of the related issue https://github.com/waldronlab/GSEABenchmarkeR/issues/11, I just added support for providing a mix of pre- and user-defined methods, and I'll have to do some more testing to see what the implications are for additional arguments to user-defined functions.

A dedicated section in the vignette on how to plug in user-defined methods is in preparation and will be added once I have a good solution for passing on arguments in such a mixed case. Stay tuned.

annikagable commented 4 years ago

Thanks! I think not supporting mixing of pre-defined and user-defined methods is also valid, but then this should be clearly documented, and an error should be raised if a user tries to do that. Whatever you decide on, the same principle should apply to evalTypeIError(), which currently does not let me mix pre-defined and user-defined methods.

lgeistlinger commented 4 years ago

Correct, these changes to runEA need to also be ported to evalTypeIError.

lgeistlinger commented 4 years ago

Hi @AnnikaGable - thanks again for your input on these matters!

  1. The changes for runEA described here https://github.com/waldronlab/GSEABenchmarkeR/issues/11#issuecomment-582075482 have now also been ported to evalTypeIError.

Thus you can now also choose from the three options for the methods argument (a character vector of pre-defined methods, a user-defined function, or a named list containing a mix of both).

An example:

# applying a mix of pre-defined and user-defined methods
methods <- list(camera = "camera", my.method.name = myMethod)
res <- evalTypeIError(methods, geo2kegg[1:2], gs = kegg.gs,
                        save2file = TRUE, out.dir = "~/test", tI.perm = 3)

This has been clarified in the man pages of runEA and evalTypeIError (see methods argument and examples section).
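
As a rough illustration of how such a mixed list can be handled, here is a minimal sketch in plain R. The builtin registry and resolve_method are hypothetical names invented for this example and do not reflect how the package resolves methods internally; the point is only that each list entry is either a name to look up or a function to call directly.

```r
# Hypothetical registry of pre-defined methods (toy functions only).
builtin <- list(
  camera = function(se, gs) "camera ranking",
  ora    = function(se, gs) "ora ranking"
)

# Turn one entry of the 'methods' list into a callable function.
resolve_method <- function(m) {
  if (is.function(m)) return(m)
  if (is.character(m) && length(m) == 1 && m %in% names(builtin)) {
    return(builtin[[m]])
  }
  stop("unsupported method specification")
}

myMethod <- function(se, gs) "my ranking"
methods <- list(camera = "camera", my.method.name = myMethod)

# Apply every method to one toy dataset; list names label the results.
res <- lapply(methods, function(m) resolve_method(m)(se = 1, gs = 2))
str(res)
```

Naming the list entries is what gives a user-defined function a label in the output, since a function object carries no name of its own.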

lgeistlinger commented 4 years ago

  1. I did a bit of a larger re-factoring to support additional arguments to individual methods.

The basic idea is that the arguments provided to runEA will now be matched to the arguments of individual methods (including user-defined functions). Note that runEA is just an iterator over datasets and methods, delegating to EnrichmentBrowser::sbea which does the actual computation for one dataset and one method at a time. More specifically, we thus match arguments of individual methods with the actual arguments provided to EnrichmentBrowser::sbea via runEA.

Therefore, we now distinguish between

  1. mandatory reserved arguments - which a method must have (se and gs),
  2. optional reserved arguments - which a method can have (perm and alpha),
  3. additional arguments - any further arguments a method might need.

Here's an example:

  1. Define your method
    myMethod <- function(se,             # required
                         gs,             # required
                         perm,           # optional, will be provided via runEA / sbea's "perm" argument
                         my.method.arg1, # optional, will be provided via runEA / sbea's "..." arguments
                         my.method.arg2  # same as for my.method.arg1
                        )
    {
        # compute and return a gene set ranking as before
    }
  2. Execute your method with other pre-defined methods
    res <- runEA(exp.list = geo2kegg,
                 methods = list(ora = "ora", my.method.name = myMethod, gsea = "gsea"),
                 gs = kegg.gs,
                 perm = c(0, 100, 50),
                 my.method.arg1 = TRUE,
                 my.method.arg2 = 5)

This will carry out ORA with 0 permutations, your method with 100 permutations, and GSEA with 50 permutations.
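
The matching idea can be sketched in a few lines of plain R. This is an illustration under stated assumptions, not the actual implementation in runEA or sbea: match_args, run_mixed, and toy_ora are hypothetical names, and the sketch assumes perm holds exactly one value per method.

```r
# Keep only those supplied arguments that the method actually declares.
match_args <- function(method, supplied) {
  wanted <- names(formals(method))
  if ("..." %in% wanted) return(supplied)
  supplied[intersect(names(supplied), wanted)]
}

# Toy methods: one with reserved arguments only, one with extra arguments.
toy_ora <- function(se, gs, perm, alpha) paste("ora, perm =", perm)
myMethod <- function(se, gs, perm, my.method.arg1, my.method.arg2) {
  paste("myMethod, perm =", perm, "arg2 =", my.method.arg2)
}

# Hypothetical iterator: the i-th entry of 'perm' goes to the i-th method.
run_mixed <- function(methods, se, gs, perm, alpha = 0.05, ...) {
  extra <- list(...)
  out <- lapply(seq_along(methods), function(i) {
    supplied <- c(list(se = se, gs = gs, perm = perm[i], alpha = alpha), extra)
    do.call(methods[[i]], match_args(methods[[i]], supplied))
  })
  setNames(out, names(methods))
}

res <- run_mixed(list(ora = toy_ora, my.method.name = myMethod),
                 se = 1, gs = 2, perm = c(0, 100),
                 my.method.arg1 = TRUE, my.method.arg2 = 5)
str(res)
```

Because each method only receives the arguments it declares, toy_ora is never handed my.method.arg1 or my.method.arg2, which is what avoids the "unused argument" error reported earlier in this thread.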

lgeistlinger commented 4 years ago

If you want to test your method or certain argument configurations first, it is typically a good idea to inspect the underlying unit (= EnrichmentBrowser::sbea on a single SummarizedExperiment):

library(EnrichmentBrowser)

# carry out your method
mres <- sbea(myMethod,
             se = geo2kegg[[1]],
             gs = kegg.gs,
             perm = 100,
             my.method.arg1 = TRUE,
             my.method.arg2 = 5)

# inspect the resulting gene set ranking
gsRanking(mres, signif.only = FALSE)

lgeistlinger commented 4 years ago

To try out these changes you will need

EnrichmentBrowser v2.17.5

and

GSEABenchmarkeR v1.7.6

Both are now available in the devel branch of Bioconductor (3.11). If you have a separate R-devel (https://cran.r-project.org/src/base-prerelease/) installation, you can install them via

pkgs <- c("EnrichmentBrowser", "GSEABenchmarkeR")
BiocManager::install(pkgs, version = "3.11")

If you prefer working within the release version of R / Bioconductor, then you'll need to install directly from GitHub via:

BiocManager::install("lgeistlinger/EnrichmentBrowser")
BiocManager::install("waldronlab/GSEABenchmarkeR")
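
After installing, it may be worth confirming that the versions named above are the ones actually in use. A minimal sketch follows; has_min_version is a hypothetical helper written for this example, built only on base R's system.file and utils::packageVersion.

```r
# TRUE if 'pkg' is installed at version 'min_version' or later.
# Hypothetical helper; does not load the package.
has_min_version <- function(pkg, min_version) {
  nzchar(system.file(package = pkg)) &&
    utils::packageVersion(pkg) >= min_version
}

# Minimum versions as stated in this thread.
needed <- c(EnrichmentBrowser = "2.17.5", GSEABenchmarkeR = "1.7.6")
status <- vapply(names(needed),
                 function(p) has_min_version(p, needed[[p]]),
                 logical(1))
print(status)
```

A FALSE entry means the package is either missing or older than required.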

Hope that helps?

annikagable commented 4 years ago

Those sound like really great changes which should really improve user-friendliness! I was able to completely benchmark my method already with the existing version, but I'm happy to try the new runEA as well!

UPDATE: Tried it, and everything works as expected so far, both for runEA and evalTypeIError.