lgeistlinger / EnrichmentBrowser

Seamless navigation through combined results of set-based and network-based enrichment analysis

Request: ability to change output directory and .html file name #8

Closed jdrnevich closed 6 years ago

jdrnevich commented 6 years ago

Hi Ludwig,

I'd like to leave a formal request to be able to change the output directory and .html file name in the call to eaBrowse() and propagated through to ebrowser(). It's tedious to move the thousands of output files from a single analysis to a new location dedicated to that analysis. Also, the .html file names appear to default to the method chosen (ora or gsea), which means I can't run gsea on kegg pathways and then another class of gene sets like MSigDB on the same analysis without manually going and re-naming the html file before doing the second analysis.

Thanks! Jenny

lgeistlinger commented 6 years ago

Hi Jenny,

The output destination is defined via:

> EnrichmentBrowser::configEBrowser("OUTDIR.DEFAULT")
[1] "/Users/ludwig/Library/Application Support/EnrichmentBrowser/results"

and can be accordingly updated via:

> EnrichmentBrowser::configEBrowser("OUTDIR.DEFAULT", "/my/preferred/outdir")

See also Section 10 in http://www.bioconductor.org/packages/release/bioc/vignettes/EnrichmentBrowser/inst/doc/EnrichmentBrowser.pdf

After running the first analysis, it's a good idea to move the subdirectory reports to another location (mv reports /my/location/gsea_kegg) before running another analysis.
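The move can also be done from within R. The following is a minimal sketch in base R, not part of the EnrichmentBrowser: `archive_reports` is a hypothetical helper, and the directory names are made up for the demonstration, which runs entirely inside a temporary directory.

```r
# Hypothetical helper (not an EnrichmentBrowser function): move the
# 'reports' subdirectory of an output directory to a new location.
archive_reports <- function(out.dir, dest) {
    src <- file.path(out.dir, "reports")
    if (!dir.exists(src)) stop("no reports directory found in: ", out.dir)
    if (file.exists(dest)) stop("destination already exists: ", dest)
    # Create the parent directory of the destination if it is missing
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    # file.rename moves a directory when source and destination
    # are on the same file system
    if (!file.rename(src, dest)) stop("could not move ", src, " to ", dest)
    invisible(dest)
}

# Demonstration with a fake report in a temporary directory
base <- tempfile("eb_demo_")
out.dir <- file.path(base, "results")
dir.create(file.path(out.dir, "reports"), recursive = TRUE)
writeLines("<html></html>", file.path(out.dir, "reports", "gsea.html"))

archived <- archive_reports(out.dir, file.path(base, "archive", "gsea_kegg"))
list.files(archived)
```

Because the helper refuses to overwrite an existing destination, a second analysis cannot silently replace an archived one.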

Does this help? I could also add timestamps for each analysis ...

jdrnevich commented 6 years ago

I guess I hadn’t made it to section 10 in the vignette and it wasn’t in your BioC2018 workshop. Using configuration parameters is just different from many other functions that I am used to, where you can change settings on the fly in the specific function call, which also provides good documentation and tracking when a client asks me about a specific file I made for them at some point in the past! Here are a few specific suggested improvements, from easiest to hardest, that wouldn’t require changing your whole config parameters system:

  1. In the help page to ebrowser() and eaBrowse(), the Value section should be changed to something like: The main html report and associated files are written to configEBrowser("OUTDIR.DEFAULT"). See ?configEBrowser to change the location. If browse= TRUE [html.only = FALSE for eaBrowse!], the html file will automatically be opened. I would have noticed this here, and the output location should be documented somewhere other than the function message.
  2. I would lobby for OUTDIR.DEFAULT to point to the current working directory /results, as that seems more standard
  3. You could allow optional arguments of out.dir = NULL and shortName = NULL in eaBrowse (and have ebrowser pass them through to eaBrowse if given), which would only need a couple of changes to the code at the beginning:

if (is.null(out.dir)) out.dir <- configEBrowser("OUTDIR.DEFAULT")
if (is.null(shortName)) shortName <- method

and then a couple of changes from method to shortName where needed. I can easily see two:

htmlRep <- ReportingTools::HTMLReport(shortName = shortName,
    title = paste(toupper(method), configEBrowser("RESULT.TITLE"), sep = " - "),
    basePath = out.dir, reportDirectory = "reports")

and

message(paste0("HTML report: ", shortName, ".html"))

  4. Having a reports subdirectory inside a results directory seems unduly nested but there might be a good reason for it?
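The NULL-default pattern from suggestion 3 can be tried in isolation. This is only a sketch in base R: `default_outdir` stands in for configEBrowser("OUTDIR.DEFAULT"), `resolve_report_target` is a hypothetical name, and the resulting file path assumes the report is written to basePath/reportDirectory/shortName.html, as the HTMLReport call above suggests.

```r
# Stand-in for configEBrowser("OUTDIR.DEFAULT"); hypothetical
default_outdir <- function() file.path(tempdir(), "results")

# Hypothetical helper: work out where the html report would be written.
# Arguments left as NULL fall back to the package-wide default directory
# and to the method name, respectively.
resolve_report_target <- function(method, out.dir = NULL, shortName = NULL) {
    if (is.null(out.dir)) out.dir <- default_outdir()
    if (is.null(shortName)) shortName <- method
    list(out.dir = out.dir,
         html = file.path(out.dir, "reports", paste0(shortName, ".html")))
}

# Defaults: the report is named after the method
t1 <- resolve_report_target("gsea")

# Overrides: a second gsea run gets its own directory and file name
t2 <- resolve_report_target("gsea", out.dir = "~/proj", shortName = "gsea_msigdb")
```

With the overrides, two gsea runs on different gene set collections no longer compete for the same gsea.html.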

Cheers!

lgeistlinger commented 6 years ago

Fair suggestions.

  1. Included in the help pages as suggested.

  2. I've recently changed the default output destination from the cwd to rappdirs::user_data_dir("EnrichmentBrowser").

Although the cwd is an intuitive choice (and I've heard other users lobby for it as well), there are some good reasons for using this designated output location instead:

a) When R is run not from a terminal but via a third-party application (such as RGui, RStudio, ...), it might be less clear to the user what the actual cwd is.
b) When users (accidentally or not) run R within an installation directory (/bin and the like), they risk overwriting installation files, which can cause all kinds of trouble.
c) The developer community widely agrees on rappdirs::user_data_dir as the default output destination for R packages.

  3. Going to include.

  4. Different functions of the EnrichmentBrowser are writing different types of results and the reports directory is reserved for html reports. For example, downloadPathways writes pathway files downloaded from KEGG, and compileGRN writes the constructed gene regulatory network to this default output destination. However, this is becoming more and more a legacy from earlier versions of the package. Looking into that.

In general, configEBrowser works like par to customize your plotting device (or like options to configure your R working environment). It makes it relatively convenient to deal with package-wide constants that affect more than one function, of which there are several in the EnrichmentBrowser, for example the various standardized column names. However, for ebrowser and eaBrowse, I agree that you want to be able to override this directly in the function call.
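To illustrate the par/options analogy, here is a minimal sketch of such a getter/setter in base R. It is not how configEBrowser is implemented: `make_config` and `config` are hypothetical names, the stored values are placeholders, and only the key names OUTDIR.DEFAULT and RESULT.TITLE are taken from this thread.

```r
# Hypothetical options-like store: one function that reads a key when
# called with one argument and updates it when called with two.
make_config <- function(defaults) {
    store <- defaults
    function(key, value = NULL) {
        if (!key %in% names(store)) stop("unknown config key: ", key)
        if (is.null(value)) return(store[[key]])
        old <- store[[key]]
        store[[key]] <<- value
        # Like par() and options(), return the previous value invisibly
        invisible(old)
    }
}

# Placeholder defaults, not the package's real values
config <- make_config(list(OUTDIR.DEFAULT = "/default/outdir",
                           RESULT.TITLE = "placeholder title"))

before <- config("OUTDIR.DEFAULT")
old <- config("OUTDIR.DEFAULT", "/my/preferred/outdir")
during <- config("OUTDIR.DEFAULT")
config("OUTDIR.DEFAULT", old)   # restore the previous setting
after <- config("OUTDIR.DEFAULT")
```

Returning the old value makes the save-and-restore idiom possible, but it also shows why a per-call argument is easier to track: the setting lives in the call itself rather than in session state.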

lgeistlinger commented 6 years ago

In the new devel version of the EnrichmentBrowser (v2.11.12):

http://www.bioconductor.org/packages/devel/bioc/html/EnrichmentBrowser.html

the functions eaBrowse and ebrowser now accordingly take two additional arguments:

  1. out.dir
  2. report.name

for example:

ebrowser(
    meth="ora", perm=0, exprs = allSE,
    gs=kegg.gs, org="hsa", nr.show=3,
    out.dir="~/oraReport", report.name="myReport") 

Is that what you were looking for?