Open heathergeiger opened 3 years ago
Nothing too complicated. When de.method="wilcox" or "t", the package uses scran's functions to perform the pairwise t-tests or Wilcoxon tests in an efficient manner; so to use that functionality, you'll need scran installed, as the error message suggests. It's not installed by default to keep SingleR's dependencies low, given that the default method on the default bulk references doesn't require scran.
So just BiocManager::install('scran') and you'll be good to go.
Perhaps we should add an if (!require("scran")) { stop('scran package is required for de.method="wilcox" or "t"') }? I believe this require() conditional method is the recommendation from Bioconductor's developer guidelines, but I'm curious about your thoughts, @LTLA, as there is the downside of then loading that entire package in all de.method="wilcox" or "t" cases!
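As a rough sketch of what such a guard could look like, the snippet below checks for scran only when the method actually needs it. The function name choose_de is hypothetical and merely stands in for the real SingleR internals; it uses requireNamespace() rather than require() so that the package is not attached to the search path.

```r
# Hypothetical stand-in for a function with an optional dependency;
# this is NOT SingleR's actual code.
choose_de <- function(de.method = c("classic", "wilcox", "t")) {
    de.method <- match.arg(de.method)
    needs_scran <- de.method %in% c("wilcox", "t")
    # requireNamespace() loads the namespace without attaching the package.
    if (needs_scran && !requireNamespace("scran", quietly = TRUE)) {
        stop(
            "scran package is required for de.method=\"wilcox\" or \"t\"; ",
            "install it with BiocManager::install('scran')",
            call. = FALSE
        )
    }
    de.method
}

# The default method never touches scran, so this works without it installed.
choose_de("classic")
```

With this shape, users of the default method pay no cost, and the error only appears on the code path that needs the extra package.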
Hm. Traditionally I have always considered the error message out of :: to be satisfactory. Also it was a pain to have to write these protective clauses every time I used a Suggested package.
The best of both worlds would be to write a little getter function along the lines of:
checkForPackage <- function(pkg) {
    if (!requireNamespace(pkg, quietly=TRUE)) {
        # Perhaps have some smarter checks about whether something is
        # a Bioconductor package, but we could also just trust the developer here.
        stop(pkg, " is not installed, run BiocManager::install('", pkg, "')")
    }
}
which avoids the need to write all this crap every time we use :: for a Suggested package. This also avoids attaching packages on the search path, only loading their namespaces instead.
Would be nice if we can get it to live in some core package, then I could use it for all my packages.
Such a base function sounds pretty good to me! Would allow me to remove the 5 similar, though each manually made more specific, functions from dittoSeq.
Such a function could potentially also take in multiple pkgs for cases when 2 or more are actually needed for the specific action.
Also, yes, I forgot about it but totally meant requireNamespace()!
Probably a useful utility, although it sort of seems like one is patching an imperfect error message, with a better solution being a better error message?
One thing about the above is that it doesn't distinguish between types of errors (e.g., when a package fails to load because the installation has become corrupted somehow). One could be more clever, since the error is actually classed:
> x = tryCatch(foo::bar(), error = identity)
> x
<packageNotFoundError in loadNamespace(x): there is no package called 'foo'>
So something like
tryCatch({
    foo::bar()
}, packageNotFoundError = function(e) {
    # the classed condition carries the name of the missing package
    pkg <- e$package
    stop(
        "package '", pkg, "' not found; ",
        'install with `BiocManager::install("', pkg, '")`',
        call. = FALSE
    )
})
which also works for loadNamespace("foo") but not requireNamespace("foo").
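The difference between the two calls can be seen in a small sketch. The package name below is a made-up placeholder assumed not to be installed, and the packageNotFoundError class is only available in reasonably recent versions of R.

```r
# Placeholder name, assumed not to exist in any library.
pkg <- "aPackageThatIsNotInstalled"

# loadNamespace() signals a classed condition, so the handler is reached.
res_load <- tryCatch(
    loadNamespace(pkg),
    packageNotFoundError = function(e) paste0("missing: ", e$package)
)

# requireNamespace() catches the failure internally and returns FALSE,
# so the handler here is never invoked.
res_require <- tryCatch(
    requireNamespace(pkg, quietly = TRUE),
    packageNotFoundError = function(e) "handler reached"
)

print(res_load)
print(res_require)
```

So a helper built on requireNamespace() has to test the returned logical, whereas one built on loadNamespace() or :: can dispatch on the condition class.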
Candidate locations are in BiocManager or maybe BiocGenerics; it's currently unusual for a package to Depend: or Import: BiocManager.
BiocManager seems like the best place for this to live. The package has minimal dependencies and it must be installed by default before SingleR anyway, so I wouldn't consider it a real +1 to my dependency count.
But BiocManager is really for managing installations. Personally, I don't have BiocManager loaded when I do analysis, but I would want this fix to be available in that case.
Having a set of utility functions for dealing with Suggested packages seems worthwhile. I know this is suggesting something slightly different from what is being suggested here.
So basically the proposal is to replace the "there is no package called ‘scran’" error message with the more user-friendly "you don't have package 'scran'; install it with blah blah".
Personally I think that the specific error message suggested by @dtm2451 (scran package is required for de.method="wilcox" or "t") still has more value because it explains why the package is suddenly needed. It's always a little bit of an annoyance to discover that you miss a package in the middle of an analysis so it's nice to understand why this happens.
Any thoughts @hpages on a home for this? I'm not sure, as Kasper notes, that BiocManager is the right place for it.
I sometimes have tasks requiring multiple suggested packages, so would definitely vote for something which can check a set of packages. Perhaps the algorithm framework could be something like this:
suggested_pkgs_check <- function(pkgs, fxnality_message = "this functionality") {
    pkgs_missing <- vapply(
        pkgs, function(pkg) {
            # A `requireNamespace` check is used here; Martin's `tryCatch`
            # suggestion, modified to allow multiple packages, is the alternative.
            # output: a logical for each pkg of whether it is missing (TRUE) vs available (FALSE)
            !requireNamespace(pkg, quietly = TRUE)
        }, FUN.VALUE = logical(1)
    )
    if (any(pkgs_missing)) {
        stop(
            "Package(s) ", paste0(pkgs[pkgs_missing], collapse = ", "),
            " unavailable, but required for ", fxnality_message,
            ". Install with `BiocManager::install(c('",
            paste0(pkgs[pkgs_missing], collapse = "', '"),
            "'))`.",
            call. = FALSE
        )
    }
}
Then, 1) multiple packages could be checked (so users don't install a single package and start rerunning their pipeline, only to get an error at the same point due to a different package needed for the same step!) and 2) my specific message suggestion can be accommodated (yet even if a developer doesn't bother to add their own custom fxnality_message here, the idea that new packages are needed for the specific, currently requested, functionality is still given!).
I wonder if we need to distinguish between reasons that a function may be inaccessible? The path forward if a package has become corrupted is still to reinstall, no?
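If the distinction did turn out to matter, one possible sketch is below. The helper name pkg_status is hypothetical, and it assumes that any load failure other than packageNotFoundError indicates a broken installation, which may be too coarse in practice.

```r
# Hypothetical helper: classify each package as "ok", "missing", or "broken".
pkg_status <- function(pkgs) {
    vapply(pkgs, function(pkg) {
        tryCatch(
            {
                loadNamespace(pkg)
                "ok"
            },
            # Listed first so it takes precedence over the generic handler.
            packageNotFoundError = function(e) "missing",
            # Assumption: any other error means the installation is damaged.
            error = function(e) "broken"
        )
    }, FUN.VALUE = character(1))
}

# "stats" ships with R; the second name is a placeholder assumed absent.
status <- pkg_status(c("stats", "aPackageThatIsNotInstalled"))
print(status)
```

Both "missing" and "broken" would still lead to the same advice to reinstall, which is the point made above; the separate labels would mainly help in wording the message.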
Re the home for this function: I don't have anything to add that hasn't already been said.
It's about installing missing packages (and the error message explicitly instructs the user to use BiocManager to do so), which makes BiocManager kind of a natural place for it.
I am currently trying to run SingleR vs. the counts and labels available here:
http://geschwindlab.dgsom.ucla.edu/pages/codexviewer
Here is my code to get a log-normalized expression matrix and labels in the appropriate format for SingleR.
I then ran SingleR like so, where "norm_counts" is the result of running GetAssayData with slot="data" on the Seurat object containing the test data.
But I am getting the following error: "Error in loadNamespace(name) : there is no package called ‘scran’".
Any idea what is going on here? My sessionInfo() result is below. SingleR worked fine with a bulk reference, so the issue appears to be specific to when I use a single-cell reference with the appropriate change to "de.method".