Hi @hpages, thanks for the detailed diagnostics. The mysterious "Error: processing vignette 'accessing_ensembl.Rmd' failed with diagnostics: object 'ensembl' not found" from the biomaRt build report wasn't much help to me!
Having two entries for ensembl-marts-109 is not expected. The code in .listEnsembl() is supposed to check whether the cache exists and either use it if it's "fresh" or remove and regenerate the cached version if it's "stale".
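For readers following along, the fresh-or-stale logic described above could look roughly like the sketch below. This is not biomaRt's actual code: the helper names `is_fresh()` and `read_or_refresh()`, the use of the file modification time, and the 7-day limit are all assumptions made for illustration.

```r
# Hypothetical helper: a cached file counts as "fresh" if it exists and was
# modified less than `max_age_days` ago. The 7-day default is an assumption.
is_fresh <- function(path, max_age_days = 7) {
    if (!file.exists(path)) return(FALSE)
    age <- difftime(Sys.time(), file.mtime(path), units = "days")
    as.numeric(age) < max_age_days
}

# Use the cached value when it is fresh; otherwise remove the stale file,
# regenerate the value with `generate()` and cache it again.
read_or_refresh <- function(path, generate, max_age_days = 7) {
    if (is_fresh(path, max_age_days)) return(readRDS(path))
    unlink(path)
    value <- generate()
    saveRDS(value, path)
    value
}
```

Note that this only reasons about a single file on disk. It says nothing about how many records the cache index holds for that name, which is where the problem in this issue shows up.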
I notice that the two entries were generated only 1 second apart. Is it possible that two separate processes are calling .listEnsembl() with the second happening before the cache entry has been generated by the first? Or maybe the transient Ensembl flakiness has introduced some invalid state?
I'll add some error handling to check for the multiple entry case and simply delete everything. I don't think there's any real drawback to doing that, but it'd be nice to know how we ended up with two entries in the first place.
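A minimal sketch of that kind of check, assuming the cache is queried through BiocFileCache, might look like this. The helper name `get_single_entry()` is hypothetical and this is not the actual patch; `bfcquery()`, `bfcremove()` and `bfcnew()` are the BiocFileCache functions as I understand them.

```r
library(BiocFileCache)

# Hypothetical helper, not biomaRt's actual code: return the single cache
# record for `rname`, or NULL if there is none. If several records share the
# name, treat that as an invalid state, delete them all and return NULL so
# that the caller regenerates the entry.
get_single_entry <- function(bfc, rname) {
    hits <- bfcquery(bfc, rname, field = "rname", exact = TRUE)
    if (nrow(hits) > 1L) {
        bfcremove(bfc, hits$rid)
        return(NULL)
    }
    if (nrow(hits) == 0L) return(NULL)
    hits
}

# Usage on a throwaway cache with a deliberately duplicated record name.
bfc <- BiocFileCache(tempfile(), ask = FALSE)
saveRDS("first", bfcnew(bfc, "ensembl-marts-109"))
saveRDS("second", bfcnew(bfc, "ensembl-marts-109"))
entry <- get_single_entry(bfc, "ensembl-marts-109")
```

Deleting every duplicate rather than trying to pick a winner keeps the logic simple: the only cost is one extra regeneration of the cached object.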
Is it possible that two separate processes are calling .listEnsembl() with the second happening before the cache entry has been generated by the first?
Of course. During the daily builds, each build machine runs dozens of R CMD build or R CMD check commands in parallel. This could also happen on a user's machine, although it is less likely there. I remember seeing some discussion about BiocFileCache and concurrent write access to the cache, but I don't know the details. For example, https://github.com/Bioconductor/BiocFileCache/issues/42 could be relevant. What we observe here could just be a manifestation of that.
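To illustrate the race being described, here is one generic way to serialize "check, then create" across processes using a lock file. This is only a sketch under assumptions: it relies on the CRAN package filelock, the wrapper `with_cache_lock()` is hypothetical, and nothing here reflects what biomaRt or BiocFileCache actually do internally.

```r
library(filelock)

# Hypothetical wrapper: create the cached file at `path` with `generate()`
# only if it does not exist yet, holding an exclusive lock on a sibling
# ".lock" file so that concurrent processes cannot both create it.
with_cache_lock <- function(path, generate) {
    lck <- lock(paste0(path, ".lock"), exclusive = TRUE, timeout = Inf)
    on.exit(unlock(lck), add = TRUE)
    # Check under the lock: another process may have finished in the meantime.
    if (!file.exists(path)) {
        saveRDS(generate(), path)
    }
    readRDS(path)
}
```

The important detail is that the existence check happens after the lock is acquired. Checking first and locking afterwards would leave the same window open in which two processes both decide the entry is missing.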
I'll add some error handling to check for the multiple entry case and simply delete everything
Sounds good, thanks!
It looks like my patch has at least resolved this specific issue (I'm going to assume that every package on the Linux builder reporting warnings isn't my fault!)
Thanks Mike!
I'm going to assume that every package on the Linux builder reporting warnings isn't my fault!
This was an error, not a warning.
@grimbough Hi Mike,
biomaRt:::.listEnsembl() fails at the moment on Bioconductor builder nebbiolo2, with the following error:
This breaks listEnsembl() and useEnsembl(), which both use biomaRt:::.listEnsembl() internally. See for example the current CHECK error for GenomicFeatures on the BioC 3.16 daily builds.
This error seems to be caused by the fact that biomaRt:::.listEnsembl() obtains cache_entry with:
which can sometimes return more than one cache entry, like on nebbiolo2 where it currently returns the 2 following entries:
I don't know if having more than 1 entry for ensembl-marts-109 is a "normal state" for the cache. If not, then maybe the issue needs to be reported to the BiocFileCache maintainers (@lshep). But if it is, then it seems that biomaRt:::.listEnsembl() would need to be modified to handle this situation.
@jwokaty @lshep Let's not flush biomaRt's cache on nebbiolo2 (located at ~biocbuild/.cache/biomaRt) until this is sorted out.
Thanks all!
H.
sessionInfo():