Error: package 'GenomeInfoDb' could not be loaded

lisafournier commented 1 year ago

Hello,

I would like to run the scUTRquant pipeline on my scRNA-seq data. Before starting with my own data, I decided to familiarise myself with your examples (I am trying "neuron_1k_v3_bam").

However, I am getting a package error: it seems that the Bioconductor package GenomeInfoDbData cannot be installed. Here is the error I'm getting:

[INFO] Loading sample data...
[INFO] Loaded 1 samples.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job stats:
job                  count    min threads    max threads
all                      1              1              1
mtxs_to_sce_genes        1              1              1
mtxs_to_sce_txs          1              1              1
total                    3              1              1

Select jobs to execute...

[Tue Oct 11 14:29:02 2022]
rule mtxs_to_sce_txs:
    input: data/kallisto/utrome_mm10_v2/neuron_1k_v3_bam/txs.barcodes.txt, data/kallisto/utrome_mm10_v2/neuron_1k_v3_bam/txs.genes.txt, data/kallisto/utrome_mm10_v2/neuron_1k_v3_bam/txs.mtx, extdata/targets/utrome_mm10_v2/utrome.e30.t5.gc25.pas3.f0.9999.w500.gtf, extdata/targets/utrome_mm10_v2/utrome_txs_annotation.Rds, examples/neuron_1k_v3_bam/annots.csv
    output: data/sce/utrome_mm10_v2/neuron_1k_v3_bam.txs.Rds
    jobid: 8
    reason: Missing output files: data/sce/utrome_mm10_v2/neuron_1k_v3_bam.txs.Rds
    wildcards: target=utrome_mm10_v2
    resources: tmpdir=/tmp, mem_mb=16000

Activating conda environment: .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:dplyr’:

combine, intersect, setdiff, union

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which.max, which.min

Loading required package: IRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

first, rename

The following objects are masked from ‘package:base’:

expand.grid, I, unname

Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

collapse, desc, slice

Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Error: package or namespace load failed for ‘GenomeInfoDb’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 there is no package called ‘GenomeInfoDbData’
Error: package ‘GenomeInfoDb’ could not be loaded
Execution halted

I then tried to install GenomeInfoDbData manually into the environment .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_ with conda install -c bioconda bioconductor-genomeinfodbdata. The installation appears to succeed, but the package is not visible in the library folder of the environment. Rerunning the pipeline with snakemake --use-conda --configfile examples/neuron_1k_v3_bam/config.yaml then yields the same error.
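For reference, the "not visible in the library folder" check can be scripted. This is a minimal sketch, assuming the environment keeps its R packages under lib/R/library (the location mentioned later in this thread); the function name r_pkg_in_env is hypothetical.

```shell
# Hypothetical helper: report whether an R package is present in a conda env.
# Usage: r_pkg_in_env ENV_PREFIX PACKAGE
r_pkg_in_env() {
    env_prefix=$1
    pkg=$2
    # An installed R package directory contains a DESCRIPTION file.
    if [ -f "$env_prefix/lib/R/library/$pkg/DESCRIPTION" ]; then
        echo "found"
    else
        echo "missing"
    fi
}

# Example call, using the environment path from the log above:
# r_pkg_in_env .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_ GenomeInfoDbData
```

If this prints "missing" right after a seemingly successful conda install, the package was probably installed into a different environment or its installation step failed silently.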

Do you know how I could fix this issue?

Thank you very much, Lisa

mfansler commented 1 year ago

@lisafournier sorry you are encountering this issue. I'm happy to help troubleshoot it, but could use some additional info to assess what might be the cause. Could you please add the following details:

- Conda/Mamba and OS info (e.g., output of mamba info)
- List of environment packages with versions (e.g., mamba env export -p .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_)
- Snakemake version
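
The items listed above could be collected in one go. The following is only a sketch: the helper names run_if_present and collect_env_report are hypothetical, and it assumes a Linux system with /etc/os-release. Replace mamba with conda if Mamba is not installed.

```shell
# Hypothetical helper: run a command under a labelled header if its tool exists.
run_if_present() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command exited with an error)"
    else
        echo "($1 not found on PATH)"
    fi
    echo
}

# Hypothetical wrapper: gather the requested details for one environment prefix.
collect_env_report() {
    env_prefix=$1
    run_if_present mamba info
    run_if_present mamba env export -p "$env_prefix"
    run_if_present snakemake --version
    # OS details, where the file is available.
    if [ -r /etc/os-release ]; then
        echo "== /etc/os-release =="
        cat /etc/os-release
    fi
}

# Example call:
# collect_env_report .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_ > report.txt
```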

lisafournier commented 1 year ago

Thank you very much for your rapid answer!

- OS info:
  PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
  NAME="Debian GNU/Linux"
  VERSION_ID="11"
  VERSION="11 (bullseye)"
  VERSION_CODENAME=bullseye
  ID=debian

- Mamba info:
  conda version : 4.12.0
  conda-build version : not installed
  python version : 3.9.12.final.0
  virtual packages : linux=5.10.0=0
                     glibc=2.31=0
                     unix=0=0
                     archspec=1=x86_64
  base environment : /temp/lfournier/miniconda3 (writable)
  conda av data dir : /temp/lfournier/miniconda3/etc/conda
  conda av metadata url : None
  channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                 https://repo.anaconda.com/pkgs/main/noarch
                 https://repo.anaconda.com/pkgs/r/linux-64
                 https://repo.anaconda.com/pkgs/r/noarch
  package cache : /temp/lfournier/miniconda3/pkgs
                  /home/lfournier/.conda/pkgs
  envs directories : /temp/lfournier/miniconda3/envs
                     /home/lfournier/.conda/envs
  platform : linux-64
  user-agent : conda/4.12.0 requests/2.28.1 CPython/3.9.12 Linux/5.10.0-18-amd64 debian/11 glibc/2.31
  netrc file : None
  offline mode : False

- Environment packages for env .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_: packages.txt
- Snakemake version: 7.15.2

mfansler commented 1 year ago

Recreating the environment in the Mambaforge Docker image, I get the same versions and builds of the Bioconductor packages, but I do not encounter any issue loading GenomeInfoDb.

% docker run --rm -it --platform linux/amd64 condaforge/mambaforge:latest
(base) root@c5970fc06dbd:/# mamba env create -f 'https://github.com/Mayrlab/scUTRquant/raw/main/envs/bioconductor-sce.yaml'
(base) root@c5970fc06dbd:/# conda activate bioconductor-sce
(bioconductor-sce) root@c5970fc06dbd:/# R
> library(GenomeInfoDb)
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS/LAPACK: /opt/conda/envs/bioconductor-sce/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GenomeInfoDb_1.30.0 IRanges_2.28.0      S4Vectors_0.32.4   
[4] BiocGenerics_0.40.0

loaded via a namespace (and not attached):
[1] compiler_4.1.3         GenomeInfoDbData_1.2.7 RCurl_1.98-1.9        
[4] bitops_1.0-7

Could you please try recreating the environment? That is, remove the folder and YAML file that Snakemake generated. Something like

rm -rf .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_
rm .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_.yaml

Then try running the pipeline again, which should recreate the environment. I believe the GenomeInfoDbData package downloads its data during installation, and perhaps there was a glitch in that step.

I also reran my pipeline demo without issue, and it pulled the latest Snakemake (v7.15.2), so the Snakemake version doesn't seem to be the problem.

The only other thing that stands out at the moment is that your Miniconda installation is located under /temp. On some systems, /temp is backed by RAM rather than a physical disk, which might (?) pose problems for Conda when it tries to create hardlinks. But that's a guess. You could try setting conda config --set always_copy true and recreating the environment to see if it makes a difference. (Unless it does, make sure to change it back to false; otherwise you'll be wasting disk space.)
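
The hardlink guess above can be probed directly before changing any Conda setting. This is a rough sketch with a hypothetical helper name, can_hardlink; it only tests whether a hardlink can be created from one directory into another, and says nothing about what Conda itself does when linking fails.

```shell
# Hypothetical helper: test whether a hardlink can be made from SRC_DIR into DST_DIR.
# Prints "yes" or "no"; temporary probe files are removed afterwards.
can_hardlink() {
    src_dir=$1
    dst_dir=$2
    probe="$src_dir/.hardlink_probe_$$"
    link="$dst_dir/.hardlink_probe_link_$$"
    # Create an empty probe file in the source directory.
    : > "$probe" || { echo "cannot write to $src_dir"; return 2; }
    if ln "$probe" "$link" 2>/dev/null; then
        echo "yes"
        status=0
    else
        echo "no"
        status=1
    fi
    rm -f "$probe" "$link"
    return $status
}

# Example call, using the package cache from the mamba info output
# and the Snakemake environment directory:
# can_hardlink /temp/lfournier/miniconda3/pkgs .snakemake/conda
```

A "no" here would mean the package cache and the environment directory cannot share hardlinks (for example, because they sit on different filesystems), which is the situation the always_copy setting is meant to sidestep.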

Let me know how it goes.

lisafournier commented 1 year ago

Thank you for your suggestions. None of them worked. However, I succeeded in running the pipeline on another machine. So I manually copied the missing package GenomeInfoDbData into scUTRquant/.snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_/lib/R/library, and then the pipeline ran. I don't understand why my machine was not able to install it on its own, but in any case the problem is not coming from your side.
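
For anyone landing here with the same problem, the manual copy described above could look roughly like this. It is a sketch only: copy_r_pkg is a hypothetical helper, and it assumes the working environment has already been made reachable on the local filesystem (for example, copied over or mounted) and was built from the same environment YAML, so the R and package versions match.

```shell
# Hypothetical helper: copy one R package from a working env into a broken one.
# Usage: copy_r_pkg WORKING_ENV_PREFIX BROKEN_ENV_PREFIX PACKAGE
copy_r_pkg() {
    src_prefix=$1
    dst_prefix=$2
    pkg=$3
    src="$src_prefix/lib/R/library/$pkg"
    dst_lib="$dst_prefix/lib/R/library"
    # Refuse to continue if either side is not where we expect it.
    [ -d "$src" ] || { echo "source package not found: $src" >&2; return 1; }
    [ -d "$dst_lib" ] || { echo "destination library not found: $dst_lib" >&2; return 1; }
    cp -R "$src" "$dst_lib/"
}

# Example call (the first path is a placeholder for the working environment):
# copy_r_pkg /path/to/working/env .snakemake/conda/c59c6ad43352b6feaff96c1ce2893fcf_ GenomeInfoDbData
```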

I'll close the issue, and thank you again for your precious help!

Lisa

Mayrlab / scUTRquant

Error: package 'GenomeInfoDb' could not be loaded #61