bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Error in RNA-Seq pipeline #3608

Closed: kokyriakidis closed this issue 2 years ago

kokyriakidis commented 2 years ago

Hi!

Q1: Is there a reason that bcbio still uses salmon v1.4.0?

[2022-01-21T01:55Z] Version Info: ### PLEASE UPGRADE SALMON ###
[2022-01-21T01:55Z] ### A newer version of salmon with important bug fixes and improvements is available. ####
[2022-01-21T01:55Z] ###
[2022-01-21T01:55Z] The newest version, available at https://github.com/COMBINE-lab/salmon/releases
[2022-01-21T01:55Z] contains new features, improvements, and bug fixes; please upgrade at your
[2022-01-21T01:55Z] earliest convenience.
[2022-01-21T01:55Z] ###
[2022-01-21T01:55Z] Sign up for the salmon mailing list to hear about new versions, features and updates at:
[2022-01-21T01:55Z] https://oceangenomics.com/subscribe
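
One way to confirm which salmon version a given bcbio installation actually ships is to run that binary with `-v` (the flag used later in this thread) and parse the result. The sketch below assumes the output looks like `salmon 1.6.0`; the helper names `parse_salmon_version` and `salmon_version` are hypothetical and not part of bcbio.

```python
import re
import subprocess


def parse_salmon_version(text):
    """Extract (major, minor, patch) from output such as 'salmon 1.6.0'.

    Returns None when no version string is found.
    """
    match = re.search(r"salmon\s+(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def salmon_version(salmon_path):
    """Run `<salmon_path> -v` and parse the version it reports.

    Returns None if the binary cannot be executed or prints no version.
    """
    try:
        result = subprocess.run(
            [salmon_path, "-v"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Covers a missing or non-executable binary.
        return None
    # Check both streams, since tools differ in where they print the version.
    return parse_salmon_version(result.stdout + result.stderr)
```

Passing the full path (for example the `anaconda/bin/salmon` inside the bcbio install directory) rather than relying on `PATH` avoids accidentally querying a different salmon installed elsewhere on the system.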

Q2: I get the following error:

[2022-01-21T02:00Z] Timing: estimate expression (single threaded)
[2022-01-21T02:00Z] Combining count files into /media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/htseq-count/combined.counts.
[2022-01-21T02:01Z] tx2gene file /media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/inputs/transcriptome/hg38-tx2gene.csv created from /media/kokyriakidis/RED/BCBIO/RESOURCES/bcbio/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf.
[2022-01-21T02:01Z] Combining tx2gene CSV files.
[2022-01-21T02:01Z] Loading tximport.
[2022-01-21T02:01Z] Error in library(tidyverse) : there is no package called ‘tidyverse’
[2022-01-21T02:01Z] Execution halted
[2022-01-21T02:01Z] Uncaught exception occurred
...
subprocess.CalledProcessError: Command '/media/kokyriakidis/RED/BCBIO/RESOURCES/bcbio/anaconda/bin/Rscript --vanilla -e library(tidyverse);salmon_files = list.files("/media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/salmon", pattern="quant.sf", recursive=TRUE, full.names=TRUE);tx2gene = readr::read_csv("/media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/inputs/transcriptome/tx2gene.csv", col_names=c("transcript", "gene")); samples = basename(dirname(salmon_files));names(salmon_files) = samples;txi = tximport::tximport(salmon_files, type="salmon", tx2gene=tx2gene, countsFromAbundance="lengthScaledTPM", dropInfReps=TRUE);readr::write_csv(round(txi$counts) %>% as.data.frame() %>% tibble::rownames_to_column("gene"), "/media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/bcbiotx/tmph9l25_kn/tximport-counts.csv");readr::write_csv(txi$abundance %>% as.data.frame() %>% tibble::rownames_to_column("gene"), "/media/kokyriakidis/PRODUCTION/EDITING/bcbio/AD/work/bcbiotx/tmpimej1vy0/tximport-tpm.csv");
Error in library(tidyverse) : there is no package called ‘tidyverse’
Execution halted
' returned non-zero exit status 1.

I do not know whether there is a problem with my installation or whether tidyverse is missing from the installed dependencies.
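
To separate these two possibilities, it can help to ask the same Rscript that bcbio calls whether it can find the package at all. This is a minimal sketch, assuming the Rscript path is the one shown in the traceback above; `r_package_check_cmd` and `has_r_package` are hypothetical helper names, and the R expression relies only on base R's `requireNamespace` and `quit`.

```python
import subprocess


def r_package_check_cmd(rscript_path, package):
    """Build an Rscript command that exits 0 if `package` is installed, else 1."""
    expr = (
        'quit(status = if (requireNamespace("%s", quietly = TRUE)) 0 else 1)'
        % package
    )
    # --vanilla matches how bcbio invokes Rscript in the log above.
    return [rscript_path, "--vanilla", "-e", expr]


def has_r_package(rscript_path, package):
    """Return True only if the given Rscript can find the R package."""
    try:
        result = subprocess.run(
            r_package_check_cmd(rscript_path, package),
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Rscript itself is missing or not executable.
        return False
    return result.returncode == 0
```

For example, `has_r_package("/media/kokyriakidis/RED/BCBIO/RESOURCES/bcbio/anaconda/bin/Rscript", "tidyverse")` returning False would point to the package being absent from that environment, not to a problem in the pipeline code.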

I then tried a clean install, but I get this error:

# Installing into conda environment default: age-metasv, atropos, bamtools, bamutil, bbmap, bcftools=1.13, bedops, bio-vcf, biobambam, bowtie, break-point-inspector, bwa, cage, cnvkit, coincbc, cramtools, deeptools, express, fastp, fastqc, geneimpacts, genesplicer, gffcompare, goleft, grabix, gsort, gvcfgenotyper, h5py=3.3, hdf5=1.10, hisat2, hmmlearn, htseq, impute2, kallisto=0.46, kraken, ldc, macs2, maxentscan, mbuffer, minimap2, mintmap, mirdeep2, mirtop, moreutils, multiqc, multiqc-bcbio, ngs-disambiguate, novoalign, oncofuse, pandoc, parallel, pbgzip, peddy, pizzly, pythonpy, qsignature, rapmap, rtg-tools, sailfish, salmon, samblaster, scalpel, seq2c<2016, seqbuster, seqcluster, seqtk, sickle-trim, simple_sv_annotation, singlecell-barcodes, snap-aligner=1.0dev.97, snpeff=5.0, solvebio, spades, star=2.6.1d, stringtie, subread, survivor, tdrmapper, tophat-recondition, trim-galore, ucsc-bedgraphtobigwig, ucsc-bedtobigbed, ucsc-bigbedinfo, ucsc-bigbedsummary, ucsc-bigbedtobed, ucsc-bigwiginfo, ucsc-bigwigsummary, ucsc-bigwigtobedgraph, ucsc-bigwigtowig, ucsc-fatotwobit, ucsc-gtftogenepred, ucsc-liftover, ucsc-wigtobigwig, umis, vardict-java, vardict<=2015, variantbam, varscan, vcfanno, viennarna, vqsr_cnn, wham, ipyparallel=6.3.0, ipython-cluster-helper=0.6.4=py_0, ipython=7.29.0, ipython_genutils=0.2.0=py37_0, traitlets=4.3.3, anaconda-client, awscli, bzip2, ncurses, nodejs, p7zip, readline, s3gof3r, xz, perl-app-cpanminus, perl-archive-extract, perl-archive-zip, perl-bio-db-sam, perl-cgi, perl-dbi, perl-encode-locale, perl-file-fetch, perl-file-sharedir, perl-file-sharedir-install, perl-ipc-system-simple, perl-lwp-protocol-https, perl-lwp-simple, perl-sanger-cgp-battenberg, perl-statistics-descriptive, perl-time-hires, perl-vcftools-vcf, bioconductor-annotate, bioconductor-apeglm, bioconductor-biocgenerics, bioconductor-biocinstaller, bioconductor-biocstyle, bioconductor-biostrings, bioconductor-biovizbase, bioconductor-bsgenome.hsapiens.ucsc.hg19, 
bioconductor-bsgenome.hsapiens.ucsc.hg38, bioconductor-bubbletree, bioconductor-cn.mops, bioconductor-copynumber, bioconductor-degreport, bioconductor-deseq2, bioconductor-dexseq, bioconductor-dnacopy, bioconductor-genomeinfodb, bioconductor-genomeinfodbdata, bioconductor-genomeinfodbdata, bioconductor-genomicranges, bioconductor-iranges, bioconductor-limma, bioconductor-org.hs.eg.db, bioconductor-purecn>=2.0.1, bioconductor-rhdf5, bioconductor-rtracklayer, bioconductor-rtracklayer, bioconductor-summarizedexperiment, bioconductor-titancna, bioconductor-txdb.hsapiens.ucsc.hg19.knowngene, bioconductor-txdb.hsapiens.ucsc.hg38.knowngene, bioconductor-tximport, bioconductor-vsn, r-base=4.1.1=hb67fd72_0, r-chbutils, r-deconstructsigs, r-devtools, r-dplyr, r-dt, r-ggdendro, r-ggplot2, r-ggrepel, r-gplots, r-gsalib, r-janitor, r-knitr, r-optparse, r-pheatmap, r-plyr, r-pscbs, r-reshape, r-rmarkdown, r-rsqlite, r-sleuth, r-snow, r-stringi, r-tidyverse, r-viridis, r-wasabi, r=4.1=r41hd8ed1ab_1004, xorg-libxt
Encountered problems while solving:
  - package 'ipython_genutils-0.2.0-py37_0' is excluded by strict repo priority
  - package perl-vcftools-vcf-0.1.15-pl526_2 requires perl >=5.26.2,<5.27.0a0, but none of the providers can be installed

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.

It only worked after I ran the installer with sudo, e.g. sudo python3 bcbio_nextgen_install.py /bcbio --tooldir=/bcbio/tools --nodata. After that I could move on.

naumenko-sa commented 2 years ago

1) We are not pinning the salmon version: https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L111. In my fresh installation:

$ which salmon
/n/app/bcbio/1.2.9/anaconda/bin/salmon
$ salmon -v
salmon 1.6.0

2) R in anaconda/bin/R has tidyverse: https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L262

$ which R
/n/app/bcbio/1.2.9/anaconda/bin/R
$ R

R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-conda-linux-gnu (64-bit)

> library(tidyverse)
── tidyverse 1.3.1 ──
✔ ggplot2 3.3.5     ✔ purrr   0.3.4
✔ tibble  3.1.6     ✔ dplyr   1.0.7
✔ tidyr   1.1.4     ✔ stringr 1.4.0
✔ readr   2.1.1     ✔ forcats 0.5.1

I am not sure what exactly is happening in your installation. I can only suspect an incomplete installation or mixed installations.

Sergey
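
A quick way to look for mixed installations is to list every copy of the relevant tools that is visible on `PATH`, in search order. The following is a sketch only; `find_all_on_path` is a hypothetical helper, and the tool names in the usage loop are simply the ones mentioned in this thread.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name` on PATH, in search order.

    `path` defaults to the current PATH environment variable.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        is_executable = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
        if is_executable and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # More than one hit per tool suggests several installations are visible.
    for tool in ("conda", "R", "Rscript", "salmon"):
        print(tool, find_all_on_path(tool))
```

The first entry in each list is the one a plain shell command would run. If it is not the copy under the bcbio `anaconda/bin` directory, the two installations may be interfering with each other.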

kokyriakidis commented 2 years ago

I do not know what is happening. I can only install it using sudo. Otherwise it gets stuck during the solving phase. I have two other conda installations on the system.

Installing it using sudo installed the correct package versions.

Thanks again for looking into it.

naumenko-sa commented 2 years ago

OK, maybe it is system-specific. Did you include --mamba during the installation (it helps with solving)?

kokyriakidis commented 2 years ago

Yes, I have tried both conda and mamba.