hbc / bcbioRNASeq

R package for bcbio RNA-seq analysis.
https://bioinformatics.sph.harvard.edu/bcbioRNASeq
GNU Affero General Public License v3.0

Missing conda packages #175

Closed (kokyriakidis closed this issue 3 years ago)

kokyriakidis commented 3 years ago

Hi @mjsteinbaugh!

I cannot install bcbioRNASeq correctly in a new conda environment. Some packages are not installed at all, which causes errors when other packages are installed or loaded.

I have to manually install all these missing packages in order to install r-bcbiornaseq correctly:

- r-deseqanalysis (could not load library, it wasn't installed)
- r-curl (without it httr, AnnotationHub etc got an error during install)
- boost (needed for rsqlite)
- r-rsqlite
- apeglm (optional)
- rtracklayer (could not import bcbio file after fresh install of bcbiornaseq)
- ggrepel
- pheatmap

Is there a way to add them to the dependency list?

mjsteinbaugh commented 3 years ago

@kokyriakidis Thanks for the input. We can definitely update the recipe to add these.

mjsteinbaugh commented 3 years ago

OK, so the only conda package that should be added appears to be bioconductor-apeglm. I'll add it to the r-deseqanalysis recipe, which is an r-bcbiornaseq dependency.

How are you managing your conda environment? Are you trying to upgrade and/or install packages in an existing environment? That often runs into issues.

Here's what I recommend:

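# Create a fresh environment pinned to one release instead of modifying an existing one.
name='r-bcbiornaseq'
version='0.3.41'
conda create --name="${name}@${version}" "${name}==${version}"
conda activate "${name}@${version}"
# Start R inside the activated environment, then load the package.
R
library(bcbioRNASeq)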

Here's how to check the current recipes:

conda search --info r-basejump
conda search --info r-deseqanalysis
conda search --info r-bcbiornaseq

mjsteinbaugh commented 3 years ago

I'm double checking the dependency recipes to see if there's a fix needed.
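One way to double-check a recipe's declared dependencies is to pull just the `dependencies:` section out of the `conda search --info` output. This is a minimal sketch, not part of conda: the function name `list_deps` is hypothetical, and the parsing assumes the plain-text layout shown in this thread (a `dependencies:` line followed by indented `- name constraint` entries), which other conda versions may print differently.

```shell
#!/usr/bin/env bash
# list_deps: read `conda search --info <pkg>` output on stdin and print
# one dependency per line. Hypothetical helper; assumes the text layout
# quoted in this thread.
list_deps() {
    awk '
        /^dependencies:/ { in_deps = 1; next }   # section header
        in_deps && /^[ \t]*- / {                 # indented "- name spec" entry
            sub(/^[ \t]*- /, "")
            print
            next
        }
        { in_deps = 0 }                          # any other line ends the section
    '
}

# Usage (requires conda on PATH):
#   conda search --info r-bcbiornaseq | list_deps
```

If several builds are returned, the dependencies of each build are printed in turn, so piping through `sort -u` may make the result easier to scan.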

mjsteinbaugh commented 3 years ago

It appears that there's a recipe issue with r-cli==3.0.1. I'm filing an issue with conda-forge.

conda search --info r-cli==3.0.1
r-cli 3.0.1 r41hc72bb7e_0
-------------------------
file name   : r-cli-3.0.1-r41hc72bb7e_0.tar.bz2
name        : r-cli
version     : 3.0.1
build       : r41hc72bb7e_0
build number: 0
size        : 704 KB
license     : MIT
subdir      : noarch
url         : https://conda.anaconda.org/conda-forge/noarch/r-cli-3.0.1-r41hc72bb7e_0.tar.bz2
md5         : e76c1aa510651316b8cc23adff1fd3f5
timestamp   : 2021-07-17 11:04:14 UTC
dependencies:
  - r-assertthat
  - r-base >=4.1,<4.2.0a0
  - r-crayon >=1.3.4
  - r-fansi
  - r-glue

conda activate r-cli@3.0.1
R
library(cli)
## Error: package or namespace load failed for ‘cli’ in library.dynam(lib, package, package.lib):
##  shared object ‘cli.dylib’ not found

mjsteinbaugh commented 3 years ago

I'm also seeing this issue with r-cli==3.0.0, but r-cli==2.5.0 works as expected.
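To narrow down which r-cli builds fail to load, each version can be tested in its own throwaway environment. The following is only a sketch: the helper name `cli_test_commands` is made up, the `r-cli@<version>` environment names are modeled on the commands used earlier in this thread, and the function prints the commands instead of running them so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Print (not run) the commands to check whether each r-cli version loads.
# cli_test_commands is a hypothetical helper; env names follow the
# "<package>@<version>" convention used in this thread.
cli_test_commands() {
    local version env
    for version in "$@"; do
        env="r-cli@${version}"
        printf 'conda create --yes --name="%s" "r-cli==%s"\n' "$env" "$version"
        # `conda run` executes a command inside the named environment.
        printf 'conda run --name="%s" Rscript -e "library(cli)"\n' "$env"
    done
}

cli_test_commands 2.5.0 3.0.0 3.0.1
```

Piping the printed lines to `bash` would execute them; that assumes the required channels (for example conda-forge) are already configured.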

mjsteinbaugh commented 3 years ago

See related https://github.com/conda-forge/r-cli-feedstock/issues/20, which is a known issue on macOS.

kokyriakidis commented 3 years ago

I don't know what happened, but I tried installing it again and everything installed correctly except apeglm.

BUT now I cannot create a DESeqAnalysis object! I get an error:

deseq <- DESeqAnalysis(
    data = dds,
    transform = dt,
    results = res_list_unshrunken,
    lfcShrink = res_list_shrunken
)
Error in validObject(.Object) : invalid class “DESeqAnalysis” object:
[2] isSubset(x = c("geneID", "geneName"), y = names(mcols(rowRanges(data)))) is not TRUE. 'c("geneID", "geneName")' has elements not in 'names(mcols(rowRanges(data)))': geneID
[3] isSubset(x = c("geneID", "geneName"), y = names(mcols(rowRanges(transform)))) is not TRUE. 'c("geneID", "geneName")' has elements not in 'names(mcols(rowRanges(transform)))': geneID
If supported, 'updateObject()' may help resolve these issues.

mjsteinbaugh commented 3 years ago

What does your sessionInfo() look like? Seems like that could be an old version because it's checking for "geneID" rather than "geneId".

kokyriakidis commented 3 years ago

The new installation pulled in DESeqAnalysis 0.3.10, a really old version! I manually installed the newer version, 0.4.2.
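A quick guard against this kind of silent downgrade is to compare the installed version with the minimum you expect before running the analysis. The sketch below is an illustration under assumptions: `version_ge` is a hypothetical helper, and it relies on `sort -V` (version sort), which GNU coreutils provides but which may be missing on other systems.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed if version A >= version B.
# Relies on `sort -V`; the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

installed='0.3.10'   # version reported in this thread
required='0.4.2'     # version that was installed manually afterwards

if version_ge "$installed" "$required"; then
    echo "DESeqAnalysis ${installed} is new enough"
else
    echo "DESeqAnalysis ${installed} is older than ${required}"
fi
```

In practice the installed version could be read from R, for example with `Rscript -e 'cat(as.character(packageVersion("DESeqAnalysis")))'`, rather than hard-coded.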

kokyriakidis commented 3 years ago

Everything works now except the volcano plot labels.

This is the session info after I manually installed DESeqAnalysis and apeglm:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS/LAPACK: /media/kokyriakidis/RED/BCBIO/RESOURCES/bcbio/anaconda/envs/r-bcbiornaseq@0.3.41/lib/libopenblasp-r0.3.15.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=el_GR.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=el_GR.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=el_GR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=el_GR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeqAnalysis_0.4.2         DESeq2_1.32.0               SummarizedExperiment_1.22.0 Biobase_2.52.0             
 [5] MatrixGenerics_1.4.0        matrixStats_0.60.0          GenomicRanges_1.44.0        GenomeInfoDb_1.28.0        
 [9] IRanges_2.26.0              S4Vectors_0.30.0            BiocGenerics_0.38.0         goalie_0.5.2               
[13] bcbioRNASeq_0.3.41          basejump_0.14.19           

loaded via a namespace (and not attached):
  [1] colorspace_2.0-2            rjson_0.2.20                ellipsis_0.3.2              XVector_0.32.0             
  [5] AcidPlots_0.3.7             rstudioapi_0.13             farver_2.1.0                ggrepel_0.9.1              
  [9] bit64_4.0.5                 mvtnorm_1.1-2               AnnotationDbi_1.54.0        fansi_0.4.2                
 [13] apeglm_1.14.0               splines_4.1.0               tximport_1.20.0             cachem_1.0.5               
 [17] geneplotter_1.70.0          knitr_1.33                  AcidExperiment_0.1.12       jsonlite_1.7.2             
 [21] Rsamtools_2.8.0             annotate_1.70.0             png_0.1-7                   pheatmap_1.0.12            
 [25] BiocManager_1.30.16         readr_1.4.0                 compiler_4.1.0              httr_1.4.2                 
 [29] assertthat_0.2.1            Matrix_1.3-4                fastmap_1.1.0               limma_3.48.0               
 [33] cli_3.0.1                   htmltools_0.5.1.1           tools_4.1.0                 coda_0.19-4                
 [37] gtable_0.3.0                glue_1.4.2                  GenomeInfoDbData_1.2.6      dplyr_1.0.7                
 [41] Rcpp_1.0.7                  bbmle_1.0.23.1              vctrs_0.3.8                 AcidGenomes_0.2.14         
 [45] Biostrings_2.60.0           AcidGenerics_0.5.18         rtracklayer_1.52.0          xfun_0.24                  
 [49] stringr_1.4.0               syntactic_0.4.5             lifecycle_1.0.0             restfulr_0.0.13            
 [53] XML_3.99-0.6                AcidCLI_0.1.2               edgeR_3.34.0                zlibbioc_1.38.0            
 [57] MASS_7.3-54                 scales_1.1.1                hms_1.1.0                   RColorBrewer_1.1-2         
 [61] SingleCellExperiment_1.14.1 yaml_2.2.1                  AcidSingleCell_0.1.7        memoise_2.0.0              
 [65] ggplot2_3.3.5               emdbook_1.3.12              bdsmatrix_1.3-4             stringi_1.7.3              
 [69] RSQLite_2.2.5               highr_0.9                   genefilter_1.74.0           BiocIO_1.2.0               
 [73] BiocParallel_1.26.0         rlang_0.4.11                pkgconfig_2.0.3             bitops_1.0-7               
 [77] evaluate_0.14               lattice_0.20-44             purrr_0.3.4                 labeling_0.4.2             
 [81] GenomicAlignments_1.28.0    cowplot_1.1.1               bit_4.0.4                   tidyselect_1.1.1           
 [85] plyr_1.8.6                  magrittr_2.0.1              AcidPlyr_0.1.20             R6_2.5.0                   
 [89] generics_0.1.0              DelayedArray_0.18.0         DBI_1.1.1                   pillar_1.6.1               
 [93] withr_2.4.2                 survival_3.2-11             KEGGREST_1.32.0             RCurl_1.98-1.3             
 [97] tibble_3.1.3                pipette_0.6.2               bcbioBase_0.6.21            crayon_1.4.1               
[101] utf8_1.2.2                  rmarkdown_2.9               locfit_1.5-9.4              grid_4.1.0                 
[105] data.table_1.14.0           blob_1.2.2                  digest_0.6.27               xtable_1.8-4               
[109] AcidBase_0.3.14             numDeriv_2016.8-1.1         munsell_0.5.0               AcidMarkdown_0.1.2         
[113] sessioninfo_1.1.1

kokyriakidis commented 3 years ago

Running

options(
    "repos" = c(
        "CRAN" = "https://cloud.r-project.org",
        "AcidGenomics" = "https://r.acidgenomics.com"
    )
)
BiocManager::valid()

I now get:

Bioconductor version '3.13'

  * 20 packages out-of-date
  * 0 packages too new

create a valid installation with

  BiocManager::install(c(
    "AnnotationDbi", "AnnotationHub", "BiocParallel", "BiocStyle", "biomaRt", "Biostrings", "broom", "clusterProfiler", "DOSE",
    "DropletUtils", "enrichplot", "ensembldb", "fansi", "GenomeInfoDb", "ggtree", "googlesheets4", "limma", "readr", "Rhdf5lib",
    "RSQLite"
  ), update = TRUE, ask = FALSE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

Warning message:
20 packages out-of-date; 0 packages too new

mjsteinbaugh commented 3 years ago

@kokyriakidis This is running inside of conda, correct? It's not a good idea to update R packages manually inside a conda environment. I'm pushing a fix for the volcano plot labels in DESeqAnalysis, which should be online shortly.

kokyriakidis commented 3 years ago

Yes, I activated the r-bcbiornaseq@0.3.41 environment and ran R from there. Then I updated DESeqAnalysis from R. Should I just update DESeqAnalysis from conda instead? Do you have any idea why it installed a really old version?

mjsteinbaugh commented 3 years ago

Basically, when running R inside of conda, you shouldn't update anything manually. The recipe should install the correct dependency versions. Since it's allowing an older version of DESeqAnalysis, I apparently need to tighten up the current recipe, which is defined in the bioconda-recipes repo. Conda is a bit tricky to get working perfectly with R packages.
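Because it is easy to forget which R is on the PATH, a small check before any manual install can help. The sketch below assumes that `conda activate` exports the `CONDA_PREFIX` environment variable for the active environment; the function name `in_conda_env` is made up for this example.

```shell
#!/usr/bin/env bash
# in_conda_env: succeed if a conda environment appears to be active.
# Hypothetical helper; assumes `conda activate` exported CONDA_PREFIX.
in_conda_env() {
    [ -n "${CONDA_PREFIX:-}" ]
}

if in_conda_env; then
    echo "conda env active at ${CONDA_PREFIX}: change packages with conda, not from R"
else
    echo "no conda env detected"
fi
```

The check is only a reminder: it does not stop `install.packages()` or `BiocManager::install()` from being called inside R.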

kokyriakidis commented 3 years ago

OK thanks for the info!

Once you've fixed the recipe version, added the apeglm dependency, and fixed the plot labels, send me a message and I'll reinstall bcbioRNASeq and try running the template again.

mjsteinbaugh commented 3 years ago

Will do, I'm working on that today.

kokyriakidis commented 3 years ago

Can you also add AcidGSEA, since it is missing, and check the Ensembl2Entrez error?

Thanks again for your help!

mjsteinbaugh commented 3 years ago

OK, this conda recipe update should hopefully fix the issues.

kokyriakidis commented 3 years ago

Thanks!

kokyriakidis commented 3 years ago

Everything works perfectly now. All issues have been resolved and every template runs smoothly.

Thank you so much for your support!