Bioconductor / basilisk

Clone of the Bioconductor repository for the basilisk package.
https://bioconductor.org/packages/devel/bioc/html/basilisk.html
GNU General Public License v3.0

`setupBasiliskEnv`: Allowing >/< #17

Closed bschilder closed 2 years ago

bschilder commented 2 years ago

It doesn't look like there are any checks in basilisk::setupBasiliskEnv to handle > or < when specifying versions. I know using = or == is encouraged, but in some situations it may be necessary to use the more flexible >= syntax (e.g. when you don't know the specific versions available for every package on every platform).

Currently, when you do input packages with the >= syntax, setupBasiliskEnv continues but doesn't parse the inputs correctly. https://github.com/LTLA/basilisk/blob/3778f27684497ea99136657a5f7ccd504aa581ed/R/setupBasiliskEnv.R#L71

This results in weird situations where packages can't be found, and/or multiple conflicting versions of python are requested to be installed at the same time (e.g. python>=3.9 and python==3.7.7).

Potential solutions:

  1. Edit setupBasiliskEnv so it correctly parses package versions specified with </>. If developers want, a warning can be displayed to let users know that the "=" syntax is preferred. 👈 (Option 1 would be my preference).
  2. Add a check to detect any </> usage, and if found issue an error that says this syntax is disallowed.
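For illustration, the check described in Option 2 could look roughly like the sketch below. This is not basilisk's actual implementation: `find_unpinned` and `check_pinned` are hypothetical helper names, and the regular expression is deliberately strict (it accepts only `name=version` or `name==version`, so conda build strings such as `name=version=build` would also be rejected).

```r
# Hypothetical helpers (not part of basilisk): flag package strings that are
# not pinned to an exact version with "=" or "==".
find_unpinned <- function(packages) {
    # A pinned spec looks like "name=1.2.3" or "name==1.2.3", with no
    # range operators (<, >, !, ~) anywhere in the string.
    pinned <- grepl("^[^=<>!~ ]+==?[^=<>!~ ]+$", packages)
    packages[!pinned]
}

check_pinned <- function(packages) {
    bad <- find_unpinned(packages)
    if (length(bad) > 0L) {
        # Fail early with a message listing every offending spec.
        stop("versions must be pinned with '=' or '==': ",
             paste(bad, collapse = ", "))
    }
    invisible(packages)
}
```

Calling `check_pinned(c("pandas=1.4.2", "python>=3.9"))` would stop with an error naming `python>=3.9`, instead of letting the environment setup continue with a spec it can't parse.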

Reprex

packages <- c('axel>=2.17.11','bitarray>=2.4.0','fastparquet>=0.8.0','gzip>=1.11','htslib>=1.15','networkx>=2.7.1','pandas-plink>=2.2.9','pip>=22.0.4','pyarrow>=7.0.0','requests>=2.27.1','rpy2>=3.4.5','scikit-learn>=1.0.2','scipy>=1.8.0','tabix>=1.11','tqdm>=4.63.0','wget>=1.20.3','python>=3.9')

envObj <- basilisk::BasiliskEnvironment(
    envname = "test",
    pkgname = "echoconda",
    ## Use >= rather than = or == for increased flexibility,
    ## but reduced reproducibility.
    packages = packages,
    channels = c("conda-forge", "bioconda", "nodefaults")
)
proc <- basilisk::basiliskStart(env = envObj)

Session info

```
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] echoconda_0.99.6

loaded via a namespace (and not attached):
 [1] reticulate_1.25 tidyselect_1.1.2 xfun_0.31 purrr_0.3.4
 [5] lattice_0.20-45 basilisk.utils_1.8.0 vctrs_0.4.1 generics_0.1.2
 [9] testthat_3.1.4 htmltools_0.5.2 yaml_2.3.5 utf8_1.2.2
[13] rlang_1.0.2 R.oo_1.24.0 pillar_1.7.0 glue_1.6.2
[17] withr_2.5.0 DBI_1.1.2 R.utils_2.11.0 lifecycle_1.0.1
[21] stringr_1.4.0 R.methodsS3_1.8.1 evaluate_0.15 knitr_1.39
[25] fastmap_1.1.0 parallel_4.2.0 fansi_1.0.3 Rcpp_1.0.8.3
[29] filelock_1.0.2 desc_1.4.1 pkgload_1.2.4 jsonlite_1.8.0
[33] brio_1.1.3 basilisk_1.8.0 dir.expiry_1.4.0 png_0.1-7
[37] digest_0.6.29 stringi_1.7.6 dplyr_1.0.9 grid_4.2.0
[41] rprojroot_2.0.3 cli_3.3.0 tools_4.2.0 magrittr_2.0.3
[45] tibble_3.1.7 crayon_1.5.1 pkgconfig_2.0.3 ellipsis_0.3.2
[49] Matrix_1.4-1 data.table_1.14.2 assertthat_0.2.1 rmarkdown_2.14
[53] R6_2.5.1 compiler_4.2.0
```

LTLA commented 2 years ago

After some contemplation, I'm afraid to say I had to go with Option 2. Mostly because the entire point of basilisk is to freeze versions and allowing >= specifications would not be consistent with the spirit of the package.

Some random points:

bschilder commented 2 years ago

Thanks for the quick reply @LTLA. I think this approach is quite understandable given the goals of basilisk.

I don't know the versions necessary for each platform, so what I think I'll need to do is spin up virtual machines for each platform, create the env in each of them, export separate yaml files, and then store those yaml files so they can be used by basilisk depending on the platform it's being run on.
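As a rough sketch of the last step, choosing the stored file for the current platform could be done with a small lookup like the one below. The helper name `platform_env_file`, the directory layout, and the file names are all assumptions for illustration; the files themselves might come from something like `conda env export` run on each machine, and how the chosen file is then handed to basilisk is left out here.

```r
# Hypothetical helper: map the current OS to a per-platform environment file.
# The file names below are placeholders, not an established convention.
platform_env_file <- function(dir, sysname = Sys.info()[["sysname"]]) {
    fname <- switch(sysname,
        Darwin = "env-macos.yaml",
        Linux = "env-linux.yaml",
        Windows = "env-windows.yaml",
        # Unnamed final argument is the fallback for any other value.
        stop("unsupported platform: ", sysname)
    )
    file.path(dir, fname)
}
```

For example, `platform_env_file("inst/conda")` on macOS would return the path to `env-macos.yaml` under that directory; passing `sysname` explicitly makes the lookup easy to test on a single machine.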

If I end up writing any functions that can do some or all of these steps I'll be sure to share.