saezlab / liana

LIANA: a LIgand-receptor ANalysis frAmework
https://saezlab.github.io/liana/
GNU General Public License v3.0

Basilisk fails to install #85

Closed rbutleriii closed 1 year ago

rbutleriii commented 1 year ago

Running liana_tensor_c2c gives the following error:

Setting up Conda Environment with Basilisk
Error in basilisk::basiliskStart(liana_env, testload = "scipy.optimize") : 
  unused argument (testload = "scipy.optimize")
Calls: liana_tensor_c2c
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] ExperimentHub_1.16.1        AnnotationHub_2.22.1
 [3] BiocFileCache_1.14.0        dbplyr_2.1.1
 [5] magrittr_2.0.3              reticulate_1.25
 [7] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
 [9] Biobase_2.50.0              GenomicRanges_1.42.0
[11] GenomeInfoDb_1.26.7         IRanges_2.24.1
[13] S4Vectors_0.28.1            BiocGenerics_0.36.1
[15] MatrixGenerics_1.2.1        matrixStats_0.62.0
[17] forcats_0.5.1               stringr_1.4.0
[19] dplyr_1.0.9                 purrr_0.3.4
[21] readr_2.0.1                 tidyr_1.2.0
[23] tibble_3.1.7                ggplot2_3.3.6
[25] tidyverse_1.3.1             patchwork_1.1.1
[27] sp_1.5-0                    SeuratObject_4.1.0
[29] Seurat_4.1.1                data.table_1.14.2
[31] liana_0.1.10

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                    tidyselect_1.1.2
  [3] AnnotationDbi_1.52.0          RSQLite_2.2.9
  [5] htmlwidgets_1.5.4             grid_4.0.0
  [7] BiocParallel_1.28.3           Rtsne_0.16
  [9] munsell_0.5.0                 codetools_0.2-18
 [11] ica_1.0-3                     statmod_1.4.36
 [13] scran_1.18.7                  future_1.26.1
 [15] miniUI_0.1.1.1                withr_2.5.0
 [17] spatstat.random_2.2-0         colorspace_2.0-3
 [19] progressr_0.10.1              filelock_1.0.2
 [21] logger_0.2.1                  knitr_1.37
 [23] rstudioapi_0.13               ROCR_1.0-11
 [25] tensor_1.5                    listenv_0.8.0
 [27] GenomeInfoDbData_1.2.4        polyclip_1.10-0
 [29] bit64_4.0.5                   basilisk_1.2.1
 [31] parallelly_1.32.0             vctrs_0.4.1
 [33] generics_0.1.3                xfun_0.29
 [35] R6_2.5.1                      clue_0.3-59
 [37] rsvd_1.0.5                    locfit_1.5-9.4
 [39] cachem_1.0.6                  bitops_1.0-7
 [41] spatstat.utils_2.3-1          DelayedArray_0.16.3
 [43] assertthat_0.2.1              promises_1.2.0.1
 [45] scales_1.2.0                  rgeos_0.5-9
 [47] gtable_0.3.0                  beachmat_2.6.4
 [49] Cairo_1.5-12.2                globals_0.15.1
 [51] goftest_1.2-3                 rlang_1.0.3
 [53] GlobalOptions_0.1.2           splines_4.0.0
 [55] lazyeval_0.2.2                broom_0.7.9
 [57] spatstat.geom_2.4-0           checkmate_2.0.0
 [59] BiocManager_1.30.16           modelr_0.1.8
 [61] yaml_2.3.5                    reshape2_1.4.4
 [63] abind_1.4-5                   backports_1.4.1
 [65] httpuv_1.6.5                  tools_4.0.0
 [67] ellipsis_0.3.2                spatstat.core_2.4-4
 [69] RColorBrewer_1.1-3            ggridges_0.5.3
 [71] Rcpp_1.0.9                    plyr_1.8.7
 [73] sparseMatrixStats_1.2.1       progress_1.2.2
 [75] zlibbioc_1.36.0               RCurl_1.98-1.5
 [77] basilisk.utils_1.2.2          prettyunits_1.1.1
 [79] rpart_4.1-15                  deldir_1.0-6
 [81] pbapply_1.5-0                 GetoptLong_1.0.5
 [83] cowplot_1.1.1                 zoo_1.8-10
 [85] haven_2.3.1                   ggrepel_0.9.1
 [87] cluster_2.1.2                 fs_1.5.2
 [89] scattermore_0.8               circlize_0.4.13
 [91] reprex_2.0.1                  lmtest_0.9-40
 [93] RANN_2.6.1                    fitdistrplus_1.1-8
 [95] hms_1.1.1                     mime_0.12
 [97] evaluate_0.14                 xtable_1.8-4
 [99] readxl_1.3.1                  gridExtra_2.3
[101] shape_1.4.6                   compiler_4.0.0
[103] KernSmooth_2.23-20            crayon_1.5.1
[105] htmltools_0.5.2               mgcv_1.8-36
[107] later_1.3.0                   tzdb_0.1.2
[109] lubridate_1.7.10              DBI_1.1.2
[111] ComplexHeatmap_2.6.2          MASS_7.3-54
[113] rappdirs_0.3.3                Matrix_1.3-4
[115] cli_3.3.0                     igraph_1.3.2
[117] pkgconfig_2.0.3               OmnipathR_3.7.0
[119] plotly_4.10.0                 scuttle_1.0.4
[121] spatstat.sparse_2.1-1         xml2_1.3.3
[123] dqrng_0.3.0                   XVector_0.30.0
[125] rvest_1.0.1                   digest_0.6.29
[127] sctransform_0.3.3             RcppAnnoy_0.0.19
[129] spatstat.data_2.2-0           rmarkdown_2.11
[131] cellranger_1.1.0              leiden_0.4.2
[133] uwot_0.1.11                   edgeR_3.32.1
[135] DelayedMatrixStats_1.12.3     curl_4.3.2
[137] shiny_1.7.1                   rjson_0.2.21
[139] nlme_3.1-153                  lifecycle_1.0.1
[141] jsonlite_1.8.0                BiocNeighbors_1.8.2
[143] viridisLite_0.4.0             limma_3.46.0
[145] fansi_1.0.3                   pillar_1.7.0
[147] lattice_0.20-44               fastmap_1.1.0
[149] httr_1.4.3                    survival_3.2-13
[151] interactiveDisplayBase_1.28.0 glue_1.6.2
[153] png_0.1-7                     BiocVersion_3.12.0
[155] bit_4.0.4                     bluster_1.0.0
[157] stringi_1.7.6                 blob_1.2.2
[159] BiocSingular_1.6.0            memoise_2.0.1
[161] irlba_2.3.5                   future.apply_1.9.0

dbdimitrov commented 1 year ago

I believe this is a versioning issue.

Can you re-install basilisk.utils and basilisk from github and let me know if that works?

rbutleriii commented 1 year ago

With the github versions of both I now get runtime errors. Doesn't really give me anything to go on though.

> sce <- liana_tensor_c2c(sce = sce,
+                         score_col = 'LRscore',
+                         rank = 7,  # set to None to estimate for your data!
+                         how='outer',  #  defines how the tensor is built
+                         conda_env = NULL, # used to pass an existing conda env with cell2cell
+                         use_available = FALSE # detect & load cell2cell if available
+                         )
Setting up Conda Environment with Basilisk
Error : RuntimeError: The current Numpy installation ('/home/rrbutler/.cache/R/basilisk/1.11.2/liana/0.1.10/liana_cell2cell/lib/python3.8/site-packages/numpy/__init__.py') fails to pass simple sanity checks. This can be caused for example by incorrect BLAS library being linked in, or by mixing package managers (pip, conda, apt, ...). Search closed numpy issues for similar problems.

In addition: Warning message:
In exec(output, ...) : `sce` was superseded by `context_df_dict`!
Error in .activate_fallback(proc, testload, env = env, envpath = envpath) :
  RuntimeError: The current Numpy installation ('/home/rrbutler/.cache/R/basilisk/1.11.2/liana/0.1.10/liana_cell2cell/lib/python3.8/site-packages/numpy/__init__.py') fails to pass simple sanity checks. This can be caused for example by incorrect BLAS library being linked in, or by mixing package managers (pip, conda, apt, ...). Search closed numpy issues for similar problems.
> traceback()
4: stop(msg)
3: .activate_fallback(proc, testload, env = env, envpath = envpath)
2: basilisk::basiliskStart(liana_env, testload = "scipy.optimize")
1: liana_tensor_c2c(sce = sce, score_col = "LRscore", rank = 7,
       how = "outer", conda_env = NULL, use_available = FALSE)

I have a hunch this is a cluster issue (I'm on an academic cluster with managed packages). When I back out and examine the Lmod setup, it appears R has its own version of Python that is probably missing some libraries, but loading anaconda as a module first breaks R. Is there a conda yml to build the environment for c2c so I can just link it?
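
To test that hunch, one option is to print which interpreter and library paths Python actually sees, once from a plain shell and once from inside R (for example through reticulate::py_run_string), and compare the two. A minimal sketch using only the standard library; the function name `describe_python` is made up for illustration and is not part of liana or basilisk:

```python
import os
import site
import sys


def describe_python():
    """Collect interpreter details that commonly differ between shells."""
    return {
        "executable": sys.executable,  # interpreter binary in use
        "prefix": sys.prefix,          # root of the active environment
        "version": sys.version.split()[0],
        # Library search path inherited from the shell or module system (Linux).
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        # True when the per-user site-packages directory may be added to sys.path.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```

If the two runs report different executables or a different LD_LIBRARY_PATH, that would be consistent with the module system interfering, though it would not prove it.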

rbutleriii commented 1 year ago

Things I have tried with basilisk:

Trying to build the conda env separately:

> library(tidyverse, quietly = TRUE)
> library(SingleCellExperiment, quietly = TRUE)
> library(reticulate, quietly = TRUE)
> library(magrittr, quietly = TRUE)
> library(liana, quietly = TRUE)
> sce = readRDS("sce_temp.rds")

> sce = liana_tensor_c2c(
+   sce=sce,
+   score_col='LRscore',
+   rank=7,  # set to None to estimate for your data!
+   how='outer',  #  defines how the tensor is built
+   conda_env="liana_env", # used to pass an existing conda env with cell2cell
+   use_available=FALSE # detect & load cell2cell if available
+ )
[1] 0
Loading `liana_env` Conda Environment
Error: ImportError: /lib64/libz.so.1: version `ZLIB_1.2.9' not found (required by /home/rrbutler/.conda/envs/liana_env/lib/python3.8/site-packages/matplotlib/../../.././libpng16.so.16)
In addition: Warning message:
In exec(output, ...) : `sce` was superseded by `context_df_dict`!
Lastly, I tried to build out liana_env completely with r-base r-essentials r-tidyverse bioconductor-singlecellexperiment r-reticulate r-magrittr so that it would be completely freestanding. At that point anaconda quit and started crying in the corner (it never built successfully; it just failed to solve the environment for an hour).

dbdimitrov commented 1 year ago

Hmmm, actually you don't necessarily need to set up an R environment (basilisk sets up a pure Python one in this case).

Instead, what you could do is set up a conda env with the same packages that liana needs to run the tensor step (in Python).

These are the package versions that are currently used to create the conda env via basilisk: https://github.com/saezlab/liana/blob/10d81773e0874de676eb106ce56e3cf9d4fe01d3/R/liana_tensor.R#L714

Would suggest creating a conda env with those and passing that one :)

rbutleriii commented 1 year ago

Updates: I did install that liana_env by opening the Pandora's box of relaxing every dependency pin to >=. However, there is no way to install liana via remotes in a conda version of R. In any case, that was more of a tangent, as I could build the tensor environment.

But using python 3.8.8 was giving me issues with the pip installs, seemingly from this pypa issue that was not backported. So setting it to >=3.9 got me moving...right back to the original basilisk error.

> sce = liana_tensor_c2c(
+   sce=sce,
+   score_col='LRscore',
+   rank=7,  # set to None to estimate for your data!
+   how='outer',  #  defines how the tensor is built
+   conda_env="cell2cell", # used to pass an existing conda env with cell2cell
+   use_available=TRUE # detect & load cell2cell if available
+ )
[1] 0
Loading `cell2cell` Conda Environment
Error: RuntimeError: The current Numpy installation ('/home/rrbutler/miniconda3/envs/cell2cell/lib/python3.10/site-packages/numpy/__init__.py') fails to pass simple sanity checks. This can be caused for example by incorrect BLAS library being linked in, or by mixing package managers (pip, conda, apt, ...). Search closed numpy issues for similar problems.
In addition: Warning message:
In exec(output, ...) : `sce` was superseded by `context_df_dict`!
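
For context, this message comes from a check NumPy runs at import time: it takes the dot product of a small float32 vector of ones with itself and raises if the result is not 2, which, as the message itself says, tends to point at the BLAS library that got loaded. A rough sketch that repeats the check by hand, so the same snippet can be run from plain Python and from reticulate and the results compared; `check_numpy` is an illustrative name, and the 1e-5 tolerance is my reading of NumPy's own check, not something stated in this thread:

```python
def check_numpy():
    """Try to import numpy and repeat its import-time dot-product check."""
    try:
        import numpy as np
    except ImportError as err:
        return {"imported": False, "error": f"ImportError: {err}"}
    except RuntimeError as err:
        # This is the branch hit in the log above.
        return {"imported": False, "error": f"RuntimeError: {err}"}

    x = np.ones(2, dtype=np.float32)
    result = float(x.dot(x))  # expected to be 2.0 with a working BLAS
    return {
        "imported": True,
        "file": np.__file__,        # which numpy installation was picked up
        "version": np.__version__,
        "dot": result,
        "sane": abs(result - 2.0) < 1e-5,
    }


if __name__ == "__main__":
    print(check_numpy())
```

If the plain-Python run reports `sane: True` while the reticulate run fails, the same numpy files are behaving differently depending on the host process, which would fit the BLAS explanation.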

It appears reticulate set it up correctly after all, but there is another issue with numpy via reticulate in R when depending on anaconda versions of numpy, even though numpy imports successfully in Python outside of reticulate. As in that suggested solution, you can force a reinstallation of numpy (but it should be <1.24 or it conflicts with numba).

reticulate::py_install("numpy<1.24", envname="cell2cell", pip=TRUE, pip_options=c("--force-reinstall", "--no-binary numpy"))

This accepts numpy, but errors out soon after with a Python error that can be replicated in the conda environment:

>>> import cell2cell
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rrbutler/.local/lib/python3.10/site-packages/cell2cell/__init__.py", line 3, in <module>
    from cell2cell import analysis
  File "/home/rrbutler/.local/lib/python3.10/site-packages/cell2cell/analysis/__init__.py", line 1, in <module>
    from cell2cell.analysis.cell2cell_pipelines import (initialize_interaction_space, BulkInteractions, SingleCellInteractions)
  File "/home/rrbutler/.local/lib/python3.10/site-packages/cell2cell/analysis/cell2cell_pipelines.py", line 6, in <module>
    import scanpy
  File "/home/rrbutler/.local/lib/python3.10/site-packages/scanpy/__init__.py", line 16, in <module>
    from . import plotting as pl
  File "/home/rrbutler/.local/lib/python3.10/site-packages/scanpy/plotting/__init__.py", line 1, in <module>
    from ._anndata import (
  File "/home/rrbutler/.local/lib/python3.10/site-packages/scanpy/plotting/_anndata.py", line 28, in <module>
    from . import _utils
  File "/home/rrbutler/.local/lib/python3.10/site-packages/scanpy/plotting/_utils.py", line 35, in <module>
    class _AxesSubplot(Axes, axes.SubplotBase, ABC):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
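
One detail in this traceback: every file path is under ~/.local/lib/python3.10/site-packages, the per-user site directory, rather than under the conda env. A hedged sketch for checking where a module would be loaded from without importing it; `module_origin` and `from_user_site` are illustrative names, and whether user-site packages are the actual cause here is an assumption that the thread does not confirm:

```python
import importlib.util
import os
import site


def module_origin(name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def from_user_site(path):
    """True if `path` lives under the per-user site-packages directory."""
    if not path:
        return False
    user_site = os.path.realpath(site.getusersitepackages())
    real = os.path.realpath(path)
    return real.startswith(user_site + os.sep)


if __name__ == "__main__":
    for mod in ("numpy", "matplotlib", "scanpy", "cell2cell"):
        origin = module_origin(mod)
        flag = "USER SITE" if from_user_site(origin) else ""
        print(mod, origin, flag)
```

If modules do resolve from the user site, starting Python with `-s` or setting the `PYTHONNOUSERSITE` environment variable keeps that directory off `sys.path`; whether that would have fixed the error here is untested.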

Ultimately this has something to do with some combination of the way pip operates in conda when called through a yml file or through the reticulate method. The final solution was to build a stock cell2cell conda environment exactly as explained on the cell2cell PyPI page (or in their tutorial). This builds, and it can be passed into R (even an Lmod installation) by specifying the correct reticulate Python location:

Sys.setenv(RETICULATE_PYTHON = "~/miniconda3/envs/cell2cell/bin/python")

library(tidyverse, quietly = TRUE)
library(SingleCellExperiment, quietly = TRUE)
library(reticulate, quietly = TRUE)
library(magrittr, quietly = TRUE)
library(liana, quietly = TRUE)
# library(ExperimentHub, quietly = TRUE)

sce = readRDS("sce_temp.rds")

sce = liana_tensor_c2c(
  sce=sce,
  score_col='LRscore',
  rank=7,  # set to None to estimate for your data!
  how='outer',  #  defines how the tensor is built
  conda_env="cell2cell", # used to pass an existing conda env with cell2cell
  use_available=TRUE # detect & load cell2cell if available
)

rbutleriii commented 1 year ago

Oh, I guess as an epilogue, that means I never actually got basilisk to build the environment for me. It seems to have something to do with which pip gets used when installing via a conda yml, as opposed to just calling pip install inside the conda env.

dbdimitrov commented 1 year ago

Hi @rbutleriii,

Oh okay. Maybe in your case I would suggest checking out liana-py x Tensor fully in Python; this would resolve all of the issues that you are having with reticulate: https://liana-py.readthedocs.io/en/latest/notebooks/liana_c2c.html

Though this line might be what is breaking it (reticulate is a funny package from my experience): Sys.setenv(RETICULATE_PYTHON = "~/miniconda3/envs/cell2cell/bin/python")