rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

Error in normalizePath #1460

Open igordot opened 1 year ago

igordot commented 1 year ago

After installing reticulate, I see the following error when R launches (curiously, in RStudio, but not command-line R):

Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/path/miniconda3/bin/mamba": No such file or directory
Calls: do.call ... python_munge_path -> get_python_conda_info -> normalizePath
Execution halted

I also see a similar error with various reticulate commands:

> reticulate::py_config()
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/path/miniconda3/bin/mamba": No such file or directory

In case it helps, other conda-related functions seem to work:

> reticulate:::conda_binary()
[1] "/path/miniconda3/bin/conda"

This sounds very similar to previous issues https://github.com/rstudio/reticulate/issues/1176 and https://github.com/rstudio/reticulate/pull/1375 but those were resolved months ago. I am using the latest CRAN version 1.31.

t-kalinowski commented 1 year ago

Thanks for reporting!

To track this down, could you run the following and share the output?

rlang::global_entrace()
reticulate::py_discover_config()
rlang::last_error()

igordot commented 1 year ago

I do not have mamba installed. I had it at some point, but later uninstalled it (I switched to the new libmamba dependency solver).

These are the commands you asked for:

> rlang::global_entrace()
pushing duplicate `error` handler on top of the stack
pushing duplicate `warning` handler on top of the stack
pushing duplicate `message` handler on top of the stack
> reticulate::py_discover_config()
Error in `normalizePath()`:
! path[1]="/path/miniconda3/bin/mamba": No such file or directory
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_error()
<error/rlang_error>
Error in `normalizePath()`:
! path[1]="/path/miniconda3/bin/mamba": No such file or directory
---
Backtrace:
    ▆
 1. └─reticulate::py_discover_config()
 2.   └─reticulate:::python_config(python, required_module)
 3.     └─reticulate:::python_munge_path(python)
 4.       └─reticulate:::get_python_conda_info(python)
 5.         └─base::normalizePath(conda, winslash = "/", mustWork = TRUE)

And session info:

R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.5.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.11       lattice_0.21-8    png_0.1-8         fansi_1.0.4       utf8_1.2.3       
 [6] withr_2.5.0       grid_4.2.2        lifecycle_1.0.3   jsonlite_1.8.7    pillar_1.9.0     
[11] rlang_1.1.1       cli_3.6.1         rstudioapi_0.15.0 Matrix_1.5-4.1    vctrs_0.6.3      
[16] reticulate_1.31   tools_4.2.2       glue_1.6.2        compiler_4.2.2   

t-kalinowski commented 1 year ago

I do not have mamba installed. I had it at some point, but later uninstalled it (I switched to the new libmamba dependency solver).

It would appear that mamba is not completely uninstalled - there is at least the symlink at /path/miniconda3/bin/mamba remaining. That error from normalizePath() is what you would see for a symlink pointing to a file that no longer exists.

Can you try removing it and see if that fixes the issue? unlink("/path/miniconda3/bin/mamba", force = TRUE)
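
To make the dangling-symlink hypothesis easier to check, here is a minimal sketch in base R. The helper name is_dangling_symlink is made up for illustration and is not part of reticulate. It separates "a symlink whose target is gone" from "no such path at all", and shows that normalizePath(mustWork = TRUE) raises an error for the dangling link. It assumes a Unix-alike system where file.symlink() is available.

```r
# Illustrative helper (not part of reticulate): TRUE when `path` is a
# symlink whose target no longer exists.
is_dangling_symlink <- function(path) {
  target <- Sys.readlink(path)  # "" if not a symlink, NA on error (e.g. path missing)
  !is.na(target) && nzchar(target) && !file.exists(path)
}

# Build a dangling symlink in a scratch directory (Unix-alikes only).
scratch <- tempfile("dangling-")
dir.create(scratch)
link <- file.path(scratch, "mamba")
file.symlink(file.path(scratch, "no-such-target"), link)

is_dangling_symlink(link)                                 # a link with a missing target
is_dangling_symlink(file.path(scratch, "never-existed"))  # no file and no link

# normalizePath(mustWork = TRUE) signals an error for the dangling link.
err <- tryCatch(
  normalizePath(link, winslash = "/", mustWork = TRUE),
  error = identity
)
inherits(err, "error")
```

Running is_dangling_symlink() on the path from the error message tells you whether there is actually a leftover link to remove.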

t-kalinowski commented 1 year ago

I think we can make reticulate::py_discover_config() more robust for this type of situation. If you recall, can you please share the steps you took to install mamba, and then uninstall it, so that I can reproduce the error locally?

igordot commented 1 year ago

/path/miniconda3/bin/mamba does not exist (either symlink or real file), so cannot be removed.

> reticulate::py_discover_config()
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/path/miniconda3/bin/mamba": No such file or directory

I am not sure how I installed mamba. I believe it was conda install mamba -n base -c conda-forge. I removed it with conda remove mamba.

t-kalinowski commented 1 year ago

And how was conda installed? The /path/ prefix is new to me.

igordot commented 1 year ago

Sorry for the confusion. I was just simplifying my home directory (/Users/username/).

I don't remember the exact installation. It was with the Miniconda3 installer.

t-kalinowski commented 1 year ago

I tried but am unable to reproduce the error. Would you be able to step through py_discover_config() and see where/how the error occurs? debug(reticulate::py_discover_config)?

igordot commented 1 year ago

I did not know about the debug() feature, so I may not be using it to its full potential.

I got to line 105. In case the numbering is different, here are lines 103-106:

  python <- tryCatch(py_resolve("r-reticulate"), error = identity)
  if (!inherits(python, "error")) 
    return(python_config(python, required_module))
  create_default_virtualenv(package = "reticulate")

Running in code browser:

Browse[3]>   if (!is.na(reticulate_env)) {
+     python_version <- normalize_python_path(reticulate_env)
+     if (!python_version$exists) 
+       stop("Python specified in RETICULATE_PYTHON_FALLBACK (", 
+         reticulate_env, ") does not exist")
+     python_version <- python_version$path
+     config <- python_config(python_version, required_module, 
+       python_version, forced = "RETICULATE_PYTHON_FALLBACK")
+     return(config)
+   }
Browse[3]>   python <- tryCatch(py_resolve("r-reticulate"), error = identity)
Browse[3]>   if (!inherits(python, "error")) 
+     return(python_config(python, required_module))
debug at #2: return(python_config(python, required_module))

Modifying the code to generate output:

Browse[3]> tryCatch(py_resolve("r-reticulate"), error = identity)
[1] "/Users/id/miniconda3/envs/r-reticulate/bin/python"
Browse[3]> !inherits(python, "error")
[1] TRUE
Browse[3]> python_config(python, required_module)
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/Users/id/miniconda3/bin/mamba": No such file or directory

So the error is from python_config():

> reticulate:::python_config("/Users/id/miniconda3/envs/r-reticulate/bin/python", NULL)
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/Users/id/miniconda3/bin/mamba": No such file or directory

Then python_munge_path():

> reticulate:::python_munge_path("/Users/id/miniconda3/envs/r-reticulate/bin/python")
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/Users/id/miniconda3/bin/mamba": No such file or directory

Then get_python_conda_info():

> reticulate:::get_python_conda_info("/Users/id/miniconda3/envs/r-reticulate/bin/python")
Error in normalizePath(conda, winslash = "/", mustWork = TRUE) : 
  path[1]="/Users/id/miniconda3/bin/mamba": No such file or directory

Then debugging get_python_conda_info():

Browse[2]>   stopifnot(is_conda_python(python))
Browse[2]>   root <- if (is_windows()) 
+     dirname(python)
Browse[2]>   else dirname(dirname(python))
Error: unexpected 'else' in "  else"

This is the code I see:

  root <- if (is_windows()) 
    dirname(python)
  else dirname(dirname(python))
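
A note on the "unexpected 'else'" message above: it comes from pasting the statement line by line at the prompt, not from reticulate itself. At top level (and at the Browse prompt) R treats the if statement as complete once its body ends, so an else on the next line cannot attach to it. Inside a braced block, such as a function body, the same code is valid. A small sketch using parse() to show both cases (the variable names are illustrative):

```r
# At top level the parser treats the `if` statement as complete after its
# body, so an `else` starting the next line is a syntax error.
top_level <- "root <- if (TRUE)\n  'a'\nelse 'b'"
parsed <- tryCatch(parse(text = top_level), error = identity)
inherits(parsed, "error")

# Wrapped in braces, the same code parses and evaluates.
braced <- paste0("{\n", top_level, "\n}")
eval(parse(text = braced))
root
```

When stepping through a function by hand, wrapping a pasted if/else in { } avoids the problem.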

t-kalinowski commented 1 year ago

Thanks for the detailed investigation!

It looks like get_python_conda_info() calls python_info_condenv_find() to resolve the appropriate conda binary to use with that conda env. It reads the conda-meta/history file associated with that conda environment, and the environment command history file indicates it was created with mamba, and the appropriate binary to use is mamba. Subsequently, it appears you uninstalled mamba, hence the cryptic error from reticulate when it attempts to invoke the mamba executable.

You can confirm the above is correct by checking:

readLines("~/miniconda3/envs/r-reticulate/conda-meta/history") 

and noting a line like

# cmd: /Users/id/miniconda3/bin/mamba create --yes --name r-reticulate python=3.8 numpy --quiet -c conda-forge
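
As a rough sketch of that check, the following base R snippet pulls the executable out of each "# cmd:" line and tests whether it still exists. This is not reticulate's internal implementation: history_cmd_binaries is a hypothetical helper, and it assumes the executable path contains no spaces.

```r
# Hypothetical helper, not reticulate's internal code: extract the
# executable recorded on each "# cmd:" line of a conda-meta/history file.
history_cmd_binaries <- function(lines) {
  cmds <- grep("^# cmd: ", lines, value = TRUE)
  cmds <- sub("^# cmd: ", "", cmds)
  # The first whitespace-delimited token is taken to be the executable
  # (assumes the path contains no spaces).
  vapply(strsplit(cmds, "[[:space:]]+"), `[[`, character(1), 1L)
}

# The "# cmd:" line is the one quoted above; the first line is a placeholder.
history <- c(
  "# placeholder for any other line in the history file",
  "# cmd: /Users/id/miniconda3/bin/mamba create --yes --name r-reticulate python=3.8 numpy --quiet -c conda-forge"
)

bins <- history_cmd_binaries(history)
bins
file.exists(bins)  # FALSE means the recorded binary is gone
```

With a real environment, replace the history vector with the readLines() call above.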

My recommendation is to recreate the r-reticulate conda env, or transition to using virtual environments.

# Option 1: recreate the conda env
conda_remove("r-reticulate")
conda_create("r-reticulate")
# Option 2: switch to a virtual environment instead
virtualenv_create("r-reticulate", force = TRUE)

igordot commented 1 year ago

It reads the conda-meta/history file associated with that conda environment, and the environment command history file indicates it was created with mamba, and the appropriate binary to use is mamba

That indeed seems to be the case. Recreating the environment resolved the problem.

Maybe there should be a check if the binary in history still exists.

Why not just check if mamba/conda binaries exist rather than going through the history file? I imagine it would be a problem as well if you decide to switch to mamba, but you created the environment beforehand.
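
To make the suggestion concrete, here is a minimal sketch of an existence check with a fallback. It is hypothetical and does not describe how reticulate behaves: first_existing_binary is an invented name, and which candidates to try, and in what order, is left open.

```r
# Hypothetical fallback, not reticulate's behaviour: return the first
# candidate that exists on disk, or NA if none does.
first_existing_binary <- function(candidates) {
  found <- candidates[file.exists(candidates)]
  if (length(found) == 0L) return(NA_character_)
  normalizePath(found[[1L]], winslash = "/", mustWork = TRUE)
}

# Stand-in for a real conda executable so the example runs anywhere.
fake_conda <- tempfile("conda-")
file.create(fake_conda)

# The missing mamba path is skipped and the existing file is chosen.
chosen <- first_existing_binary(c("/no/such/dir/mamba", fake_conda))
basename(chosen) == basename(fake_conda)

# With no usable candidate the result is NA rather than an error.
first_existing_binary("/no/such/dir/mamba")
```

Returning NA instead of calling normalizePath() on a missing path lets the caller produce a clearer message than "No such file or directory".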

t-kalinowski commented 1 year ago

It's common for users to have multiple conda installations on their system, and also, for the relative locations of the conda executable and conda environments to be customized. Parsing the history file has been to-date the most robust way to find the correct conda executable for a particular conda environment.

mtekman commented 1 year ago

For anyone using a biocontainer (or any Docker container that was built from a conda environment and does not contain conda itself, but still has residual conda files, e.g. /usr/local/conda-meta/history):

You can get around it by masking the conda detection function:

library(reticulate)
assignInNamespace("is_conda_python", function(x){ return(FALSE) }, ns="reticulate")
reticulate:::use_python("/usr/local/bin/python")   ## failed on normalizePath error previously

It would be nice if this could be set via environment variable. e.g. Sys.getenv("RETICULATE_IS_CONDA") or some such
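
Until something like that exists, the same effect can be approximated in user code. The sketch below is only an illustration: RETICULATE_IS_CONDA is a hypothetical variable that reticulate does not read, conda_detection_disabled is an invented helper, and the override is the same assignInNamespace() workaround shown above.

```r
# RETICULATE_IS_CONDA is a hypothetical variable; reticulate does not read it.
# This helper only decides whether the user asked to skip conda detection.
conda_detection_disabled <- function() {
  value <- tolower(Sys.getenv("RETICULATE_IS_CONDA", unset = ""))
  value %in% c("0", "false", "no")
}

# Apply the masking workaround only when requested and reticulate is installed.
if (conda_detection_disabled() &&
    requireNamespace("reticulate", quietly = TRUE)) {
  utils::assignInNamespace(
    "is_conda_python", function(x) FALSE, ns = "reticulate"
  )
}
```

Placed in a site-wide or project .Rprofile, this would let a container image opt out by setting RETICULATE_IS_CONDA=false.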

t-kalinowski commented 8 months ago

@mtekman Looking at the base biocontainer image, I see the conda binary is provided: https://github.com/BioContainers/containers/blob/master/biocontainers/1.2.0/Dockerfile

Can you give an example of a conda environment where the conda binary that created the env is no longer available? How do you do the equivalent of conda activate in that situation?

t-kalinowski commented 8 months ago

Note, there were some tweaks to this code path via #1543, and this might be resolved already.

Also, we have another issue where a conda environment is being used in a context where the original conda binary that created the env is not available: #1542

mtekman commented 8 months ago

Can you give an example of a conda environment where the conda binary that created the env is no longer available? How do you do the equivalent of conda activate in that situation?

@t-kalinowski The best I can do is give you the same environment I was working with, but it's long and convoluted:

## Install Planemo
python3 -m venv planemo
.  planemo/bin/activate
pip install planemo

## Clone tools-iuc
git clone https://github.com/galaxyproject/tools-iuc
cd tools-iuc/tools/sceasy

## Enable docker and build a biocontainer from the sceasy recipe
sudo systemctl  start docker
## conda should also be auto-installed at this step, but if not, install micromamba
planemo test --biocontainers sceasy.xml --no-cleanup    ## persist the containers

The full docker command should be printed somewhere in the wall of text you see when running planemo, but just in case you can't find it:

## Find the mulled environment
ls ~/micromamba/envs/mulled-v*

let's assume the environment is called: "mulled-v2-c123456"

then there should be a docker image of that same name created on your machine, which you can debug via

docker run -it --name mytest --rm --user 1000:1000 \
  quay.io/local/mulled-v2-c123456:sometag /usr/bin/bash

at this point you're in the environment and can test

R
library(reticulate)
reticulate:::use_python("/usr/local/bin/python")

I think for this particular container, it will not fail because of the overrides that I put into the sceasy.xml recipe.

Disable the override, and run planemo test again, and it should build you a container that has the problematic conda install

Hope that helps!

Related: https://github.com/galaxyproject/tools-iuc/pull/5519

ollieeknight commented 8 months ago

I found this fix here. I know it doesn't fix the root cause, but perhaps it's a temporary solution until it is solved:

assignInNamespace("is_conda_python", function(x){ return(FALSE) }, ns="reticulate")

mtekman commented 8 months ago

@ollieeknight if you scroll up, you can see that we're in the very thread you're linking to :D

ollieeknight commented 8 months ago

@ollieeknight if you scroll up, you can see that we're in the very thread you're linking to :D

wow, that's embarrassing 😅 I spent so long trying to fix this yesterday on the cluster that I'm working on that I think it fried my brain. apologies!

mtekman commented 8 months ago

hah, I have been there!

t-kalinowski commented 7 months ago

@mtekman Is there any conda binary available in your environment? After #1555, reticulate allows for activating a condaenv with a different conda binary than the one that created it. Does that fix your issue?

mtekman commented 7 months ago

@t-kalinowski There is no conda binary in the environment, so I don't think the fix would actually work.

Conda was used to populate the Docker image system libraries (prefix: /usr/local/), but inside the container, there is no conda binary.

All that exists inside the container is a bunch of JSON files at the /usr/local/conda-meta/ directory, which is separately triggering the is_condaenv function at L837 in conda.R and the is_conda_python function at L1026

I want to share the docker container I have with you but it's a 2.9GB file and there is no accompanying Dockerfile that I can find that would reproduce the container exactly.

The instructions I used above will generate the Docker image I have though

mtekman commented 7 months ago

One telling difference I can see between a regular conda installation and the one generated by this container is the history file.

In a regular conda installation there is:

> cat ~/micromamba/conda-meta/history | head -1

# cmd: /home/user/.local/bin/micromamba install \
#      -c bioconda -c conda-forge bioconductor-biostrings

whereas one in the container would yield:

> cat /usr/local/conda-meta/history | head -1

# cmd: /opt/conda/bin/mamba install \
#      -c conda-forge -c bioconda bioconductor-biostrings \
#      --strict-channel-priority -p /usr/local --copy --yes --quiet

Though, I can see that this isn't a silver bullet for deciding whether a biocontainer contains conda...
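
For what it's worth, the comparison can be turned into a rough heuristic: treat a prefix as "conda files without conda" when its history file exists but none of the executables named on its "# cmd:" lines are still on disk. This is a sketch only. has_orphaned_conda_meta is an invented name, the check is not something reticulate does, and the path in the example is deliberately one that should not exist.

```r
# Heuristic sketch (not reticulate code): TRUE when `prefix` has a
# conda-meta/history file but none of the executables recorded on its
# "# cmd:" lines exist any more.
has_orphaned_conda_meta <- function(prefix) {
  history <- file.path(prefix, "conda-meta", "history")
  if (!file.exists(history)) return(FALSE)
  cmds <- grep("^# cmd: ", readLines(history, warn = FALSE), value = TRUE)
  if (length(cmds) == 0L) return(FALSE)
  # Keep only the first whitespace-delimited token after "# cmd: ".
  binaries <- sub("^# cmd: ([^[:space:]]+).*$", "\\1", cmds)
  !any(file.exists(binaries))
}

# Simulate the container layout in a scratch directory.
prefix <- tempfile("prefix-")
dir.create(file.path(prefix, "conda-meta"), recursive = TRUE)
writeLines(
  "# cmd: /nonexistent-dir-for-example/bin/mamba install -c conda-forge -c bioconda bioconductor-biostrings",
  file.path(prefix, "conda-meta", "history")
)

has_orphaned_conda_meta(prefix)                  # history present, binary missing
has_orphaned_conda_meta(tempfile("no-conda-"))   # no conda-meta directory at all
```

Inside the container this would be called as has_orphaned_conda_meta("/usr/local").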