Open nick-youngblut opened 6 years ago
It appears that conda is using the wrong Rcpp build:
$ R
R version 3.4.1 (2017-06-30) -- "Single Candle"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
> library(Rcpp)
Warning message:
package ‘Rcpp’ was built under R version 3.4.3
> library(dplyr)
Error: package or namespace load failed for ‘dplyr’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
namespace ‘Rcpp’ 0.12.14 is already loaded, but >= 0.12.15 is required
Interesting, but you are installing r-rcpp: 0.12.17-r341h9d2a408_1 conda-forge, so 0.12.17.
Is there anything else on your system that interferes with it?
Yeah, r-rcpp: 0.12.17-r341h9d2a408_1 should be installed, but it doesn't appear to be used in my R sessions:
$ R
R version 3.4.1 (2017-06-30) -- "Single Candle"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Rcpp)
Warning message:
package ‘Rcpp’ was built under R version 3.4.3
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /ebio/abt3_projects/software/dev/miniconda3_dev/envs/test_env/lib/R/lib/libRblas.so
LAPACK: /ebio/abt3_projects/software/dev/miniconda3_dev/envs/test_env/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rcpp_0.12.14
loaded via a namespace (and not attached):
[1] compiler_3.4.1
I'm definitely calling R installed in the test conda environment.
hmmm... what am I missing?
My base conda env doesn't have Rcpp installed at all, but Rcpp (v0.12.14) will still load in an R session running from the base conda env.
I checked .libPaths(), and it appears that Rcpp is being loaded from an R install in my home directory. I changed my libPaths to just the conda install libPath. That fixed the issue, and I can now load dplyr correctly. This has never happened with my conda install previously, so I'm really surprised that it happened here. I should have checked .libPaths() prior to posting this issue.
I still wonder why Rcpp defaulted to the version outside of my conda env even though Rcpp was installed in the conda env.
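For reference, the check described above (finding library paths that point outside the conda env) can be sketched roughly as follows. This is only an illustration in Python: the helper name `paths_outside_env` is made up, and the home-directory library path is a hypothetical stand-in for whatever `.libPaths()` actually printed, which the thread does not show.

```python
from pathlib import PurePosixPath


def paths_outside_env(lib_paths, env_prefix):
    """Return the library paths that do not live under env_prefix."""
    prefix = PurePosixPath(env_prefix)
    outside = []
    for raw in lib_paths:
        path = PurePosixPath(raw)
        # A path is "inside" if it is the prefix itself or has it as an ancestor.
        if path != prefix and prefix not in path.parents:
            outside.append(raw)
    return outside


# Hypothetical .libPaths() output: the first entry is an assumed user library
# in HOME, the second is the env's own R library.
lib_paths = [
    "/home/user/R/x86_64-pc-linux-gnu-library/3.4",
    "/ebio/abt3_projects/software/dev/miniconda3_dev/envs/test_env/lib/R/library",
]
env_prefix = "/ebio/abt3_projects/software/dev/miniconda3_dev/envs/test_env"

print(paths_outside_env(lib_paths, env_prefix))
```

Any entry this reports is a place where packages can leak into the environment from outside it.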
That is an R issue, afaik. R by default looks into your HOME directory. Nothing conda can do here I think. Or we patch R to not do so?
It's weird though that my older conda install (conda 4.3.29 with R 3.3.2) doesn't include my home directory R install in the libPaths, but my newer version of conda (conda 4.5.9 with R 3.4.1) does. So the old version of conda + R is not looking in my HOME directory. Why should R installed in a conda env look into a HOME directory? I thought the whole point of conda envs was to isolate the software in the envs.
All I know is that this is an R thing. Python does not have this problem.
OK. It does appear like a potential big issue if R is looking outside of the conda env. This could reduce the reproducibility of conda envs.
Any ideas on the best way to permanently remove my HOME R install libPath from the conda R libPaths? Permanently adding to the libPaths seems easier than permanently removing a specific libPath.
I tried downgrading r-base from R 3.4.1 to R 3.3.2. For R 3.3.2, my libPaths just includes the conda env, and not my HOME R install, so it's just the newer versions of r-base that look into the HOME directory for an R install.
I moved my HOME R install (instead of completely removing it), and all of my conda environments that use R 3.4.1 seem to have been relying on it to some extent: moving it broke all of them (at least for loading R packages). This means that my conda envs were not fully isolated and thus cannot be fully reproduced from a yaml file of the packages in the environment. I'm just glad I didn't completely remove my HOME dir R install...
Why should R installed in a conda env look into a HOME directory? I thought the whole point of conda envs was to isolate the software in the envs.
@nick-youngblut Yes, it is super frustrating that conda does not isolate itself by default. That is what everyone expects it to do, so usually the errors are more subtle. For example, one of your collaborators can install your conda environment defined by an environment.yaml file, but still be using their local R packages to perform the analysis.
See https://github.com/conda-forge/r-base-feedstock/issues/37 for an extended discussion. Unfortunately the core conda team is not overly interested in fixing this, so I don't see anything changing anytime soon. It's a shame because this is a serious weakness.
All I know is that this is an R thing. Python does not have this problem.
@bgruening Not true. conda will also use user-installed Python packages over conda packages. It's just that it is less common for Python users to have user-installed packages compared to R users due to differences in pip and install.packages(). Unlike conda, venv does properly ignore user-level Python packages. See https://github.com/conda-forge/python-feedstock/issues/171 for discussion.
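To see whether a given interpreter will pick up user-level packages, something like the following rough sketch can be run with the env's `python`. It uses only the standard `site` and `sys` modules; the `user_site_report` helper and the dictionary keys are made up for illustration.

```python
import site
import sys


def user_site_report():
    """Summarize how this interpreter treats the per-user site-packages dir."""
    user_site = site.getusersitepackages()
    return {
        "prefix": sys.prefix,                # where this interpreter lives
        "user_site": user_site,              # per-user package directory
        "enabled": site.ENABLE_USER_SITE,    # True, False, or None
        # The user dir is only added to sys.path if it is enabled and exists.
        "on_sys_path": user_site in sys.path,
    }


for key, value in user_site_report().items():
    print(f"{key}: {value}")
```

If `on_sys_path` is true inside a conda env, packages from `pip install --user` are visible to that env.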
@jdblischak this problem is only partially about packages. The problem is "I still wonder why Rcpp defaulted to the version outside of my conda env even though Rcpp was installed in the conda env." I have never seen this with Python. There are multiple ways how you can disable this behavior in Python if you want to, as linked in the thread by you.
Do you know any fancy way how we can avoid this with R? Would be cool to know as there are many users having problems with this.
this problem is only partially about packages. The problem is "I still wonder why Rcpp defaulted to the version outside of my conda env even though Rcpp was installed in the conda env." I have never seen this with Python.
@bgruening That error is completely from a package path issue. The R user library was listed before the conda R library, so the wrong version of Rcpp was loaded. Python behaves the same way. If I install an old version of numpy in my Python user library and then load pandas that is installed in my conda environment, Python throws an error:
$ docker run --rm -it condaforge/linux-anvil
$ conda install -y pandas
$ python -c 'import pandas'
$ pip install --user 'numpy==1.8.2'
$ python -c 'import numpy;print( numpy.__version__)'
1.8.2
$ python -c 'import pandas'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/conda/lib/python3.6/site-packages/pandas/__init__.py", line 23, in <module>
from pandas.compat.numpy import *
File "/opt/conda/lib/python3.6/site-packages/pandas/compat/numpy/__init__.py", line 24, in <module>
'this pandas version'.format(_np_version))
ImportError: this version of pandas is incompatible with numpy < 1.9.0
your numpy version is 1.8.2.
Please upgrade numpy to >= 1.9.0 to use this pandas version
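The ordering effect described above (user library listed before the conda library) can be sketched as a first-match lookup. This is a toy model, not how R or Python are implemented: the `resolve` and `version_tuple` helpers and the `"user"`/`"conda"` labels are made up, and only the Rcpp version numbers come from this thread.

```python
def resolve(package, search_path, libraries):
    """Return (library, version) from the first library that has the package."""
    for lib in search_path:
        version = libraries.get(lib, {}).get(package)
        if version is not None:
            return lib, version
    return None


def version_tuple(version):
    """Turn '0.12.14' into (0, 12, 14) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Rcpp versions as reported in this thread; the library labels are illustrative.
libraries = {
    "user": {"Rcpp": "0.12.14"},
    "conda": {"Rcpp": "0.12.17"},
}
required = "0.12.15"  # what dplyr asked for in the error message

for order in (["user", "conda"], ["conda", "user"]):
    lib, found = resolve("Rcpp", order, libraries)
    ok = version_tuple(found) >= version_tuple(required)
    print(order, "->", lib, found, "ok" if ok else "too old")
```

With the user library first, the older Rcpp wins even though a newer one is installed in the env; reversing the order picks the env's copy.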
There are multiple ways how you can disable this behavior in Python if you want to, as linked in the thread by you.
Sure, there are also ways to get around it in R. But the point is that this requires manual intervention. And it also requires that users understand that their conda environments are not isolated from their user installation.
Do you know any fancy way how we can avoid this with R? Would be cool to know as there are many users having problems with this.
The solution analogous to setting export PYTHONNOUSERSITE=True for Python would be to define R_LIBS_USER="" in ~/.Renviron. But both of these options are quite suboptimal. A user will likely want to be able to load the user packages when running Python or R outside of conda. Having to set and unset environment variables is tedious and error-prone.
But I think that conda should handle this by itself (just like venv). My proposed solution is to add a patch to r-base so that a conda-installed R ignores the default user library on that OS: https://github.com/conda-forge/r-base-feedstock/issues/37#issuecomment-379859377. It's not fool-proof, but it will solve the most common problem.
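One way to avoid setting and unsetting variables by hand is to scope them to a single child process. The sketch below is only an illustration: `isolated_env` and `child_user_site_enabled` are made-up helpers, and passing `R_LIBS_USER=""` through the environment simply mirrors the suggestion above; its effect on R is not verified here and may differ between R versions.

```python
import os
import subprocess
import sys


def isolated_env(base=None):
    """Copy an environment and add the isolation variables from this thread."""
    env = dict(os.environ if base is None else base)
    env["PYTHONNOUSERSITE"] = "1"  # any non-empty value disables the user site
    env["R_LIBS_USER"] = ""        # assumption: mirrors the ~/.Renviron idea
    return env


def child_user_site_enabled(env):
    """Ask a child interpreter whether its user site-packages dir is enabled."""
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.ENABLE_USER_SITE)"],
        env=env,
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


# The parent's os.environ is untouched; only the child sees the variables.
print(child_user_site_enabled(isolated_env()))
```

The same `env` dictionary could be passed to a child `Rscript` call, which keeps the user's normal shell sessions unaffected.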
From a normal Ubuntu 18.04 Docker container:
bag@bag:~$ pip install --user 'numpy==1.8.2'
bag@bag:~$ python -c 'import numpy;print( numpy.__version__)'
1.8.2
bag@bag:~$ conda create -n pandas -y pandas
bag@bag:~$ conda activate pandas
(pandas) bag@bag:~$ python -c 'import numpy;print( numpy.__version__)'
1.14.3
(pandas) bag@bag:~$ python -c 'import pandas'
But I guess I completely misunderstood the problem here. Never mind, sorry for the noise.
From a normal Ubuntu 18.04 Docker container:
@bgruening I wasn't able to reproduce the behavior you observed. Using Ubuntu 18.04 and installing pandas in an environment (via conda activate), I still get an error due to the user-installation of numpy:
docker run --rm -it ubuntu:bionic
# Install conda
apt update
apt install -y --no-install-recommends bzip2 ca-certificates curl
curl -s -L https://repo.continuum.io/miniconda/Miniconda3-4.5.11-Linux-x86_64.sh > miniconda.sh
openssl md5 miniconda.sh | grep e1045ee415162f944b6aebfe560b8fee
bash miniconda.sh -b -p /opt/conda
ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
source /opt/conda/etc/profile.d/conda.sh
conda activate
conda config --set show_channel_urls True
conda config --add channels conda-forge
conda update --all --yes
# Install old version of numpy
# - Had to use Python 3.6. Always got errors with Python 3.7
# - Had to install gcc with APT. Always got error with conda
conda install -y python=3.6
apt install -y gcc
pip install --user 'numpy==1.8.2'
python -c 'import numpy;print( numpy.__version__)'
# 1.8.2
# Create conda environment
conda create -n pandas -y pandas
conda activate pandas
python -c 'import numpy;print( numpy.__version__)'
# 1.8.2
python -c 'import pandas'
# Traceback (most recent call last):
# File "<string>", line 1, in <module>
# File "/opt/conda/envs/pandas/lib/python3.6/site-packages/pandas/__init__.py", line 23, in <module>
# from pandas.compat.numpy import *
# File "/opt/conda/envs/pandas/lib/python3.6/site-packages/pandas/compat/numpy/__init__.py", line 24, in <module>
# 'this pandas version'.format(_np_version))
# ImportError: this version of pandas is incompatible with numpy < 1.9.0
# your numpy version is 1.8.2.
# Please upgrade numpy to >= 1.9.0 to use this pandas version
But I guess I completely misunderstood the problem here. Never mind, sorry for the noise.
No need to apologize! I think it is important to discuss and understand this issue with user-installed packages. It is so often overlooked, but it can cause conda users a lot of pain (e.g. this current issue).