Closed cvanderaa closed 2 years ago
I tried this, and it seems to be caused by a conflict between BLAS and numpy, the same as #471.
❯ docker run --rm -it rocker/rstudio@sha256:4b5ad6f7ada2d41bdbb307109f96a3bbf598554dce1a8d6caf787fa713b5fdb0 bash
Unable to find image 'rocker/rstudio@sha256:4b5ad6f7ada2d41bdbb307109f96a3bbf598554dce1a8d6caf787fa713b5fdb0' locally
docker.io/rocker/rstudio@sha256:4b5ad6f7ada2d41bdbb307109f96a3bbf598554dce1a8d6caf787fa713b5fdb0: Pulling from rocker/rstudio
3b65ec22a9e9: Already exists
2707b2c1cab7: Already exists
4e941c18fe51: Already exists
7459532771b9: Already exists
36dd5cf718e7: Already exists
41c9cc3a5690: Already exists
ff4e965f0abc: Already exists
a0322e090a6e: Already exists
Digest: sha256:4b5ad6f7ada2d41bdbb307109f96a3bbf598554dce1a8d6caf787fa713b5fdb0
Status: Downloaded newer image for rocker/rstudio@sha256:4b5ad6f7ada2d41bdbb307109f96a3bbf598554dce1a8d6caf787fa713b5fdb0
root@bb36f35cf807:/# apt update -y && apt install -y python3-pip && python3 -m pip install scikit-learn
Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1,275 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:9 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [1,497 kB]
Get:10 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [892 kB]
Get:11 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [2,127 kB]
Get:12 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [27.5 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [1,404 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [30.3 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [2,415 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1,161 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [54.2 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [27.1 kB]
Fetched 23.1 MB in 5s (4,388 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
4 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
build-essential dirmngr dpkg-dev fakeroot gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server
gpgconf gpgsm libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl libassuan0 libdpkg-perl
libexpat1-dev libfakeroot libfile-fcntllock-perl libksba8 liblocale-gettext-perl libnpth0 libpython3-dev
libpython3.8 libpython3.8-dev patch pinentry-curses python-pip-whl python3-dev python3-distutils python3-lib2to3
python3-pkg-resources python3-setuptools python3-wheel python3.8-dev zlib1g-dev
Suggested packages:
dbus-user-session libpam-systemd pinentry-gnome3 tor debian-keyring parcimonie xloadimage scdaemon bzr ed
diffutils-doc pinentry-doc python-setuptools-doc
The following NEW packages will be installed:
build-essential dirmngr dpkg-dev fakeroot gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server
gpgconf gpgsm libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl libassuan0 libdpkg-perl
libexpat1-dev libfakeroot libfile-fcntllock-perl libksba8 liblocale-gettext-perl libnpth0 libpython3-dev
libpython3.8 libpython3.8-dev patch pinentry-curses python-pip-whl python3-dev python3-distutils python3-lib2to3
python3-pip python3-pkg-resources python3-setuptools python3-wheel python3.8-dev zlib1g-dev
0 upgraded, 39 newly installed, 0 to remove and 4 not upgraded.
Need to get 12.9 MB of archives.
After this operation, 48.1 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal/main amd64 liblocale-gettext-perl amd64 1.07-4 [17.1 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal/main amd64 python3-pkg-resources all 45.2.0-1 [130 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libdpkg-perl all 1.19.7ubuntu3.2 [231 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal/main amd64 patch amd64 2.7.6-6 [105 kB]
Get:5 http://security.ubuntu.com/ubuntu focal-security/main amd64 gpgconf amd64 2.2.19-3ubuntu2.2 [124 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 dpkg-dev all 1.19.7ubuntu3.2 [679 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 build-essential amd64 12.8ubuntu1.1 [4,664 B]
Get:8 http://archive.ubuntu.com/ubuntu focal/main amd64 libassuan0 amd64 2.5.3-7ubuntu2 [35.7 kB]
Get:9 http://archive.ubuntu.com/ubuntu focal/main amd64 libksba8 amd64 1.3.5-2 [92.6 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal/main amd64 libnpth0 amd64 1.6-1 [7,736 B]
Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 libfakeroot amd64 1.24-1 [25.7 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal/main amd64 fakeroot amd64 1.24-1 [62.6 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal/main amd64 pinentry-curses amd64 1.1.0-3build1 [36.3 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal/main amd64 libalgorithm-diff-perl all 1.19.03-2 [46.6 kB]
Setting up libdpkg-perl (1.19.7ubuntu3.2) ...
Setting up zlib1g-dev:amd64 (1:1.2.11.dfsg-2ubuntu1.3) ...
Setting up gpgconf (2.2.19-3ubuntu2.2) ...
Setting up python-pip-whl (20.0.2-5ubuntu1.6) ...
Setting up python3-lib2to3 (3.8.10-0ubuntu1~20.04) ...
Setting up libalgorithm-diff-xs-perl (0.04-6) ...
Setting up liblocale-gettext-perl (1.07-4) ...
Setting up gpg (2.2.19-3ubuntu2.2) ...
Setting up libalgorithm-merge-perl (0.08-3) ...
Setting up gnupg-utils (2.2.19-3ubuntu2.2) ...
Setting up python3-distutils (3.8.10-0ubuntu1~20.04) ...
Setting up pinentry-curses (1.1.0-3build1) ...
Setting up gpg-agent (2.2.19-3ubuntu2.2) ...
Setting up python3-setuptools (45.2.0-1) ...
Setting up gpgsm (2.2.19-3ubuntu2.2) ...
Setting up dpkg-dev (1.19.7ubuntu3.2) ...
Setting up dirmngr (2.2.19-3ubuntu2.2) ...
Setting up libpython3.8-dev:amd64 (3.8.10-0ubuntu1~20.04.5) ...
Setting up python3-pip (20.0.2-5ubuntu1.6) ...
Setting up python3.8-dev (3.8.10-0ubuntu1~20.04.5) ...
Setting up gpg-wks-server (2.2.19-3ubuntu2.2) ...
Setting up build-essential (12.8ubuntu1.1) ...
Setting up libpython3-dev:amd64 (3.8.2-0ubuntu2) ...
Setting up gpg-wks-client (2.2.19-3ubuntu2.2) ...
Setting up python3-dev (3.8.2-0ubuntu2) ...
Setting up gnupg (2.2.19-3ubuntu2.2) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
Collecting scikit-learn
Downloading scikit_learn-1.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.2 MB)
|████████████████████████████████| 31.2 MB 27.5 MB/s
Collecting scipy>=1.3.2
Downloading scipy-1.9.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (43.4 MB)
|████████████████████████████████| 43.4 MB 33.8 MB/s
Collecting numpy>=1.17.3
Downloading numpy-1.23.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
|████████████████████████████████| 17.1 MB 12.8 MB/s
Collecting threadpoolctl>=2.0.0
Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Collecting joblib>=1.0.0
Downloading joblib-1.1.0-py2.py3-none-any.whl (306 kB)
|████████████████████████████████| 306 kB 17.5 MB/s
Installing collected packages: numpy, scipy, threadpoolctl, joblib, scikit-learn
Successfully installed joblib-1.1.0 numpy-1.23.2 scikit-learn-1.1.2 scipy-1.9.1 threadpoolctl-3.1.0
root@bb36f35cf807:/# R
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> install.packages("reticulate")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
also installing the dependencies ‘rprojroot’, ‘Rcpp’, ‘RcppTOML’, ‘here’, ‘jsonlite’, ‘png’, ‘rappdirs’, ‘withr’
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/rprojroot_2.0.3.tar.gz'
Content type 'binary/octet-stream' length 100991 bytes (98 KB)
==================================================
downloaded 98 KB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/Rcpp_1.0.9.tar.gz'
Content type 'binary/octet-stream' length 4233796 bytes (4.0 MB)
==================================================
downloaded 4.0 MB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/RcppTOML_0.1.7.tar.gz'
Content type 'binary/octet-stream' length 1872918 bytes (1.8 MB)
==================================================
downloaded 1.8 MB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/here_1.0.1.tar.gz'
Content type 'binary/octet-stream' length 52508 bytes (51 KB)
==================================================
downloaded 51 KB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/jsonlite_1.8.0.tar.gz'
Content type 'binary/octet-stream' length 1158740 bytes (1.1 MB)
==================================================
downloaded 1.1 MB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/png_0.1-7.tar.gz'
Content type 'binary/octet-stream' length 58876 bytes (57 KB)
==================================================
downloaded 57 KB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/rappdirs_0.3.3.tar.gz'
Content type 'binary/octet-stream' length 47588 bytes (46 KB)
==================================================
downloaded 46 KB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/withr_2.5.0.tar.gz'
Content type 'binary/octet-stream' length 225454 bytes (220 KB)
==================================================
downloaded 220 KB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/reticulate_1.26.tar.gz'
Content type 'binary/octet-stream' length 3150177 bytes (3.0 MB)
==================================================
downloaded 3.0 MB
* installing *binary* package ‘rprojroot’ ...
* DONE (rprojroot)
* installing *binary* package ‘Rcpp’ ...
* DONE (Rcpp)
* installing *binary* package ‘jsonlite’ ...
* DONE (jsonlite)
* installing *binary* package ‘png’ ...
* DONE (png)
* installing *binary* package ‘rappdirs’ ...
* DONE (rappdirs)
* installing *binary* package ‘withr’ ...
* DONE (withr)
* installing *binary* package ‘RcppTOML’ ...
* DONE (RcppTOML)
* installing *binary* package ‘here’ ...
* DONE (here)
* installing *binary* package ‘reticulate’ ...
* DONE (reticulate)
The downloaded source packages are in
‘/tmp/RtmpRVNN8d/downloaded_packages’
> reticulate::repl_python()
No non-system installation of Python could be found.
Would you like to download and install Miniconda?
Miniconda is an open source environment management system for Python.
See https://docs.conda.io/en/latest/miniconda.html for more details.
Would you like to install Miniconda? [Y/n]: n
Installation aborted.
Python 3.8.10 (/usr/bin/python3)
Reticulate 1.26 REPL -- A Python interpreter in R.
Enter 'exit' or 'quit' to exit the REPL and return to R.
>>> import sklearn.impute
>>> X = [[0, 1, 3], [3, 4, 5]]
>>> gen = sklearn.metrics.pairwise_distances_chunked(X)
>>> next(gen)
*** caught segfault ***
address 0x7f7c49c0d000, cause 'memory not mapped'
Traceback:
1: py_call_impl(callable, dots$args, dots$keywords)
2: builtins$eval(compiled, globals, locals)
3: py_compile_eval(code, capture = FALSE)
4: doTryCatch(return(expr), name, parentenv, handler)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
6: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
7: doTryCatch(return(expr), name, parentenv, handler)
8: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, handlers[[nh]])
9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(py_compile_eval(code, capture = FALSE), error = handle_error, interrupt = handle_interrupt)
11: repl()
12: doTryCatch(return(expr), name, parentenv, handler)
13: tryCatchOne(expr, names, parentenv, handlers[[1L]])
14: tryCatchList(expr, classes, parentenv, handlers)
15: tryCatch(repl(), interrupt = identity)
16: reticulate::repl_python()
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
The method described here (switch to libblas) will solve the problem. https://rocker-project.org/images/versioned/r-ver.html#switching-blas-implementations
ARCH=$(uname -m)
update-alternatives --set "libblas.so.3-${ARCH}-linux-gnu" "/usr/lib/${ARCH}-linux-gnu/blas/libblas.so.3"
update-alternatives --set "liblapack.so.3-${ARCH}-linux-gnu" "/usr/lib/${ARCH}-linux-gnu/lapack/liblapack.so.3"
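To check that the switch took effect, one option is to resolve the libblas.so.3 symlink and look at the directory it ends up in. The following is a minimal sketch, not part of the rocker scripts: the helper names blas_provider and current_blas are hypothetical, and it assumes the Debian/Ubuntu layout in which the reference BLAS lives under a blas/ directory and OpenBLAS under a directory whose name starts with openblas.

```python
import os
import platform


def blas_provider(resolved_path: str) -> str:
    """Classify a resolved libblas/liblapack path by the directory it lives in.

    Assumption: OpenBLAS variants sit in a directory starting with "openblas",
    the reference implementations in "blas" or "lapack".
    """
    parts = resolved_path.split("/")
    if any(p.startswith("openblas") for p in parts):
        return "openblas"
    if "blas" in parts or "lapack" in parts:
        return "reference"
    return "unknown"


def current_blas(arch: str = "") -> tuple:
    """Resolve the libblas.so.3 symlink chain and return (path, provider)."""
    arch = arch or platform.machine()  # same value as `uname -m`
    link = f"/usr/lib/{arch}-linux-gnu/libblas.so.3"
    resolved = os.path.realpath(link)  # follows the alternatives symlinks
    return resolved, blas_provider(resolved)


if __name__ == "__main__":
    path, provider = current_blas()
    print(f"{path} -> {provider}")
```

After running the two update-alternatives commands above, the provider would be expected to read "reference" on an image laid out this way; "unknown" most likely means the path assumption does not hold on that system.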
duplicate of #471
2. When will the changes in devel become available in latest?
devel and latest indicate the version of R.
https://rocker-project.org/images/versioned/r-ver.html#devel
https://github.com/rocker-org/rocker-versioned2#spacial-tags-for-daily-builds
Sorry for the duplicate issue and thanks a lot for the clarifications and the solution that fixes my issue! :pray:
Now that you mentioned it, it makes perfect sense that devel and latest follow the version of R :sweat_smile:
Yes, there are differences in the base image (jammy vs. focal) and in whether R packages are installed from source or as binaries, but the images should be basically the same except for the R version.
I do not know why the devel image avoids this bug. (I tried it, and it certainly didn't trigger the bug.)
@eitsupi the rocker/ml images also have openblas turned off now, so maybe that's an option for @cvanderaa.
Curious that devel doesn't reproduce this though, since it is also using openblas....
Thanks for the advice, but @eitsupi's solution is what I needed since I want to use bioconductor/bioconductor_docker, which depends on rocker/rstudio.
While my problem is solved, something is still bugging me. I noticed that the python code (cf. my first message) works fine when run in python directly, rather than through R/reticulate, in the same Docker container! Could there be an improved integration between R and python in the devel version?
I noticed that the python code (cf my first message) works fine in python directly instead of R/reticulate in the same Docker container
This is expected with openblas; it follows from how reticulate works. You are running python binaries that were not built against openblas, but calling them through reticulate, which follows R's linking to openblas. I understand that this problem is avoidable if, for instance, python developers are careful enough about how they link BLAS libraries, or if openblas is built in a way that avoids exporting those symbols -- see https://github.com/numpy/numpy/issues/21643. Sorry I don't have a firmer grip on the issue -- it's not entirely clear whether it can or ought to be addressed by rocker, by reticulate, by numpy, or by openblas... It does sound like there have been changes made or planned by each that may be working their way down the pipelines though...
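One way to see the situation described above is to list which BLAS/LAPACK shared libraries are mapped into the running process, for example from inside reticulate::repl_python() after importing numpy. The following is a rough, Linux-only sketch using only the standard library; the function names are hypothetical, and the file-name pattern is an assumption about how these libraries are usually named (system libblas/liblapack/libopenblas, R's libRblas/libRlapack, and the OpenBLAS copy vendored inside pip wheels).

```python
import re

# Assumed naming: libblas.so.3, liblapack.so.3, libopenblasp-r0.x.y.so,
# libRblas.so, or a vendored libopenblas...so inside a wheel's .libs directory.
_BLAS_NAME = re.compile(r"lib(open|R)?(blas|lapack)[^/]*\.so[^/]*$")


def blas_libraries(maps_text: str) -> list:
    """Return the sorted, de-duplicated BLAS/LAPACK paths in a /proc maps dump."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if _BLAS_NAME.search(path):
            found.add(path)
    return sorted(found)


def loaded_blas_libraries() -> list:
    """Inspect the current process (requires /proc, i.e. Linux)."""
    with open("/proc/self/maps") as fh:
        return blas_libraries(fh.read())
```

If loaded_blas_libraries() reports both a system OpenBLAS and a second copy under a Python package directory, two implementations share one process, which is the kind of setup the linked numpy issue discusses. Seeing two entries does not by itself prove a symbol clash.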
Insight from others on these details and suggestions about what we can do to remedy this are welcome! For the moment, I really recommend using rocker/ml images for python work, which now simply opt out of openblas by default.
Maybe I missed it, but did we discuss the possibility of these reticulate-based setups using an R build that, for once, does not use external BLAS/LAPACK? Maybe one could build custom R binaries in r-ver and configure with --without-lapack --without-blas to avoid this?
From configure --help:
[....]
Optional Packages:
--with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
--without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
--with-blas use system BLAS library (if available), or specify
it [no]
--with-lapack use system LAPACK library (if available), or specify
it [no]
[...]
@eddelbuettel thanks, but I'm not quite sure I follow. You've always pointed out that it's not necessary to rebuild R in order to configure BLAS; if we build R with a judicious choice of flags from the start, then users can opt in or opt out of BLAS later without rebuilding R. That's what we've tried to do in r-ver, with the additional step of opting in to openblas to give multi-threaded behavior out of the box. In the install_python.sh scripts, we now opt back out of BLAS (this should be true for any image that adds python via those scripts): https://github.com/rocker-org/rocker-versioned2/pull/494/files#diff-99234a689e91b64348af0cc6dfed1e34c9731fcbc08a7ddbd70febe05d45c496#L45-L50
Is this not how you'd approach it? If we build R with --without-blas, can users still opt into openblas (etc.) with update-alternatives? Would this mean not having a multi-threaded BLAS by default?
I know it's super confusing as I have been on the (very public) record of 'always' advocating --with-blas --with-lapack so that one creates the external interface that allows hot-switching, which I find highly valuable.
If you look closely, you will find almost as many statements by BDR advocating the opposite (and default !!) for the internal blas and lapack :) I use that rarely, but for example in my local r-devel build here:
> sessionInfo()
R Under development (unstable) (2022-08-15 r82718)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/local/lib/R-devel/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R-devel/lib/R/lib/libRlapack.so
These 'separate' libs are not visible to, and not callable by, others, which prevents "others" like Python from co-using them.
So you got to the core of it quickly: doing so may save reticulate use, but at the cost of multi-threaded work from a single R process. OTOH you avoid the 'cross-product' of threads issue when you spawn multiple R processes as they will only use their faithful internal BLAS. As always, choices, and tradeoffs. But it may avoid this one gnarly error.
ok cool, I'm right there with you. But I think the upshot here is that if R users want to use python and they install it with our /rocker_scripts/install_python.sh method (installing system python from ubuntu repos), or simply use an image that has already run this script, they will get BLAS hot-swapped for them as well and things should work as expected.
R users that add python manually on top of rocker/rstudio will face opaque issues with BLAS (though I gather future versions of reticulate should warn about the blas conflict instead of segfaulting).
Since historically we have provided threaded-blas 'out-of-the-box' on the r-ver side, I think we should keep doing that, and remind our python users to consider rocker_scripts/install_python.sh or using an already python-enabled image like rocker/binder or rocker/ml. We could no doubt document this all better somewhere.
Hello,
I want to report a segfault issue using a combination of rocker/rstudio:latest (image id=983e55d98708) + reticulate + sklearn... Thanks to the help of the Bioconductor team, we noticed that rocker/rstudio:devel (image id=7fd0be55f1ff) works properly. Hence I have two questions:
1. … devel? I just want to bring this bug to light for later builds.
2. When will the changes in devel become available in latest?
Thank you in advance for your insights
Reproducible example
My local setup
reticulate:
This code is useless, but it is a minimal example that reproduces the error I encounter with sklearn.impute.KNNImputer. The last command leads to the segfault error: