rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

plotting broken in matplotlib 3.4.3: LinAlgError: Singular matrix #1078

Closed haraldschilly closed 2 years ago

haraldschilly commented 2 years ago

I'm rendering an Rmd file to HTML with rmarkdown::render. Strangely, this breaks only with the newest matplotlib release (3.4.3); it works fine with matplotlib 3.4.2.

PS: Plotting with matplotlib directly in Python works fine, so I suspect an API change or behavior tweak in matplotlib that affects the details of how reticulate and matplotlib interact.

content:

library(reticulate)
use_python("/usr/bin/python3")
import numpy as np
import matplotlib.pyplot as plt
xx = np.linspace(0, 10, 1000)
yy = np.sin(2.3 * np.exp(-xx))
plt.plot(xx, yy)
plt.show()

the error:

processing file: python3.Rmd
Quitting from lines 34-40 (python3.Rmd) 
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  RuntimeError: Evaluation error: LinAlgError: Singular matrix

Detailed traceback:
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/pyplot.py", line 966, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/figure.py", line 3015, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/backend_bases.py", line 2255, in print_figure
    result = print_method(
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/backend_bases.py", line 1669, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/backends/backend_agg.py", line 508, in print_png
    FigureCanvasAgg.draw(self)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/backends/backend_agg.py", line 406, in draw
    self.figure.draw(self.renderer)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/artist.py", line 74, in draw_wrapper
    res
Calls: <Anonymous> ... py_capture_output -> force -> <Anonymous> -> py_call_impl
Execution halted
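
The traceback above goes through `savefig` on the Agg backend rather than through an interactive window, so a plain-Python comparison may be more telling if it follows the same route. Below is a minimal sketch, assuming that writing a PNG via `fig.savefig` on the Agg backend is close enough to what reticulate's knitr engine does; the helper name `render_png_bytes` is made up for illustration and is not part of reticulate or matplotlib.

```python
import io


def render_png_bytes():
    """Draw the plot from the report and return it as PNG bytes."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, the one named in the traceback
    import matplotlib.pyplot as plt
    import numpy as np

    xx = np.linspace(0, 10, 1000)
    yy = np.sin(2.3 * np.exp(-xx))

    fig, ax = plt.subplots()
    ax.plot(xx, yy)

    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # same call that fails in the traceback
    plt.close(fig)
    return buf.getvalue()


if __name__ == "__main__":
    data = render_png_bytes()
    print(len(data), "bytes; PNG signature ok:", data[:8] == b"\x89PNG\r\n\x1a\n")
```

If this succeeds with the same interpreter that reticulate uses but the Rmd still fails, that would point towards something specific to the embedded session (for example, different shared libraries loaded by R) rather than matplotlib alone.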

session info

## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] reticulate_1.22
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.7      here_1.0.1      lattice_0.20-45 png_0.1-7      
##  [5] rprojroot_2.0.2 digest_0.6.28   rappdirs_0.3.3  grid_4.1.1     
##  [9] R6_2.5.1        jsonlite_1.7.2  magrittr_2.0.1  evaluate_0.14  
## [13] rlang_0.4.12    stringi_1.7.5   jquerylib_0.1.4 Matrix_1.3-4   
## [17] bslib_0.3.1     rmarkdown_2.11  tools_4.1.1     stringr_1.4.0  
## [21] xfun_0.27       yaml_2.2.1      fastmap_1.1.0   compiler_4.1.1 
## [25] htmltools_0.5.2 knitr_1.36      sass_0.4.0

reticulate::py_config()

## python:         /usr/bin/python3
## libpython:      /usr/lib/python3.8/config-3.8-x86_64-linux-gnu/libpython3.8.so
## pythonhome:     //usr://usr
## version:        3.8.10 (default, Sep 28 2021, 16:10:42)  [GCC 9.3.0]
## numpy:          /usr/local/lib/python3.8/dist-packages/numpy
## numpy_version:  1.19.5
## 
## python versions found: 
##  /usr/bin/python3
##  /usr/bin/python

kevinushey commented 2 years ago

Could this be a matplotlib bug, as opposed to a reticulate one?

haraldschilly commented 2 years ago

> Could this be a matplotlib bug, as opposed to a reticulate one?

Well, I don't know. Those few lines of Python code work fine in a Jupyter notebook. Hence I lean towards assuming that a change in matplotlib triggers some edge case or incompatibility in reticulate.

kevinushey commented 2 years ago

I'm not able to reproduce -- the following document knits fine for me.

---
title: "Untitled"
output: html_document
---

```{r setup, include=FALSE}
library(reticulate)
use_condaenv("r-reticulate", required = TRUE)
py_install("matplotlib")
```

```{python}
import numpy as np
import matplotlib.pyplot as plt
xx = np.linspace(0, 10, 1000)
yy = np.sin(2.3 * np.exp(-xx))
plt.plot(xx, yy)
plt.show()
```


This is with matplotlib 3.5.1. If you can still reproduce with the latest version of matplotlib, please feel free to re-open with more details.

omerfyalcin commented 2 years ago

I'm having the exact same problem (I think) as @haraldschilly and I'd appreciate it if there's a known solution. Here's my code (the Rmd options inside curly brackets do have the three preceding backticks but I excluded them here to render properly):

---
title: "Untitled"
output: html_document
date: '2022-06-01'
---
{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(reticulate)
use_python("/usr/bin/python3")
{python}
import matplotlib.pyplot as plt
x = [i for i in range(10)]
y = [ 2*i for i in range(10)]

plt.plot(x, y)
plt.scatter(x, y)
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.show()

Here's the resulting error:

[image: python_rmd]

I updated (or made sure they are up to date):

with no luck. The code works fine in my local python environment when I call it from command line instead of through reticulate. It also worked fine in Google Colab.

haraldschilly commented 2 years ago

Hi @omerfyalcin, I have no workaround and I still see this problem.

t-kalinowski commented 2 years ago

Hi, I just tried locally but can't reproduce, neither on Mac with Python 3.10, nor Linux with Python 3.8. (In both I tried with R 4.2.0). Can you confirm you're on the latest numpy, matplotlib, and reticulate releases?
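
One way to answer the version question from inside the same interpreter that reticulate picks up is to run a few lines in a Python chunk. This is only a sketch using the standard-library `importlib.metadata` (available from Python 3.8); the helper name `report_versions` is illustrative, and the reticulate version itself has to be checked on the R side.

```python
import sys
from importlib import metadata


def report_versions(packages=("numpy", "matplotlib")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running the same lines in a terminal Python and in a knitted chunk should show whether both are really looking at the same installation.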

omerfyalcin commented 2 years ago

Thanks for letting me know, @haraldschilly.

I am on latest matplotlib (3.5.2) and reticulate (1.25). I realized I was on numpy 1.20, but have now updated to 1.22. I am on Linux with Python 3.8 and R 4.2.0.

After the numpy update, I am getting a different error. Something like this:

[7535:7535:20220601,162124.925655:ERROR process_memory_range.cc:86] read out of range
[7535:7535:20220601,162124.925778:ERROR elf_image_reader.cc:558] missing nul-terminator
[7535:7535:20220601,162124.925911:ERROR elf_dynamic_array_reader.h:61] tag not found
[7535:7535:20220601,162124.928216:ERROR elf_dynamic_array_reader.h:61] tag not found
[7535:7535:20220601,162124.928257:ERROR elf_dynamic_array_reader.h:61] tag not found
[7535:7535:20220601,162124.928296:ERROR elf_dynamic_array_reader.h:61] tag not found
....
....
....
....
[7535:7536:20220601,162124.957986:ERROR directory_reader_posix.cc:42] opendir: No such file or directory (2)

Also, when I change the code chunk engine from "python" to "python3", it does start working, but the matplotlib plot pops up in a new window instead of appearing in the document. (I know this is what happens at a regular Python prompt, but I read that RStudio was updated at some point to make plots appear within the document.)

My rationale for using {python} as opposed to {python3} was that I thought specifying:

use_python("/usr/bin/python3")

would make RStudio and knitr understand what "python" refers to.

Thanks.

kevinushey commented 2 years ago

That error output implies that the R process is crashing / segfaulting; most likely when trying to load or use numpy.

See https://github.com/rstudio/reticulate/issues/922#issuecomment-1143854180 for a potential remedy.

omerfyalcin commented 2 years ago

Thanks, @kevinushey. Following your suggestion and the example provided in that comment, I installed numpy and matplotlib from source, then switched to that virtual environment:

use_virtualenv(virtualenv = '/home/omer/.virtualenvs/r-reticulate/')

The issue persists. When I activate the same environment with "source" in a shell and run Python from the command line, instead of going through RStudio, I get no memory error.

kevinushey commented 2 years ago

Then sorry, I am not sure -- we would need a gdb backtrace to know better what's going on. (The error implies that the R session is crashing; we'd need to see a stack trace at the time the crash is handled.)
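
As a lighter-weight complement to a gdb backtrace, Python's standard-library `faulthandler` can dump the Python-level traceback when the process receives a fatal signal. The sketch below assumes it is run at the top of the first Python chunk; `enable_crash_log` and the log file name are made up for illustration. It has not been verified inside an R session, where R installs its own segfault handler, so it may or may not produce output there, and it does not replace the native stack trace requested above.

```python
import faulthandler


def enable_crash_log(path="python-crash.log"):
    """Ask Python to write its own traceback to `path` on a fatal signal."""
    # The file must stay open for as long as the handler is enabled,
    # so the handle is returned to the caller instead of being closed here.
    log = open(path, "w")
    faulthandler.enable(file=log, all_threads=True)
    return log


if __name__ == "__main__":
    handle = enable_crash_log()
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Writing to an explicit file avoids relying on `sys.stderr`, which may be redirected while knitr captures chunk output.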

omerfyalcin commented 2 years ago

Sure, thanks. Let me share the complete example. The "untitled.Rmd" file has these contents:

---
title: "Untitled"
output:
  html_document: default
  pdf_document: default
date: '2022-06-01'
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(reticulate)
use_python('/usr/bin/python3')
```

```{python}
import matplotlib.pyplot as plt
x = [i for i in range(10)]
y = [2*i for i in range(10)]

plt.plot(x, y)
plt.scatter(x, y)
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.show()
```

In an R session in the terminal, here is the sessionInfo() output and the backtrace I get when I call rmarkdown::render on "untitled.Rmd":

R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library(rmarkdown)
sessionInfo()

R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rmarkdown_2.14

loaded via a namespace (and not attached):
 [1] compiler_4.2.0  fastmap_1.1.0   cli_3.3.0       tools_4.2.0
 [5] htmltools_0.5.2 knitr_1.39      xfun_0.31       digest_0.6.29
 [9] rlang_1.0.2     evaluate_0.15

rmarkdown::render('untitled.Rmd', output_format = 'html_document')

processing file: untitled.Rmd
|.............. | 20%
ordinary text without R code

|............................ | 40%
label: setup (with options)
List of 1
 $ include: logi FALSE

|.......................................... | 60%
ordinary text without R code

|........................................................ | 80%
label: unnamed-chunk-1 (with options)
List of 1
 $ engine: chr "python"

caught segfault
address 0x7f5f1e53e100, cause 'memory not mapped'

Traceback:
 1: py_call_impl(callable, dots$args, dots$keywords)
 2: builtins$eval(compiled, globals, locals)
 3: force(expr)
 4: py_capture_output(builtins$eval(compiled, globals, locals))
 5: py_compile_eval(snippet, compile_mode)
 6: reticulate::eng_python(options)
 7: engine(options)
 8: in_dir(input_dir(), expr)
 9: in_input_dir(engine(options))
10: block_exec(params)
11: call_block(x)
12: process_group.block(group)
13: process_group(group)
14: withCallingHandlers(if (tangle) process_tangle(group) else process_group(group), error = function(e) { setwd(wd) cat(res, sep = "\n", file = output %n% "") message("Quitting from lines ", paste(current_lines(i), collapse = "-"), " (", knit_concord$get("infile"), ") ") })
15: process_file(text, output)
16: knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
17: rmarkdown::render("untitled.Rmd", output_format = "html_document")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

t-kalinowski commented 2 years ago

My suspicion is that this is related to the numpy + R blas incompatibility issue. (https://github.com/rstudio/reticulate/issues/1190, and a few other threads).

If you build numpy from source, do you still get the segfault?

reticulate::virtualenv_create("r-reticulate", "/usr/bin/python3")
reticulate::py_install(
  "numpy",
  envname = "r-reticulate",
  pip = TRUE,
  pip_options = c("--force-reinstall", "--no-binary numpy")
)
reticulate::use_virtualenv("r-reticulate")
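
To check whether two different BLAS builds really end up in the same process, one rough approach on Linux is to list the BLAS-like shared objects mapped into the running process after numpy has been imported. The sketch below is an illustration under assumptions: it reads `/proc/self/maps` (Linux only), matches file names with a simple pattern that may miss or over-match libraries, and the helper name `blas_libraries` is hypothetical.

```python
import re

# File names that look like BLAS/LAPACK implementations (a heuristic, not exhaustive).
BLAS_PATTERN = re.compile(r"blas|lapack|mkl", re.IGNORECASE)


def blas_libraries(maps_text):
    """Return the distinct BLAS/LAPACK-looking shared objects named in maps_text."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        filename = path.rsplit("/", 1)[-1]
        if ".so" in filename and BLAS_PATTERN.search(filename):
            found.add(path)
    return sorted(found)


if __name__ == "__main__":
    # Import numpy first (if available) so that its BLAS is loaded before inspecting.
    try:
        import numpy  # noqa: F401
    except ImportError:
        pass
    with open("/proc/self/maps") as fh:
        for path in blas_libraries(fh.read()):
            print(path)
```

Run from a Python chunk after `import numpy`, seeing both the system `libblas.so.3` reported by sessionInfo() and a separate OpenBLAS copy bundled with a numpy wheel would be consistent with the incompatibility described above, though it would not prove it.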

omerfyalcin commented 2 years ago

@t-kalinowski, thank you very much, it does work now after doing what you suggested using that code! @kevinushey suggested the same but I must have done something wrong then. Many thanks to both of you!

avishai987 commented 1 year ago

The same error happened to me. Somehow, duplicating the conda environment with `conda create --name cloned_env --clone original_env` solved the issue.