rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

Reticulate does not play nice with Rich progressbars #1632

Open jucor opened 2 weeks ago

jucor commented 2 weeks ago

Dear Reticulate team

Problem description

How could we get reticulate to play nicely with the progress bars from Rich, please? Right now, Rich progress bars render incorrectly when called in an R notebook, as detailed in the sections below.

For ease of finding the source: reticulate remaps the Python output streams here: https://github.com/rstudio/reticulate/blob/740169ae2a7c943115b2a4eca3b5a0b7d9e5ef3a/inst/python/rpytools/output.py#L130 and that code is called from here: https://github.com/rstudio/reticulate/blob/740169ae2a7c943115b2a4eca3b5a0b7d9e5ef3a/R/output.R#L1
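The linked files are the authoritative source. Purely as an illustration of what "remapping the output streams" means in general, here is a minimal sketch in which `sys.stdout` is swapped for a file-like object that forwards every write to a callback. `CallbackStream` and `chunks` are hypothetical names for this sketch and are not taken from reticulate's code. The point it tries to make is that ANSI escape sequences travel through such a stream as ordinary text, so it is up to the receiving console to interpret them.

```python
import io
import sys
from contextlib import redirect_stdout


class CallbackStream(io.TextIOBase):
    """Hypothetical file-like object that forwards each write to a callback."""

    def __init__(self, callback, tty=False):
        self._callback = callback
        self._tty = tty

    def writable(self):
        return True

    def isatty(self):
        # Libraries such as Rich typically check this to decide how to render.
        return self._tty

    def write(self, text):
        self._callback(text)
        return len(text)


# Collect everything written to stdout while the remapping is active.
chunks = []
with redirect_stdout(CallbackStream(chunks.append)):
    print("hello")
    # An escape sequence is forwarded unchanged, as plain characters.
    sys.stdout.write("\x1b[1A")

captured = "".join(chunks)
```

After this runs, `captured` holds the printed text followed by the raw escape sequence; nothing in the stream itself moves a cursor.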

Setup for reproduction

The below steps are with R stack:

Python stack:

OS:

In my .Renviron:

RETICULATE_REMAP_OUTPUT_STREAMS=1

And the very first R cell in my R notebook to install the packages needed:

library(reticulate)
CENV="test_rich"

if(!condaenv_exists(CENV)) {
  conda_create(CENV, python_version="3.12")
  conda_install(CENV, c("rich"))
}
use_condaenv(CENV)

Single progress bar (mac only)

First problem, with a single progress bar: the beginning of each output line is cut off, missing the first 6 characters.

import time
from rich.progress import track

for i in track(range(20), description="Processing..."):
  time.sleep(.1)  # Simulate work being done

Note that the letters "Proces" are missing at the beginning of each line. Only "sing..." is showing.

Multiple progress bars (both mac and Windows)

Now let's try with multiple parallel bars, and observe that, instead of updating each line, a new line is shown at each update.

import time

from rich.progress import Progress

with Progress() as progress:

    task1 = progress.add_task("[red]Downloading...", total=1000)
    task2 = progress.add_task("[green]Processing...", total=1000)
    task3 = progress.add_task("[cyan]Cooking...", total=1000)

    while not progress.finished:
        progress.update(task1, advance=0.5)
        progress.update(task2, advance=0.3)
        progress.update(task3, advance=0.9)
        time.sleep(0.002)

The behaviour is exactly the same whether the Python code is in a Python chunk or is sourced from a Python file by executing

reticulate::source_python("test_rich.py")

where test_rich.py contains the Python code listed above.

Expected behaviour

The expected behaviour is the one obtained by running python test_rich.py in a terminal, which shows each progress bar on its own line and updates each bar in place.

Use case

This bug is quite problematic when using PyMC through reticulate. PyMC has relied on Rich progress bars since https://github.com/pymc-devs/pymc/pull/7233 . These bars track long-running computations (so they are genuinely needed) with a very large number of steps, so they produce massive output (either in the console or inline) and progressively slow RStudio down to a crawl.

Thanks a lot for any help!

jucor commented 1 week ago

I have found the root cause: the RStudio Console does not support ANSI control sequences for cursor motion. I have opened a feature request upstream at https://github.com/rstudio/rstudio/issues/14942 , as well as an associated bug report at https://github.com/rstudio/rstudio/issues/14941 .
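To make the dependency on cursor motion concrete, here is a minimal standard-library sketch of how a multi-line display can be redrawn in place. It is not Rich's implementation, and `render_frame` and `demo` are hypothetical names; it only assumes the standard ANSI sequences `ESC[nA` (cursor up n lines) and `ESC[2K` (erase the current line). In a terminal that honours these sequences, the three lines update in place. In a console that ignores cursor motion, each frame should instead pile up as new lines, which matches the behaviour reported above.

```python
import sys
import time

ESC = "\x1b"


def render_frame(lines, first):
    """Return the text needed to draw (or redraw) a block of lines in place."""
    # On every frame after the first, move the cursor back up to the top
    # of the block so the old lines can be overwritten.
    prefix = "" if first else f"{ESC}[{len(lines)}A"
    # For each line: go to column 0, erase the old content, write the new one.
    body = "".join(f"\r{ESC}[2K{line}\n" for line in lines)
    return prefix + body


def demo(steps=10, delay=0.02, out=sys.stdout):
    """Redraw three fake task counters in place."""
    for step in range(steps + 1):
        lines = [
            f"task {i}: {min(step * i, steps):3d}/{steps}" for i in (1, 2, 3)
        ]
        out.write(render_frame(lines, first=(step == 0)))
        out.flush()
        time.sleep(delay)


if __name__ == "__main__":
    demo()
```

Running this script once in a regular terminal and once through reticulate in the RStudio Console gives a small reproduction that does not involve Rich at all.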