nalepae / pandarallel

A simple and efficient tool to parallelize Pandas operations on all available CPUs
https://nalepae.github.io/pandarallel
BSD 3-Clause "New" or "Revised" License

redirecting stdout not working with progress_bar=True #227

Closed: luggie closed this issue 1 year ago

luggie commented 1 year ago

I need to redirect the progress-bar output so that I can process it further, like so:

import contextlib
import io

class PandarallelLogger(io.StringIO):
    def write(self, msg):
        print(msg)
        self.flush()

    def flush(self):
        pass

# df is an existing DataFrame
with contextlib.redirect_stdout(PandarallelLogger()):
    df.parallel_apply(lambda x: x**2)

which results in a RecursionError:

Traceback (most recent call last):
  File ".../pandarallel_test.py", line 61, in apply
     df.parallel_apply(lambda x: x**2)
  File ".../pandarallel/core.py", line 242, in closure
    progress_bars = get_progress_bars(progresses_length, show_progress_bars)
  File ".../pandarallel/progress_bars.py", line 181, in get_progress_bars
    else ProgressBarsConsole(maxs, show)
  File ".../pandarallel/progress_bars.py", line 66, in __init__
    sys.stdout.write("\n".join(self.__lines))
  File ".../parallel/pandarallel_test.py", line 41, in write
    print(msg)
  File "...l/pandarallel_test.py", line 41, in write
    print(msg)
  File ".../parallel/pandarallel_test.py", line 41, in write
    print(msg)
  [Previous line repeated 1484 more times]
RecursionError: maximum recursion depth exceeded while calling a Python object

All I really need is for __bars to be accessible from outside somehow. Does anyone have an idea?

luggie commented 1 year ago

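Never mind, it had nothing to do with pandarallel. I created an infinite loop myself: while stdout is redirected to PandarallelLogger, its write() calls print(), which writes to the redirected stdout, which calls write() again, and so on until the recursion limit is hit. Genius me :D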
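
For anyone landing here with the same traceback, one way to avoid the loop is to never call print() inside write(). Instead, store the message and, if it should still be shown, forward it to the stream that was active before the redirect. This is a minimal sketch, not pandarallel code: CapturingLogger, passthrough and messages are hypothetical names, and a plain print() stands in for the progress-bar output. It has not been tested against pandarallel itself.

```python
import contextlib
import io
import sys


class CapturingLogger(io.StringIO):
    """Hypothetical stand-in for PandarallelLogger that does not recurse."""

    def __init__(self, passthrough=None):
        super().__init__()
        # Stream that was active *before* the redirect; None disables echoing.
        self.passthrough = passthrough
        self.messages = []

    def write(self, msg):
        # Keep the text for later processing instead of calling print(),
        # which would write to the redirected stdout and re-enter write().
        self.messages.append(msg)
        if self.passthrough is not None:
            self.passthrough.write(msg)
        return len(msg)


# sys.stdout is evaluated here, before the redirect, so it is the real stream.
logger = CapturingLogger(passthrough=sys.stdout)
with contextlib.redirect_stdout(logger):
    print("50.00% | 5 / 10")  # stand-in for progress-bar output

captured = "".join(logger.messages)
```

After the with block, captured holds everything that was written to stdout inside it, and the text has also been echoed once to the original stream.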