Open vladjohnson opened 2 months ago
Thanks for reporting the issue; this behavior is definitely not expected.
In particular, setting RAY_DATA_DISABLE_PROGRESS_BARS=1 should disable the progress bars.
My first thought is that you should pass this env var into the Ray runtime env. For example, if you are using ray.init(), you can pass it in the env_vars field of runtime_env (see the docs). This ensures that all workers get the env var and disable progress bars properly.
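As a rough sketch of the runtime-env approach: the helper names build_runtime_env and init_ray_without_progress_bars below are hypothetical (they are not part of Ray), and I haven't verified that this resolves the issue on 2.35.0. The only Ray API used is ray.init(runtime_env=...) with its env_vars field.

```python
ENV_VAR = "RAY_DATA_DISABLE_PROGRESS_BARS"


def build_runtime_env(extra_env_vars=None):
    """Hypothetical helper: build a runtime_env dict that sets the
    progress-bar env var for every worker process."""
    env_vars = {ENV_VAR: "1"}  # env var values must be strings
    if extra_env_vars:
        env_vars.update(extra_env_vars)
    return {"env_vars": env_vars}


def init_ray_without_progress_bars():
    """Hypothetical helper: start Ray with the env var propagated to workers."""
    import ray  # imported lazily so build_runtime_env works without Ray installed

    return ray.init(runtime_env=build_runtime_env())
```

Calling init_ray_without_progress_bars() in place of a bare ray.init() at the top of the notebook should be enough to try this out.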
You can also explicitly set the variable in DataContext:
ctx = ray.data.DataContext.get_current()
ctx.enable_progress_bars = False
If the above doesn't work, I also have a few other temporary fixes to suggest:
(1)
ctx = ray.data.DataContext.get_current()
ctx.use_ray_tqdm = False
This disables the special tqdm implementation for distributed settings, which Ray Data uses to manage progress bars across multiple workers.
(2) Another temporary workaround that might work:
ctx.enable_operator_progress_bars = False
This disables operator-level progress bars, so only the top-level global progress bar is shown. Although this won't resolve the issue completely, it should at least reduce the output spam.
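If it's easier to try these settings together, here is a minimal sketch that applies all three to one context object. The function name disable_ray_data_progress_bars and its keep_global_bar parameter are hypothetical (my own, not part of Ray). The three attributes are the ones mentioned above.

```python
def disable_ray_data_progress_bars(ctx, keep_global_bar=False):
    """Hypothetical helper: apply the progress-bar settings discussed above.

    ctx is expected to be the object returned by
    ray.data.DataContext.get_current().
    """
    # Fall back from Ray's distributed tqdm implementation.
    ctx.use_ray_tqdm = False
    # Hide the per-operator progress bars.
    ctx.enable_operator_progress_bars = False
    # Optionally keep the single top-level bar; otherwise disable all bars.
    ctx.enable_progress_bars = keep_global_bar
    return ctx
```

You would call it once, before creating any datasets, as disable_ray_data_progress_bars(ray.data.DataContext.get_current()). Pass keep_global_bar=True if you still want the single global progress bar.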
Thank you so much, @scottjlee! Highly appreciated
What happened + What you expected to happen
Hey guys, looking for a way to fix this mess... tqdm is creating a bunch of progress bars, and the log output keeps growing my notebook to a massive size. I've tried setting RAY_DATA_DISABLE_PROGRESS_BARS=1, but that did not help. How do I either turn off the progress bars or, ideally, make them work as they are supposed to (one single progress bar)?
Thanks
Versions / Dependencies
Ray version: 2.35.0
Python version: 3.11.9
OS: Ubuntu 20.04
Reproduction script
Issue Severity
Medium: It is a significant difficulty but I can work around it.