ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Core] Enable/Disable Ray Worker Logging via Toggle #47712

Open Innixma opened 1 month ago

Innixma commented 1 month ago

Description

I'd like to be able to enable/disable Ray worker logging via a repeatable toggle without needing to call ray.shutdown() followed by ray.init(log_to_driver=...), which takes a non-trivial amount of time.

Slack Thread on this Topic

Code Example: Colab

Note that the code example above uses a private hack that disables Ray logging. However, the hack is irreversible: logging cannot be re-enabled until ray.shutdown() is called, which isn't ideal.

Private Hack:

import ray

ray.init(log_to_driver=True)
# ray will log to driver

ray._private.ray_logging.global_worker_stdstream_dispatcher.remove_handler("ray_print_logs")
# ray will not log to driver (irreversible)

I have implemented this private hack in AutoGluon to clean up our logging, but would prefer a toggle. Currently, if someone fits AutoGluon twice in a row, the 2nd fit call never logs through Ray. The only way to avoid this is to spend 7 seconds calling ray.shutdown().

Example Solution API:

ray.init(log_to_driver=True)
# ray will log to driver

ray.set_log_to_driver(False)
# ray will not log to driver

ray.set_log_to_driver(True)
# ray will log to driver
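
One way such a toggle could behave is sketched below. It uses a small stand-in dispatcher rather than Ray's real one, so StdstreamDispatcher, print_logs, and this set_log_to_driver are all hypothetical names for illustration; only the handler name "ray_print_logs" comes from the hack above. The idea is that removing a handler stays reversible as long as a reference to the handler is kept so it can be registered again.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class StdstreamDispatcher:
    """Stand-in log dispatcher (hypothetical; not Ray's actual class)."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}

    def add_handler(self, name: str, handler: Handler) -> None:
        self.handlers[name] = handler

    def remove_handler(self, name: str) -> None:
        # Removing a handler that is not registered is a no-op.
        self.handlers.pop(name, None)

    def emit(self, line: str) -> None:
        for handler in list(self.handlers.values()):
            handler(line)


# Collects what the "driver" would have printed, so the effect is observable.
driver_output: List[str] = []


def print_logs(line: str) -> None:
    driver_output.append(line)


dispatcher = StdstreamDispatcher()
dispatcher.add_handler("ray_print_logs", print_logs)


def set_log_to_driver(enabled: bool) -> None:
    """Hypothetical toggle: re-register or remove the print handler."""
    if enabled:
        dispatcher.add_handler("ray_print_logs", print_logs)
    else:
        dispatcher.remove_handler("ray_print_logs")


dispatcher.emit("phase 1")   # delivered
set_log_to_driver(False)
dispatcher.emit("phase 2")   # dropped
set_log_to_driver(True)
dispatcher.emit("phase 3")   # delivered again
```

Calling the toggle repeatedly with the same value is harmless in this sketch, because registering under an existing name overwrites the entry and removing a missing name does nothing.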

Use case

In AutoGluon, we have logic called DyStack which fits models in two phases. The first phase runs in a ray subprocess to avoid memory leakage. We want ray logging in this phase so the user sees the output of the training. For the second phase, we fit models outside a ray subprocess, but still use ray for parallelizing tasks. In the second phase we don't want ray to produce logs, because they clutter the output and make it harder to read.
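
For the two-phase flow, a scoped form of the toggle might be convenient. The sketch below is only an illustration under assumptions: driver_logging, worker_log, and the module-level flag are hypothetical stand-ins, not Ray or AutoGluon APIs. It shows a context manager that switches logging off for the second phase and restores the previous setting afterwards, even if the phase raises.

```python
from contextlib import contextmanager
from typing import Iterator, List

# Hypothetical stand-in for the driver-logging state; not a real Ray setting.
_log_to_driver = True

# Records the lines that would have reached the driver.
captured: List[str] = []


def worker_log(line: str) -> None:
    """Simulate a worker log line that reaches the driver only when enabled."""
    if _log_to_driver:
        captured.append(line)


@contextmanager
def driver_logging(enabled: bool) -> Iterator[None]:
    """Temporarily set the logging flag, restoring the old value on exit."""
    global _log_to_driver
    previous = _log_to_driver
    _log_to_driver = enabled
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        _log_to_driver = previous


worker_log("phase 1: training output")          # shown to the user
with driver_logging(False):
    worker_log("phase 2: parallel task output")  # suppressed
worker_log("after phase 2")                      # shown again
```

Restoring the previous value instead of hard-coding True means nested uses compose: an inner block cannot accidentally re-enable logging that an outer block turned off.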

For an example of AutoGluon logging (using the ray private hack to fix logging), refer to the AutoGluon tutorial documentation in the Maximizing predictive performance cell and click "Show code cell output".

nikhilbalwani commented 1 month ago

I would love to have this feature as well. Thank you

nikhilbalwani commented 1 month ago

Hi guys, any updates on this? Thank you

jjyao commented 1 month ago

@nikhilbalwani what's your use case? Why do you need it?

nikhilbalwani commented 1 month ago

I am working on a package where I need to decide at runtime whether or not to enable/disable logging to the worker. At the same time, I don't want to use _private ray APIs to achieve this.