Open justinvyu opened 5 months ago
Currently an object does not distinguish between a good result and an errored result. If we are going to do this, we need an extra marker on the Wait request reply, e.g. for ProcessWaitRequestMessage. But it looks like the Wait backend is still ObjectManagerService.Pull, which does not say whether it's a Wait or a Get. @rkooo567 do Wait calls make the raylet receive data?
Description
Goal: From a list of Ray remote task futures, I want to be able to check if each of these has errored without needing to call ray.get individually on each element.
This feature is offered by similar async execution APIs:
Current workaround
We have a "check for failure" function in Ray Train, which may incur some unnecessary overhead to fetch objects: https://github.com/ray-project/ray/blob/fa61109f3fd26c543ad9a36794c8a478bc0a7113/python/ray/train/_internal/utils.py#L49-L58
Use case
I am implementing a control loop where I want to check on the status of some actor tasks every N seconds. I want to know if these actor tasks have failed as soon as possible so I can trigger some error handling. This involves me running an "error check" in a loop with a small amount of sleep time:
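As a rough illustration of the requested behavior (not the original code from this issue), here is a minimal sketch of such a polling loop. It uses Python's standard-library `concurrent.futures` as a stand-in for Ray, since `Future.exception()` returns a finished task's error without raising it. The names `collect_errors`, `control_loop`, `poll_interval_s`, `max_polls`, `ok_task`, and `failing_task` are made up for this example. An equivalent non-raising check on Ray object refs is what this issue asks for, and the sketch does not assume one exists.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


def collect_errors(futures: List[Future]) -> Dict[int, BaseException]:
    """Return {index: exception} for every future that finished with an error.

    Pending futures are skipped, so this call never blocks.
    """
    errors: Dict[int, BaseException] = {}
    for i, fut in enumerate(futures):
        # done() is non-blocking; exception() hands back the error
        # instead of raising it (cancelled futures are skipped).
        if fut.done() and not fut.cancelled():
            exc = fut.exception()
            if exc is not None:
                errors[i] = exc
    return errors


def control_loop(
    futures: List[Future],
    on_error: Callable[[Dict[int, BaseException]], None],
    poll_interval_s: float = 0.05,
    max_polls: int = 100,
) -> bool:
    """Poll the futures; call on_error once a failure is seen.

    Returns True if a failure was handled, False otherwise.
    """
    for _ in range(max_polls):
        # Snapshot completion first so a task finishing mid-poll is not missed.
        all_done = all(f.done() for f in futures)
        errors = collect_errors(futures)
        if errors:
            on_error(errors)
            return True
        if all_done:
            return False
        time.sleep(poll_interval_s)
    return False


def ok_task() -> str:
    time.sleep(0.2)
    return "ok"


def failing_task() -> None:
    raise ValueError("boom")


seen: Dict[int, BaseException] = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    futs = [pool.submit(ok_task), pool.submit(failing_task)]
    # The failure is reported while ok_task is still running.
    handled = control_loop(futs, on_error=seen.update)
```

The loop reports the failure without waiting for the slower task and without fetching any successful result, which is the overhead the current workaround cannot avoid.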
cc: @jjyao @rkooo567