ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

Include repr() of the actor in actor task errors #17170

Closed. richardliaw closed this issue 3 years ago

richardliaw commented 3 years ago

Describe your feature request

In many cases, the IP/hostname/pid shown in an actor task error are not very interpretable.

Often, actors in a user program have nice interpretable labels (either preset or determined at runtime).

Ideally, we can allow the user to do the following:

Traceback (most recent call last):
...
ray.exceptions.RayTaskError: ray::...() (label="data_sharder_rank:1")
...
ray.exceptions.RayTaskError: ray::...() (label="trainer_rank:4")

The exception could also expose this label programmatically, for example:

exc.cause.cause.cause.{label,pid,ip}

ericl commented 3 years ago

What if we just included the repr(actor) in the message?

richardliaw commented 3 years ago

Ah, that's a good fix.
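
To illustrate the repr(actor) suggestion above, here is a minimal sketch in plain Python, with no Ray involved. Everything in it is hypothetical: `Trainer`, `ActorTaskError`, and `call_actor_method` are illustrative names, not Ray's API and not how Ray implements this. It only shows how an actor's `__repr__` could give the error message a readable label, and how the same label could be kept as an attribute on the exception.

```python
import os
import socket


class Trainer:
    """Stand-in for an actor class (hypothetical; plain Python, no Ray)."""

    def __init__(self, rank):
        self.rank = rank

    def __repr__(self):
        # An interpretable label, which may be determined at runtime.
        return f"trainer_rank:{self.rank}"

    def step(self):
        raise ValueError("loss is NaN")


class ActorTaskError(Exception):
    """Hypothetical wrapper error; NOT ray.exceptions.RayTaskError."""

    def __init__(self, method_name, actor, cause):
        self.label = repr(actor)  # readable label taken from the actor
        self.pid = os.getpid()
        self.hostname = socket.gethostname()
        self.cause = cause  # the original exception raised by the method
        super().__init__(
            f'{method_name}() (label="{self.label}", '
            f"pid={self.pid}, hostname={self.hostname})"
        )


def call_actor_method(actor, method_name):
    """Call a method and wrap any failure with the actor's repr."""
    try:
        return getattr(actor, method_name)()
    except Exception as exc:
        raise ActorTaskError(method_name, actor, exc) from exc


caught = None
try:
    call_actor_method(Trainer(rank=4), "step")
except ActorTaskError as err:
    caught = err  # keep a reference; `err` is unbound after this block

print(caught)        # message includes label="trainer_rank:4"
print(caught.label)  # the label is also available programmatically
```

With this shape, the label appears in the traceback and can also be read from the exception object, without the user having to map a pid or hostname back to a logical actor.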