ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[core] Can't set working directory for runtime env in actor definition #30666

Open Ericxgao opened 1 year ago

Ericxgao commented 1 year ago

What happened + What you expected to happen

Trying to initialize an actor with a specified working directory in the runtime_env params errors out with:

ValueError: . is not a valid URI. Passing directories or modules to be dynamically uploaded is only supported at the job level (i.e., passed to ray.init)

Initialized as: balancer = LoadBalancer.options(runtime_env={"working_dir": "."}).remote(2)

I'm confused about why this raises an error when the behavior is documented here: https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#specifying-a-runtime-environment-per-task-or-per-actor

Versions / Dependencies

Ray[default] @ 2.1.0

Reproduction script

import os

import ray

# Note: DiffusionRunner and opt are not shown in this snippet.
@ray.remote(num_cpus=0, num_gpus=0)
class LoadBalancer:
    def __init__(self, num_actors):
        actors = [DiffusionRunner.remote() for _ in range(num_actors)]
        self.actor_pool = ray.util.ActorPool(actors)

    def run(self, settings):
        def f(actor, settings):
            return actor.run.remote(settings)

        print(self.actor_pool._idle_actors)
        self.actor_pool.submit(f, settings)

    def update_idle_actors(self):
        while self.actor_pool.has_next():
            try:
                self.actor_pool.get_next_unordered(timeout=1)
            except TimeoutError:
                pass

if __name__ == "__main__":
    parent_dir = os.path.dirname(os.path.abspath(__file__))
    os.environ["PYTHONPATH"] = parent_dir + ":" + os.environ.get("PYTHONPATH", "")
    print(os.environ["PYTHONPATH"])
    balancer = LoadBalancer.options(
        name="balancer",
        get_if_exists=True,
        lifetime="detached",
        namespace="diffusion",
        runtime_env={"working_dir": "."},
    ).remote(2)
    print("Updating idle actors...")
    balancer.update_idle_actors.remote()
    print("Running diffusion...")
    balancer.run.remote(opt.settings)
    print("Done")

Issue Severity

Medium: It is a significant difficulty but I can work around it.

stephanie-wang commented 1 year ago

I believe it's OK to pass a runtime env to an actor, but according to the error message, you can't pass a local directory (here ".", the current working directory) as working_dir at the actor level. Other options (like a remote URI for the directory, or pip packages) should be OK. This is probably because uploading a local directory would be expensive, and its contents might vary depending on which node the actor is placed on.

Does it work for your use case if you don't pass in the runtime env?
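
One way to act on the error message is to keep a local working_dir at the job level (passed to ray.init) and leave only remote URIs and other fields on the actor. The following is a minimal sketch of that split, not Ray's own logic: split_runtime_env and is_remote_uri are hypothetical helpers, and the set of remote URI schemes is an assumption that should be checked against the docs for your Ray version.

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlparse

# Assumption: URI schemes treated as "remote" in this sketch; check the Ray
# docs for the schemes your Ray version actually accepts.
REMOTE_SCHEMES = {"https", "s3", "gs"}


def is_remote_uri(path: str) -> bool:
    """Return True if path looks like a remote URI rather than a local path."""
    return urlparse(path).scheme in REMOTE_SCHEMES


def split_runtime_env(
    runtime_env: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a runtime_env into (job_level, actor_level) parts.

    Hypothetical helper: only a local working_dir is moved to the job level;
    every other field is left for the actor.
    """
    job_env: Dict[str, Any] = {}
    actor_env: Dict[str, Any] = dict(runtime_env)  # copy; input is not mutated
    working_dir = actor_env.get("working_dir")
    if isinstance(working_dir, str) and not is_remote_uri(working_dir):
        job_env["working_dir"] = actor_env.pop("working_dir")
    return job_env, actor_env


job_env, actor_env = split_runtime_env({"working_dir": ".", "pip": ["requests"]})

# Intended usage with Ray (not executed in this sketch):
#   ray.init(runtime_env=job_env)
#   balancer = LoadBalancer.options(runtime_env=actor_env).remote(2)
```

With the runtime env from the report, the local "." ends up in job_env for ray.init, and the actor options no longer carry a local path.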