ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib|multiprocessing] AttributeError: 'Algorithm' object has no attribute 'config' when using ray.multiprocessing.Pool #47789

Closed Kakadus closed 2 months ago

Kakadus commented 2 months ago

What happened + What you expected to happen

I want to use an Algorithm with Ray's multiprocessing pool to run the algorithm distributed across the Ray cluster. Unpickling the object fails because config has not been initialized on the cloned algorithm instance.

I expect algorithms to work with ray's multiprocessing pool.

Versions / Dependencies

ray, version 2.35.0

and

ray, version 3.0.0.dev0 (master)

Reproduction script

from ray.rllib.algorithms import PPOConfig
from ray.util.multiprocessing import Pool

def print_with_algo(x):
    """Cause algo to be serialized."""
    print(algo)
    print(x)

algo = PPOConfig().environment(env="Humanoid-v4").build()

pool = Pool()
for _ in pool.imap_unordered(print_with_algo, range(10)):
    pass
pool.close()

Traceback:

(PoolActor pid=...) 'PPO' object has no attribute 'config'
(PoolActor pid=...) Traceback (most recent call last):
(PoolActor pid=...)   File "/home/.../.cache/pypoetry/virtualenvs/ray-f4XCQ9mO-py3.12/lib/python3.12/site-packages/ray/_private/serialization.py", line 423, in deserialize_objects
(PoolActor pid=...)     obj = self._deserialize_object(data, metadata, object_ref)
(PoolActor pid=...)           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(PoolActor pid=...)   File "/home/.../.cache/pypoetry/virtualenvs/ray-f4XCQ9mO-py3.12/lib/python3.12/site-packages/ray/_private/serialization.py", line 280, in _deserialize_object
(PoolActor pid=...)     return self._deserialize_msgpack_data(data, metadata_fields)
(PoolActor pid=...)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(PoolActor pid=...)   File "/home/.../.cache/pypoetry/virtualenvs/ray-f4XCQ9mO-py3.12/lib/python3.12/site-packages/ray/_private/serialization.py", line 235, in _deserialize_msgpack_data
(PoolActor pid=...)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(PoolActor pid=...)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(PoolActor pid=...)   File "/home/.../.cache/pypoetry/virtualenvs/ray-f4XCQ9mO-py3.12/lib/python3.12/site-packages/ray/_private/serialization.py", line 223, in _deserialize_pickle5_data
(PoolActor pid=...)     obj = pickle.loads(in_band, buffers=buffers)
(PoolActor pid=...)           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(PoolActor pid=...)   File "/home/.../.cache/pypoetry/virtualenvs/ray-f4XCQ9mO-py3.12/lib/python3.12/site-packages/ray/rllib/algorithms/algorithm.py", line 3336, in __setstate__
(PoolActor pid=...)     if self.config.enable_env_runner_and_connector_v2:
(PoolActor pid=...)        ^^^^^^^^^^^
(PoolActor pid=...) AttributeError: 'PPO' object has no attribute 'config'

Issue Severity

High: It blocks me from completing my task.

simonsays1980 commented 2 months ago

@Kakadus thanks for raising this issue. I am not sure what you want to achieve with Ray's multiprocessing pool. If it is running multiple PPO trials in parallel, I would highly suggest using ray.tune for this.

Nevertheless, here is your example working:

from ray.rllib.algorithms import PPOConfig
from ray.util.multiprocessing import Pool

def print_with_algo(x):
    """Cause algo to be serialized."""
    algo = PPOConfig().environment(env="CartPole-v1").build()
    print(algo)
    print(x)

pool = Pool()
for _ in pool.imap_unordered(print_with_algo, range(10)):
    pass
pool.close()

The reason for the error before was that Ray tried to serialize the PPO object with all its attributes, which cannot easily be done. Building the object inside each task, however, does work.

A word on the side: in RLlib we never use this multiprocessing tool; instead, we have developed our own actors (e.g. SingleAgentEnvRunner or Learner) that can be spawned via Ray's low-level functions. To spawn complete trials we always use ray.tune, with tune.grid_search() for RLlib config settings and/or num_samples defined in the Tuner to run that many samples of a single config.
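For reference, a minimal sketch of that Tune workflow (the environment, learning rates, and sample count below are illustrative assumptions, not taken from this issue):

from ray import tune
from ray.rllib.algorithms import PPOConfig

# Grid-search an RLlib config setting; env and values are placeholders.
config = (
    PPOConfig()
    .environment(env="CartPole-v1")
    .training(lr=tune.grid_search([1e-4, 1e-3]))
)

tuner = tune.Tuner(
    "PPO",
    param_space=config,
    # num_samples repeats each grid point this many times.
    tune_config=tune.TuneConfig(num_samples=2),
)
results = tuner.fit()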

Kakadus commented 2 months ago

Thanks for your reply!

IMHO, it would be good to document that one has to build the algorithm in each task:

The reason for the error before was that Ray tried to serialize the PPO object with all its attributes, which cannot easily be done. Building the object inside each task, however, does work.

Technically, de-serializing fails because config has not been defined on the Algorithm.
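A toy illustration of that mechanism (a hypothetical class, not RLlib's actual code): pickle restores an instance without calling __init__, so a __setstate__ that assumes attributes set in __init__ raises AttributeError.

import pickle

class Algo:
    def __init__(self):
        self.config = {"enable_env_runner_and_connector_v2": True}

    def __getstate__(self):
        # config is not part of the pickled state here.
        return {"weights": [0.0]}

    def __setstate__(self, state):
        # pickle never called __init__, so self.config does not exist yet.
        if self.config["enable_env_runner_and_connector_v2"]:
            pass

pickle.loads(pickle.dumps(Algo()))
# AttributeError: 'Algo' object has no attribute 'config'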

I am not sure what you want to achieve with Ray's multiprocessing pool. If it is running multiple PPO trials in parallel, I would highly suggest using ray.tune for this.

I want to evaluate an algorithm I restored from a checkpoint. This requires me to "carefully" step through the environment, and I want to use multiprocessing to speed it up. I'm aware of EnvRunners, but found it easier to implement this directly on the "raw" environment with Algorithm.compute_single_action, as I could recycle code from early PoCs / baselines. Using __getstate__ and from_state and stopping the automatically spawned RolloutWorker / env_runner_group did the trick, but this may be better placed in the forum. (I'm still on the old API.)
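For anyone landing here, a rough sketch of that workaround (the environment and the evaluation loop are illustrative assumptions; old API stack):

import gymnasium as gym
from ray.rllib.algorithms import PPOConfig
from ray.rllib.algorithms.algorithm import Algorithm
from ray.util.multiprocessing import Pool

algo = PPOConfig().environment(env="CartPole-v1").build()
# The state dict pickles fine; the Algorithm instance itself does not.
state = algo.__getstate__()

def evaluate(seed):
    # Rebuild the algorithm from its state inside each pool task.
    # (Stopping the auto-spawned workers, as described above, is omitted.)
    restored = Algorithm.from_state(state)
    env = gym.make("CartPole-v1")
    obs, _ = env.reset(seed=seed)
    total_reward, done = 0.0, False
    while not done:
        action = restored.compute_single_action(obs, explore=False)
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward

pool = Pool()
print(list(pool.imap_unordered(evaluate, range(4))))
pool.close()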

Still, I think the underlying issue (de-serialization of an Algorithm fails because of the missing config) could be fixed or documented by Ray.