ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] Flatten observations example doesn't work #47127

Open rubenjacob opened 2 months ago

rubenjacob commented 2 months ago

What happened + What you expected to happen

Running the example script rllib/examples/connectors/flatten_observations_dict_space.py raises an error because the order of Connectors in the env_to_module_pipeline is wrong. The FlattenObservations connector assumes that there's already an observation available in the data. However, the observation is extracted from the episode by AddObservationsFromEpisodesToBatch which comes after FlattenObservations in the pipeline.

This causes the following error during training:

  File "C:\projects\rl\.venv\Lib\site-packages\ray\tune\trainable\trainable.py", line 328, in train
    result = self.step()
             ^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\algorithms\algorithm.py", line 873, in step
    train_results, train_iter_ctx = self._run_one_training_iteration()
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\algorithms\algorithm.py", line 3155, in _run_one_training_iteration
    results = self.training_step()
              ^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\algorithms\ppo\ppo.py", line 424, in training_step
    return self._training_step_new_api_stack()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\algorithms\ppo\ppo.py", line 445, in _training_step_new_api_stack
    episodes, env_runner_results = synchronous_parallel_sample(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\execution\rollout_ops.py", line 95, in synchronous_parallel_sample
    sampled_data = [worker_set.local_worker().sample(**random_action_kwargs)]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\env\multi_agent_env_runner.py", line 155, in sample
    samples = self._sample_timesteps(
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\env\multi_agent_env_runner.py", line 257, in _sample_timesteps
    to_module = self._cached_to_module or self._env_to_module(
                                          ^^^^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\connectors\env_to_module\env_to_module_pipeline.py", line 25, in __call__
    return super().__call__(
           ^^^^^^^^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\connectors\connector_pipeline_v2.py", line 84, in __call__
    data = connector(
           ^^^^^^^^^^
  File "C:\projects\rl\.venv\Lib\site-packages\ray\rllib\connectors\env_to_module\flatten_observations.py", line 176, in __call__
    raise ValueError(
ValueError: `batch` must already have a column named obs in it for this connector to work!
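
The ordering problem can be reproduced in miniature without Ray. The sketch below is a toy model, not RLlib code: `add_observations`, `flatten_observations`, and `run_pipeline` are hypothetical stand-ins that only mimic the dependency described above (the flattening step needs an `obs` column that an earlier step must create).

```python
# Toy stand-ins for connector pieces; these are NOT RLlib classes or APIs.
from typing import Callable, Dict, List

Batch = Dict[str, list]
Step = Callable[[Batch, List[dict]], Batch]


def add_observations(batch: Batch, episodes: List[dict]) -> Batch:
    """Copy each episode's latest observation into the batch under 'obs'."""
    batch["obs"] = [ep["observations"][-1] for ep in episodes]
    return batch


def flatten_observations(batch: Batch, episodes: List[dict]) -> Batch:
    """Flatten dict observations; requires an existing 'obs' column."""
    if "obs" not in batch:
        raise ValueError("`batch` must already have a column named obs in it")
    batch["obs"] = [
        [x for key in sorted(obs) for x in obs[key]] for obs in batch["obs"]
    ]
    return batch


def run_pipeline(steps: List[Step], episodes: List[dict]) -> Batch:
    """Apply each step in order, starting from an empty batch."""
    batch: Batch = {}
    for step in steps:
        batch = step(batch, episodes)
    return batch


episodes = [{"observations": [{"pos": [1.0, 2.0], "vel": [0.5]}]}]

# Observations are added first, then flattened: this works.
good = run_pipeline([add_observations, flatten_observations], episodes)

# Flattening runs before any 'obs' column exists: this fails.
try:
    run_pipeline([flatten_observations, add_observations], episodes)
    wrong_order_error = None
except ValueError as exc:
    wrong_order_error = str(exc)
```

In this toy model the only difference between the working and the failing run is the position of the flattening step, which mirrors the situation reported here.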

The error doesn't occur if FlattenObservations is added to the pipeline in between the default connectors:

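# Build the full env-to-module pipeline by hand so that FlattenObservations
# runs after the observations have been added to the batch.
def _env_to_module_pipeline(env):
    return [
        AddObservationsFromEpisodesToBatch(),
        AddStatesFromEpisodesToBatch(),
        # Needs the obs column created by AddObservationsFromEpisodesToBatch.
        FlattenObservations(multi_agent=args.num_agents > 0),
        BatchIndividualItems(),
        NumpyToTensor(),
    ]
# ...
base_config = (
    get_trainable_cls(args.algo)
    .get_default_config()
    .environment("env")
    .env_runners(
        env_to_module_connector=_env_to_module_pipeline,
        # The defaults are already listed explicitly above, so disable them.
        add_default_connectors_to_env_to_module_pipeline=False,
    )
# ...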

In general, it should be possible to define a fixed order for all Connectors provided by Ray and to always apply them in that order. It would also be helpful to be able to insert a custom Connector between the default ones without copy-pasting the entire pipeline.
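
As a rough illustration of the second request, here is a plain-Python sketch of inserting a step before a given default one. The `insert_before` helper and the empty placeholder classes are hypothetical and exist only for this example; they merely borrow the connector names used above and do not import or describe RLlib's actual pipeline API.

```python
from typing import Any, List


def insert_before(pipeline: List[Any], anchor_type: type, new_step: Any) -> List[Any]:
    """Return a copy of `pipeline` with `new_step` placed directly before
    the first step that is an instance of `anchor_type`."""
    for index, step in enumerate(pipeline):
        if isinstance(step, anchor_type):
            return pipeline[:index] + [new_step] + pipeline[index:]
    raise ValueError(f"No step of type {anchor_type.__name__} in pipeline")


# Placeholder classes named after the connectors in this issue (not RLlib imports).
class AddObservationsFromEpisodesToBatch: ...
class AddStatesFromEpisodesToBatch: ...
class FlattenObservations: ...
class BatchIndividualItems: ...
class NumpyToTensor: ...


defaults = [
    AddObservationsFromEpisodesToBatch(),
    AddStatesFromEpisodesToBatch(),
    BatchIndividualItems(),
    NumpyToTensor(),
]

# Place the custom step between the defaults without re-listing all of them.
custom = insert_before(defaults, BatchIndividualItems, FlattenObservations())
order = [type(step).__name__ for step in custom]
```

Anchoring on a step's type rather than on a list index keeps the insertion valid even if the set of default steps changes, as long as the anchor step is still present.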

Versions / Dependencies

Ray 2.30, Python 3.11.9, Windows 10

Reproduction script

python ray/rllib/examples/connectors/flatten_observations_dict_space.py --enable-new-api-stack --no-tune --num-env-runners=0

Issue Severity

Low: It annoys or frustrates me.

simonsays1980 commented 1 month ago

@rubenjacob Thanks for raising this. I ran the same example on master and got no error. Could you retry on master as well?