ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] Action Masking example (new API) not working #47648

Closed: adadelta closed this issue 2 months ago

adadelta commented 2 months ago

What happened + What you expected to happen

Running the action masking example for the new API stack (found here) raises: AttributeError: 'super' object has no attribute '_compute_values'

This happens because _compute_values was removed in the change linked here, while the example's RLModule still calls it through super().
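The failure mode can be reproduced without Ray. The following is a minimal sketch, not RLlib code: `BaseModule` and `MaskingModule` are hypothetical stand-ins for a parent class that lost `_compute_values` in a refactor and a subclass that still delegates to it.

```python
class BaseModule:
    """Hypothetical parent class after a refactor removed `_compute_values`."""

    def compute_values(self, batch):
        # Only the public method remains on the parent.
        return [0.0 for _ in batch]


class MaskingModule(BaseModule):
    """Hypothetical subclass still written against the old parent interface."""

    def _compute_values(self, batch):
        # The parent no longer defines `_compute_values`, so this attribute
        # lookup on the `super()` proxy raises AttributeError.
        return super()._compute_values(batch)


def reproduce():
    """Return the error message raised by the stale `super()` call, if any."""
    try:
        MaskingModule()._compute_values([1, 2, 3])
    except AttributeError as err:
        return str(err)
    return None


message = reproduce()
print(message)
```

The error is raised at call time rather than at import time, which is why the example only fails once training starts.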

Versions / Dependencies

Tested on the latest release (2.35.0) and the nightly build.

Reproduction script

Script copied from the current example:

from gymnasium.spaces import Box, Discrete

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.core.rl_module.rl_module import RLModuleSpec
from ray.rllib.examples.envs.classes.action_mask_env import ActionMaskEnv
from ray.rllib.examples.rl_modules.classes.action_masking_rlm import (
    ActionMaskingTorchRLModule,
)

from ray.rllib.utils.test_utils import (
    add_rllib_example_script_args,
    run_rllib_example_script_experiment,
)

parser = add_rllib_example_script_args(
    default_iters=10,
    default_timesteps=100000,
    default_reward=150.0,
)

if __name__ == "__main__":
    args = parser.parse_args()

    if args.algo != "PPO":
        raise ValueError("This example only supports PPO. Please use --algo=PPO.")

    base_config = (
        PPOConfig()
        .api_stack(
            # This example runs only under the new API stack.
            enable_rl_module_and_learner=True,
            enable_env_runner_and_connector_v2=True,
        )
        .environment(
            env=ActionMaskEnv,
            env_config={
                "action_space": Discrete(100),
                # This defines the 'original' observation space that is used in the
                # `RLModule`. The environment will wrap this space into a
                # `gym.spaces.Dict` together with an 'action_mask' that signals the
                # `RLModule` to adapt the action distribution inputs for the underlying
                # `PPORLModule`.
                "observation_space": Box(-1.0, 1.0, (5,)),
            },
        )
        .rl_module(
            model_config_dict={
                "post_fcnet_hiddens": [64, 64],
                "post_fcnet_activation": "relu",
            },
            # We need to explicitly specify the RLModule to use here and
            # the catalog needed to build it.
            rl_module_spec=RLModuleSpec(
                module_class=ActionMaskingTorchRLModule,
            ),
        )
        .evaluation(
            evaluation_num_env_runners=1,
            evaluation_interval=1,
            # Run evaluation parallel to training to speed up the example.
            evaluation_parallel_to_training=True,
        )
    )

    # Run the example (with Tune).
    run_rllib_example_script_experiment(base_config, args)
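For context on what the 'action_mask' in the observation is meant to do, here is a minimal sketch of the general masking technique in plain Python. It is an illustration under assumptions, not the code of ActionMaskingTorchRLModule: the helper names `mask_logits` and `softmax` are hypothetical, and the real module may handle masked logits differently.

```python
import math


def mask_logits(logits, action_mask):
    """Set the logits of disallowed actions (mask value 0) to -inf."""
    if len(logits) != len(action_mask):
        raise ValueError("logits and action_mask must have the same length")
    return [
        logit if allowed else -math.inf
        for logit, allowed in zip(logits, action_mask)
    ]


def softmax(logits):
    """Numerically stable softmax that tolerates -inf entries."""
    finite = [logit for logit in logits if logit != -math.inf]
    if not finite:
        raise ValueError("at least one action must be allowed")
    peak = max(finite)
    # math.exp(-inf) evaluates to 0.0, so masked actions get zero probability.
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical example: three actions, the middle one is masked out.
probs = softmax(mask_logits([1.0, 2.0, 3.0], [1, 0, 1]))
print(probs)
```

Because masked actions receive zero probability, sampling from the resulting distribution can never select them.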

Error from stack trace: AttributeError: 'super' object has no attribute '_compute_values'

Issue Severity

High: It blocks me from completing my task.

PhilippWillms commented 2 months ago

Hi @adadelta, kindly check whether your observation is the same as the one I reported in issue #47361.

adadelta commented 2 months ago

> Hi @adadelta, kindly check if your observation is the same as I reported in issue #47361.

Hi @PhilippWillms. Yes, it seems so. I'm going to close this one.