It seems like the problem is with the reset function in mobile_env/wrappers/multi_agent.py? It is really strange because it works if I run the following:
import gymnasium
from ray.tune.registry import register_env

# use the mobile-env RLlib wrapper for RLlib
def register(config):
    # importing mobile_env registers the included environments
    import mobile_env
    from mobile_env.wrappers.multi_agent import RLlibMAWrapper

    env = gymnasium.make("mobile-small-ma-v0")
    return RLlibMAWrapper(env)

# register the predefined scenario with RLlib
register_env("mobile-small-ma-v0", register)
import ray

# init ray with the available number of CPUs (and GPUs)
ray.init(
    num_cpus=5,  # change to your available number of CPUs
    include_dashboard=False,
    ignore_reinit_error=True,
    log_to_driver=False,
)
import ray.air
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.policy.policy import PolicySpec
from ray.tune.stopper import MaximumIterationStopper
# Create an RLlib config using multi-agent PPO on mobile-env's small scenario.
config = (
    PPOConfig()
    .environment(env="mobile-small-ma-v0")
    # Here, we configure all agents to share the same policy.
    .multi_agent(
        policies={'shared_policy': PolicySpec()},
        policy_mapping_fn=lambda agent_id, episode, worker, **kwargs: 'shared_policy',
    )
    # RLlib needs one more CPU than configured below (for the driver/trainer?)
    .resources(num_cpus_per_worker=4)
    .rollouts(num_rollout_workers=1)
)
# Create the Trainer/Tuner and define how long to train
tuner = ray.tune.Tuner(
    "PPO",
    run_config=ray.air.RunConfig(
        # Save the training progress and checkpoints locally under the specified subfolder.
        storage_path="./CTDE_1m",
        # Control training length by setting the number of iterations. 1 iter = 4000 time steps by default.
        stop=MaximumIterationStopper(max_iter=1),
        checkpoint_config=ray.air.CheckpointConfig(checkpoint_at_end=True),
    ),
    param_space=config,
)
# Run training and save the result
result_grid = tuner.fit()
Here I did not overwrite anything; I just used the default env.
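To narrow down whether reset itself is the problem, independent of RLlib training, a minimal check (just a sketch, using only the default scenario) is to call reset directly on the wrapped env:

# Sketch: call reset directly on the wrapped default multi-agent env.
import gymnasium
import mobile_env  # importing mobile_env registers the included environments
from mobile_env.wrappers.multi_agent import RLlibMAWrapper

env = RLlibMAWrapper(gymnasium.make("mobile-small-ma-v0"))
print(env.reset())  # expect one observation per agent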
I think the issue is in how you register and pass your custom Env to RLlib.
I'm also always a bit unsure how to do that. As a reference, here is how the pre-defined scenarios are registered: https://github.com/stefanbschneider/mobile-env/blob/main/mobile_env/scenarios/registry.py
You shouldn't use the same name for your new custom env as one of the existing env names (e.g., "mobile-small-ma-v0").
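For example, roughly like this (just a sketch; MyCustomEnv, its import path, and the "my-custom-ma-v0" name are placeholders for your own class and naming):

# Sketch: register a custom scenario with RLlib under its own, new name.
from ray.tune.registry import register_env

def make_custom_env(config):
    # hypothetical import path of your own env class
    from my_project.envs import MyCustomEnv
    from mobile_env.wrappers.multi_agent import RLlibMAWrapper

    env = MyCustomEnv(config={"seed": 68}, render_mode="rgb_array")
    return RLlibMAWrapper(env)

# use a new id here, not one of the predefined ones like "mobile-small-ma-v0"
register_env("my-custom-ma-v0", make_custom_env)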
@stefanbschneider : Thanks for replying. I actually tried using a different name, and it still shows me the same problem.
from ray.tune.registry import register_env

# use the mobile-env RLlib wrapper for RLlib
def register(config):
    # import mobile-env's RLlib wrapper
    from mobile_env.wrappers.multi_agent import RLlibMAWrapper

    # Env1 is my custom environment class (defined elsewhere)
    env = Env1(config={"seed": 68}, render_mode="rgb_array")
    return RLlibMAWrapper(env)

# register the custom scenario with RLlib under a new name
register_env("TEST1", register)
This is really strange; I am still testing it. As I mentioned in the other issue, I assigned one agent to each BS, and that works when I register my custom env. So I do not know why this happens.
Thanks for replying.
@stefanbschneider
Hi, I found out why: it is because of the handler. I cloned the source code and made some changes, but forgot that the default handler in the base file is the central handler. So maybe we can close this issue for now.
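For anyone running into the same thing, the fix is roughly the sketch below (assuming the class and module names are as in the repo, i.e. MComCore in mobile_env.core.base and the multi-agent handler MComMAHandler in mobile_env.handlers.multi_agent): the custom env has to use the multi-agent handler in its config instead of the default central handler.

# Sketch: point the custom env's config at the multi-agent handler so that
# the RLlibMAWrapper gets per-agent observations and actions.
from mobile_env.core.base import MComCore
from mobile_env.handlers.multi_agent import MComMAHandler


class Env1(MComCore):  # the custom env from the snippet above
    @classmethod
    def default_config(cls):
        config = super().default_config()
        config.update({"handler": MComMAHandler})  # default is the central handler
        return config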
Hello, @stefanbschneider @stwerner97
I ran into a problem when I define a new custom environment. Here is my code:
So when I try to overwrite the original env MComCore, I get the following bugs:
But if I do not overwrite it (in other words, if I use the default multi-agent env mobile-small-ma-v0), this bug does not happen. I am wondering why? Thanks for replying in advance.