google-deepmind / meltingpot

A suite of test scenarios for multi-agent reinforcement learning.
Apache License 2.0

AttributeError: module 'numpy' has no attribute '_no_nep50_warning' #164

Open neuronphysics opened 1 year ago

neuronphysics commented 1 year ago

Hello, I am attempting to integrate the Melting Pot environment as a benchmark for my multi-agent reinforcement learning (MARL) algorithm. The repository appears to be updated frequently. I ran the example script `python3 -m meltingpot.examples.pettingzoo.sb3_train` and got the following error:

File "/lustre03/project/MAPPO/meltingpot/examples/pettingzoo/sb3_train.py", line 185, in <module>
    main()
  File "/lustre03/project/MAPPO/meltingpot/examples/pettingzoo/sb3_train.py", line 88, in main
    env = utils.parallel_env(env_config)
  File "/lustre03/project/MAPPO/meltingpot/examples/pettingzoo/utils.py", line 32, in parallel_env
    return _ParallelEnv(env_config, max_cycles)
  File "//lustre03/project/MAPPO/meltingpot/examples/pettingzoo/utils.py", line 118, in __init__
    _MeltingPotPettingZooEnv.__init__(self, env_config, max_cycles)
  File "/lustre03/project/MAPPO/meltingpot/examples/pettingzoo/utils.py", line 53, in __init__
    self._env = substrate.build(self.env_config, roles=self.env_config.default_player_roles)
  File "/lustre03/project/MAPPO//meltingpot/meltingpot/python/substrate.py", line 43, in build
    return get_factory(name).build(roles)
  File "/lustre03/project/MAPPO/meltingpot/meltingpot/python/substrate.py", line 66, in get_factory
    config = substrate_configs.get_config(name)
  File "/lustre03/project/MAPPO/meltingpot/meltingpot/python/configs/substrates/__init__.py", line 60, in get_config
    if substrate not in SUBSTRATES:
TypeError: unhashable type: 'ConfigDict'
```
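
From the traceback, it looks like the whole `ConfigDict` reaches `substrate.build` where a substrate *name* is expected, so the `substrate not in SUBSTRATES` check fails because `ConfigDict` is unhashable. For reference, here is a minimal sketch of the call pattern that sidesteps this, assuming the `get_config`/`default_player_roles` API used in the RLlib examples (`'clean_up'` is just an illustrative substrate name):

```python
from meltingpot import substrate

# Build by name (a hashable string), not by passing the full ConfigDict.
# 'clean_up' is an arbitrary example substrate.
config = substrate.get_config('clean_up')
env = substrate.build('clean_up', roles=config.default_player_roles)
```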

Moreover, I tried to test whether the Melting Pot environment works on Google Colab using the code below. However, I keep receiving the error that follows:

!pip install "gymnasium[all]"
!pip install --upgrade pip setuptools wheel
!pip install dm-acme[jax,tf]
!pip install dm-acme[envs]
from ray.tune.analysis.experiment_analysis import ExperimentAnalysis
from typing import Tuple, Any, Mapping
import gymnasium
import dm_env
import dmlab2d
from gymnasium import spaces
from meltingpot import substrate
from meltingpot.utils.policies import policy
from ml_collections import config_dict
import numpy as np
from ray.rllib import algorithms
from ray.rllib.env import multi_agent_env
from ray.rllib.policy import sample_batch

import tree

PLAYER_STR_FORMAT = 'player_{index}'
_WORLD_PREFIX = 'WORLD.'

def timestep_to_observations(timestep: dm_env.TimeStep) -> Mapping[str, Any]:
  gym_observations = {}
  for index, observation in enumerate(timestep.observation):
    gym_observations[PLAYER_STR_FORMAT.format(index=index)] = {
        key: value
        for key, value in observation.items()
        if _WORLD_PREFIX not in key
    }
  return gym_observations

def remove_world_observations_from_space(
    observation: spaces.Dict) -> spaces.Dict:
  return spaces.Dict({
      key: observation[key] for key in observation if _WORLD_PREFIX not in key
  })

def spec_to_space(spec: tree.Structure[dm_env.specs.Array]) -> spaces.Space:
  """Converts a dm_env nested structure of specs to a Gym Space.

  BoundedArray is converted to Box Gym spaces. DiscreteArray is converted to
  Discrete Gym spaces. Using Tuple and Dict spaces recursively as needed.

  Args:
    spec: The nested structure of specs

  Returns:
    The Gym space corresponding to the given spec.
  """
  if isinstance(spec, dm_env.specs.DiscreteArray):
    return spaces.Discrete(spec.num_values)
  elif isinstance(spec, dm_env.specs.BoundedArray):
    return spaces.Box(spec.minimum, spec.maximum, spec.shape, spec.dtype)
  elif isinstance(spec, dm_env.specs.Array):
    if np.issubdtype(spec.dtype, np.floating):
      return spaces.Box(-np.inf, np.inf, spec.shape, spec.dtype)
    elif np.issubdtype(spec.dtype, np.integer):
      info = np.iinfo(spec.dtype)
      return spaces.Box(info.min, info.max, spec.shape, spec.dtype)
    else:
      raise NotImplementedError(f'Unsupported dtype {spec.dtype}')
  elif isinstance(spec, (list, tuple)):
    return spaces.Tuple([spec_to_space(s) for s in spec])
  elif isinstance(spec, dict):
    return spaces.Dict({key: spec_to_space(s) for key, s in spec.items()})
  else:
    raise ValueError('Unexpected spec of type {}: {}'.format(type(spec), spec))
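
# Illustrative note on spec_to_space (hypothetical spec, for clarity): a
# bounded uint8 image spec maps to a Box space, e.g.
#   rgb = dm_env.specs.BoundedArray((88, 88, 3), np.uint8, 0, 255)
#   spec_to_space(rgb)  # -> spaces.Box(0, 255, (88, 88, 3), uint8)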

class MeltingPotEnv(multi_agent_env.MultiAgentEnv):
  """An adapter between the Melting Pot substrates and RLLib MultiAgentEnv."""

  def __init__(self, env: dmlab2d.Environment):
    """Initializes the instance.

    Args:
      env: dmlab2d environment to wrap. Will be closed when this wrapper closes.
    """
    self._env = env
    self._num_players = len(self._env.observation_spec())
    self._ordered_agent_ids = [
        PLAYER_STR_FORMAT.format(index=index)
        for index in range(self._num_players)
    ]
    # RLLib requires environments to have the following member variables:
    # observation_space, action_space, and _agent_ids
    self._agent_ids = set(self._ordered_agent_ids)
    # RLLib expects a dictionary of agent_id to observation or action,
    # Melting Pot uses a tuple, so we convert
    self.observation_space = self._convert_spaces_tuple_to_dict(
        spec_to_space(self._env.observation_spec()),
        remove_world_observations=True)
    self.action_space = self._convert_spaces_tuple_to_dict(
        spec_to_space(self._env.action_spec()))
    super().__init__()

  def reset(self, *args, **kwargs):
    """See base class."""
    timestep = self._env.reset()
    return timestep_to_observations(timestep), {}

  def step(self, action_dict):
    """See base class."""
    actions = [action_dict[agent_id] for agent_id in self._ordered_agent_ids]
    timestep = self._env.step(actions)
    rewards = {
        agent_id: timestep.reward[index]
        for index, agent_id in enumerate(self._ordered_agent_ids)
    }
    done = {'__all__': timestep.last()}
    info = {}

    observations = timestep_to_observations(timestep)
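    # RLlib's MultiAgentEnv.step() in ray 2.x returns
    # (obs, rewards, terminateds, truncateds, infos); here the same `done`
    # dict stands in for both the terminated and truncated flags.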
    return observations, rewards, done, done, info

  def close(self):
    """See base class."""
    self._env.close()

  def get_dmlab2d_env(self):
    """Returns the underlying DM Lab2D environment."""
    return self._env

  # Metadata is required by the gym `Env` class that we are extending, to show
  # which modes the `render` method supports.
  metadata = {'render.modes': ['rgb_array']}

  def render(self) -> np.ndarray:
    """Render the environment.

    This allows you to set `record_env` in your training config, to record
    videos of gameplay.

    Returns:
        np.ndarray: This returns a numpy.ndarray with shape (x, y, 3),
        representing RGB values for an x-by-y pixel image, suitable for turning
        into a video.
    """
    observation = self._env.observation()
    world_rgb = observation[0]['WORLD.RGB']

    # RGB mode is used for recording videos
    return world_rgb

  def _convert_spaces_tuple_to_dict(
      self,
      input_tuple: spaces.Tuple,
      remove_world_observations: bool = False) -> spaces.Dict:
    """Returns spaces tuple converted to a dictionary.

    Args:
      input_tuple: tuple to convert.
      remove_world_observations: If True will remove non-player observations.
    """
    return spaces.Dict({
        agent_id: (remove_world_observations_from_space(input_tuple[i])
                   if remove_world_observations else input_tuple[i])
        for i, agent_id in enumerate(self._ordered_agent_ids)
    })

def env_creator(env_config):
  """Outputs an environment for registering."""
  env_config = config_dict.ConfigDict(env_config)
  env = substrate.build(env_config['substrate'], roles=env_config['roles'])
  env = MeltingPotEnv(env)
  return env

experiment = ExperimentAnalysis(
      experiment_state="experiment_state-.json",
      default_metric="episode_reward_mean",
      default_mode="max")

config = experiment.best_config
env = env_creator(config["env_config"]).get_dmlab2d_env()
timestep = env.reset()
obs_spec = env.observation_spec()
shape = obs_spec[0]["WORLD.RGB"].shape
print(f"shape of observation space {shape}") 

This is the error message:


```
     17 from ray.rllib.env import multi_agent_env
     18 from ray.rllib.policy import sample_batch

17 frames
/usr/local/lib/python3.10/dist-packages/ray/rllib/__init__.py in <module>
      5 # Note: do not introduce unnecessary library dependencies here, e.g. gym.
      6 # This file is imported from the tune module in order to register RLlib agents.
----> 7 from ray.rllib.env.base_env import BaseEnv
      8 from ray.rllib.env.external_env import ExternalEnv
      9 from ray.rllib.env.multi_agent_env import MultiAgentEnv

/usr/local/lib/python3.10/dist-packages/scipy/signal/__init__.py in <module>
    321 
    322 from ._bsplines import *
--> 323 from ._filter_design import *
    324 from ._fir_filter_design import *
    325 from ._ltisys import *

/usr/local/lib/python3.10/dist-packages/scipy/signal/_filter_design.py in <module>
     14 from numpy.polynomial.polynomial import polyvalfromroots
     15 
---> 16 from scipy import special, optimize, fft as sp_fft
     17 from scipy.special import comb
     18 from scipy._lib._util import float_factorial

/usr/local/lib/python3.10/dist-packages/scipy/optimize/__init__.py in <module>
    403 
    404 from ._optimize import *
--> 405 from ._minimize import *
    406 from ._root import *
    407 from ._root_scalar import *

/usr/local/lib/python3.10/dist-packages/scipy/optimize/_minimize.py in <module>
     24 from ._trustregion_krylov import _minimize_trust_krylov
     25 from ._trustregion_exact import _minimize_trustregion_exact
---> 26 from ._trustregion_constr import _minimize_trustregion_constr
     27 
     28 # constrained minimization

/usr/local/lib/python3.10/dist-packages/scipy/optimize/_trustregion_constr/__init__.py in <module>
      2 
      3 
----> 4 from .minimize_trustregion_constr import _minimize_trustregion_constr
      5 
      6 __all__ = ['_minimize_trustregion_constr']

/usr/local/lib/python3.10/dist-packages/scipy/optimize/_trustregion_constr/minimize_trustregion_constr.py in <module>
      3 from scipy.sparse.linalg import LinearOperator
      4 from .._differentiable_functions import VectorFunction
----> 5 from .._constraints import (
      6     NonlinearConstraint, LinearConstraint, PreparedConstraint, strict_bounds)
      7 from .._hessian_update_strategy import BFGS

/usr/local/lib/python3.10/dist-packages/scipy/optimize/_constraints.py in <module>
      6 from ._optimize import OptimizeWarning
      7 from warnings import warn, catch_warnings, simplefilter
---> 8 from numpy.testing import suppress_warnings
      9 from scipy.sparse import issparse
     10 

/usr/local/lib/python3.10/dist-packages/numpy/testing/__init__.py in <module>
      9 
     10 from . import _private
---> 11 from ._private.utils import *
     12 from ._private.utils import (_assert_valid_refcount, _gen_alignment_data)
     13 from ._private import extbuild

/usr/local/lib/python3.10/dist-packages/numpy/testing/_private/utils.py in <module>
    411 
    412 
--> 413 @np._no_nep50_warning()
    414 def assert_almost_equal(actual, desired, decimal=7, err_msg='', verbose=True):
    415     """

/usr/local/lib/python3.10/dist-packages/numpy/__init__.py in __getattr__(attr)
    309             return val
    310 
--> 311         if attr in __future_scalars__:
    312             # And future warnings for those that will change, but also give
    313             # the AttributeError

AttributeError: module 'numpy' has no attribute '_no_nep50_warning'
```
Could you please provide guidance on how to resolve these errors?
duenez commented 1 year ago

Unfortunately, at the moment the Stable Baselines example is broken due to an inherent incompatibility between RLlib, Gym, and Gymnasium. We are working on a way around this, but currently our priority is RLlib (because of the upcoming Melting Pot Competition).

The numpy issue seems to be a version incompatibility problem. At some point we had to pin the numpy version because it broke RLlib. I think we might be OK unpinning now, if that would be useful.
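
In the meantime, the `_no_nep50_warning` error usually points at a numpy install whose submodules come from mismatched versions (common after an in-place upgrade in Colab). A force-reinstall of a single version followed by a runtime restart may clear it; the pin below is only an illustration, not an officially tested fix:

```
# Reinstall one consistent numpy version (example pin, adjust as needed),
# then restart the Colab runtime so all modules reload from the same version.
!pip install --force-reinstall "numpy==1.24.4"
```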

richielo commented 11 months ago

Is there any update on this, or any workaround with specific pinned versions? I am trying to do the same thing with a PyTorch codebase that works with Gym environments, and I'm getting stuck at `TypeError: unhashable type: 'ConfigDict'`.