danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License
1.31k stars · 224 forks

unsupported dtype: object #36

Closed ipsec closed 1 year ago

ipsec commented 1 year ago

Hi Danijar,

First, congratulations on the excellent work.

I'm trying to run dreamerv3 with a custom gym environment that has an image observation space:

>>> env.observation_space
Box(0, 255, (64, 64, 3), uint8)

But I'm getting this error:

Encoder CNN shapes: {'image': (64, 64, 3)}
Encoder MLP shapes: {}
Decoder CNN shapes: {'image': (64, 64, 3)}
Decoder MLP shapes: {}
JAX devices (1): [CpuDevice(id=0)]
Policy devices: TFRT_CPU_0
Train devices:  TFRT_CPU_0
Tracing train function.
Optimizer model_opt has 181,562,755 variables.
Optimizer actor_opt has 9,457,674 variables.
Optimizer critic_opt has 9,708,799 variables.
Logdir /Users/fernando/logdir/run1
Observation space:
  image            Space(dtype=uint8, shape=(64, 64, 3), low=0, high=255)
  reward           Space(dtype=float32, shape=(), low=-inf, high=inf)
  is_first         Space(dtype=bool, shape=(), low=False, high=True)
  is_last          Space(dtype=bool, shape=(), low=False, high=True)
  is_terminal      Space(dtype=bool, shape=(), low=False, high=True)
Action space:
  action           Space(dtype=float32, shape=(10,), low=0, high=1)
  reset            Space(dtype=bool, shape=(), low=False, high=True)
Prefill train dataset.
/Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/envs/from_gym.py:72: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  obs = {k: np.asarray(v) for k, v in obs.items()}
{'image': array([[array([[[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 75, 214,  16],
                [ 75, 214,  16],
                [ 75, 214,  16]],

               [[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 75, 214,  16],
                [ 75, 214,  16],
                [ 75, 214,  16]],

               [[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 75, 214,  16],
                [ 75, 214,  16],
                [ 75, 214,  16]],

               ...,

               [[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 95, 211,  64],
                [ 95, 211,  64],
                [ 95, 211,  64]],

               [[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 95, 211,  64],
                [ 95, 211,  64],
                [ 95, 211,  64]],

               [[  0,   0,   0],
                [  0,   0,   0],
                [  0,   0,   0],
                ...,
                [ 95, 211,  64],
                [ 95, 211,  64],
                [ 95, 211,  64]]], dtype=uint8), {}]], dtype=object), 'reward': array([0.], dtype=float32), 'is_first': array([ True]), 'is_last': array([False]), 'is_terminal': array([False])}
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/fernando/Documents/dev/projects/dreamerv3/myenv.py:56 in <module>                        │
│                                                                                                  │
│   53                                                                                             │
│   54                                                                                             │
│   55 if __name__ == '__main__':                                                                  │
│ ❱ 56   main()                                                                                    │
│   57                                                                                             │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/myenv.py:51 in main                            │
│                                                                                                  │
│   48   args = embodied.Config(                                                                   │
│   49 │     **config.run, logdir=config.logdir,                                                   │
│   50 │     batch_steps=config.batch_size * config.batch_length)                                  │
│ ❱ 51   embodied.run.train(agent, env, replay, logger, args)                                      │
│   52   # embodied.run.eval_only(agent, env, logger, args)                                        │
│   53                                                                                             │
│   54                                                                                             │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/run/train.py:65 in train     │
│                                                                                                  │
│    62   print('Prefill train dataset.')                                                          │
│    63   random_agent = embodied.RandomAgent(env.act_space)                                       │
│    64   while len(replay) < max(args.batch_steps, args.train_fill):                              │
│ ❱  65 │   driver(random_agent.policy, steps=100)                                                 │
│    66   logger.add(metrics.result())                                                             │
│    67   logger.write()                                                                           │
│    68                                                                                            │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/core/driver.py:42 in         │
│ __call__                                                                                         │
│                                                                                                  │
│   39   def __call__(self, policy, steps=0, episodes=0):                                          │
│   40 │   step, episode = 0, 0                                                                    │
│   41 │   while step < steps or episode < episodes:                                               │
│ ❱ 42 │     step, episode = self._step(policy, step, episode)                                     │
│   43                                                                                             │
│   44   def _step(self, policy, step, episode):                                                   │
│   45 │   assert all(len(x) == len(self._env) for x in self._acts.values())                       │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/core/driver.py:49 in _step   │
│                                                                                                  │
│   46 │   acts = {k: v for k, v in self._acts.items() if not k.startswith('log_')}                │
│   47 │   obs = self._env.step(acts)                                                              │
│   48 │   print(obs)                                                                              │
│ ❱ 49 │   obs = {k: convert(v) for k, v in obs.items()}                                           │
│   50 │   assert all(len(x) == len(self._env) for x in obs.values()), obs                         │
│   51 │   acts, self._state = policy(obs, self._state, **self._kwargs)                            │
│   52 │   acts = {k: convert(v) for k, v in acts.items()}                                         │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/core/driver.py:49 in         │
│ <dictcomp>                                                                                       │
│                                                                                                  │
│   46 │   acts = {k: v for k, v in self._acts.items() if not k.startswith('log_')}                │
│   47 │   obs = self._env.step(acts)                                                              │
│   48 │   print(obs)                                                                              │
│ ❱ 49 │   obs = {k: convert(v) for k, v in obs.items()}                                           │
│   50 │   assert all(len(x) == len(self._env) for x in obs.values()), obs                         │
│   51 │   acts, self._state = policy(obs, self._state, **self._kwargs)                            │
│   52 │   acts = {k: convert(v) for k, v in acts.items()}                                         │
│                                                                                                  │
│ /Users/fernando/Documents/dev/projects/dreamerv3/dreamerv3/embodied/core/basics.py:32 in convert │
│                                                                                                  │
│    29 │   │     value = value.astype(dst)                                                        │
│    30 │   │   break                                                                              │
│    31 │   else:                                                                                  │
│ ❱  32 │     raise TypeError(f"Object '{value}' has unsupported dtype: {value.dtype}")            │
│    33   return value                                                                             │
│    34                                                                                            │
│    35                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: Object '[[array([[[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 75, 214,  16],
          [ 75, 214,  16],
          [ 75, 214,  16]],

         [[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 75, 214,  16],
          [ 75, 214,  16],
          [ 75, 214,  16]],

         [[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 75, 214,  16],
          [ 75, 214,  16],
          [ 75, 214,  16]],

         ...,

         [[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 95, 211,  64],
          [ 95, 211,  64],
          [ 95, 211,  64]],

         [[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 95, 211,  64],
          [ 95, 211,  64],
          [ 95, 211,  64]],

         [[  0,   0,   0],
          [  0,   0,   0],
          [  0,   0,   0],
          ...,
          [ 95, 211,  64],
          [ 95, 211,  64],
          [ 95, 211,  64]]], dtype=uint8) {}]]' has unsupported dtype: object

I suspect this is a gym version problem: my environment uses gym==0.26.2, while the version you use, gym==0.19.0, is not available for Mac M1.

Do you have any idea how to fix this?
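For reference, the traceback pattern (an image array bundled with an empty dict inside an object array) is consistent with the gym >= 0.26 API change, where `env.reset()` returns an `(obs, info)` tuple instead of the bare observation. A minimal sketch of how that tuple turns into the `dtype=object` array shown above (the variable names here are illustrative, not from dreamerv3):

```python
import numpy as np

# In gym >= 0.26, env.reset() returns (obs, info) instead of obs alone.
# If a wrapper feeds that tuple into np.asarray, NumPy can only store the
# mixed (ndarray, dict) pair as a ragged object array -- the same value
# printed in the traceback.
obs = np.zeros((64, 64, 3), dtype=np.uint8)  # the image observation
info = {}                                    # the extra info dict

batched = np.asarray([(obs, info)], dtype=object)
print(batched.dtype)  # object -> trips the unsupported-dtype check in basics.py
```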

ipsec commented 1 year ago

It's not a gym version problem. I installed gym 0.19.0 and the same error occurs.

ipsec commented 1 year ago

My test script is:

def main():

  import warnings
  import dreamerv3
  from dreamerv3 import embodied
  warnings.filterwarnings('ignore', '.*truncated to dtype int32.*')

  # See configs.yaml for all options.
  config = embodied.Config(dreamerv3.configs['defaults'])
  # config = config.update(dreamerv3.configs['medium'])
  config = config.update({
      'logdir': '~/logdir/run1',
      'run.train_ratio': 64,
      'run.log_every': 30,  # Seconds
      'batch_size': 16,
      'jax.prealloc': False,
      'encoder.mlp_keys': '$^',
      'decoder.mlp_keys': '$^',
      'encoder.cnn_keys': 'image',
      'decoder.cnn_keys': 'image',
      'jax.platform': 'cpu',
  })
  config = embodied.Flags(config).parse()

  logdir = embodied.Path(config.logdir)
  step = embodied.Counter()
  logger = embodied.Logger(step, [
      embodied.logger.TerminalOutput(),
      embodied.logger.TensorBoardOutput(logdir),
  ])

  import gym
  import gym_myenv
  from embodied.envs import from_gym
  env = gym.make("gym_myenv:MyEnv-v0")
  env = from_gym.FromGym(env, obs_key='image')  # Or obs_key='vector'.
  env = dreamerv3.wrap_env(env, config)
  env = embodied.BatchEnv([env], parallel=False)

  agent = dreamerv3.Agent(env.obs_space, env.act_space, step, config)
  replay = embodied.replay.Uniform(
      config.batch_length, config.replay_size, logdir / 'replay')
  args = embodied.Config(
      **config.run, logdir=config.logdir,
      batch_steps=config.batch_size * config.batch_length)
  embodied.run.train(agent, env, replay, logger, args)

if __name__ == '__main__':
  main()
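If the new gym API does turn out to be the cause, one possible workaround (a sketch under that assumption — these helper names are hypothetical, not part of dreamerv3) is to collapse the new reset/step return values back to the old single-obs / 4-tuple shape before `from_gym.FromGym` sees them:

```python
def unwrap_reset(result):
  """Unwrap a gym >= 0.26 reset() result (obs, info) to just obs."""
  if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
    return result[0]
  return result  # old API already returned obs alone

def unwrap_step(result):
  """Collapse a gym >= 0.26 step() 5-tuple (obs, reward, terminated,
  truncated, info) into the old 4-tuple (obs, reward, done, info)."""
  if len(result) == 5:
    obs, reward, terminated, truncated, info = result
    return obs, reward, terminated or truncated, info
  return result
```

These could be applied inside a thin `gym.Wrapper` around the environment before handing it to `from_gym.FromGym`, so the rest of the pipeline only ever sees the old-style return values.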
ipsec commented 1 year ago

My mistake again, sorry.