Closed defrag-bambino closed 1 year ago
Maybe you're pointing to an old `--logdir`, so it's trying to load a checkpoint that is no longer compatible with the new version of your environment?
Nope, I deleted them beforehand. I even tried with a freshly cloned repo.
Here's an excerpt of the last part of the error:
```shell
Creating new TensorBoard event file writer.
Did not find any checkpoint.
Writing checkpoint: logdir/run1/checkpoint.ckpt
Start training loop.
Saved chunk: 20230227T234456F327624-5imMN603dmNcTxUoqgfBmi-0000000000000000000000-76.npz
Tracing policy function.
Wrote checkpoint: logdir/run1/checkpoint.ckpt
Error writing summary: stats/policy_image
---prior Traced<ShapedArray(float16[1,1024])>with<DynamicJaxprTrace(level=1/0)>
---embed Traced<ShapedArray(float16[1,1,1024])>with<DynamicJaxprTrace(level=1/0)>
```
I got the last two lines by printing the shapes of the `prior['deter']` and `embed` arrays in nets.py line 94, right before the concatenate call in line 95 fails. I am not quite sure what these two arrays are meant to contain. Can you tell from this whether these values seem off?
Ahh, I think I know the issue. When you use `FromGym` on an environment that returns a single observation array, rather than a dictionary of observations, it uses `image` as the observation key by default. Later in the code, it then tries to write a video summary from this observation key, but that fails because the observation isn't shaped like an image. Can you try using `FromGym(env, obs_key='vector')` instead of `FromGym(env)`?
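To illustrate why the key matters, here is a hypothetical minimal sketch (not the actual dreamerv3 wrapper code, and `FromGymSketch`/`DummyVectorEnv` are made-up names): whatever key the observation lands under is what downstream code sees, so a flat vector stored under `image` gets treated as image data later on.

```python
import numpy as np

class FromGymSketch:
    """Hypothetical sketch of an obs_key-style wrapper that maps a single
    observation array into a dictionary keyed by obs_key."""

    def __init__(self, env, obs_key='image'):
        self.env = env
        self.obs_key = obs_key  # downstream code treats 'image' entries as images

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # A bare array observation is wrapped into a dict under obs_key.
        return {self.obs_key: np.asarray(obs)}, reward, done, info

class DummyVectorEnv:
    """Toy env returning a flat 16-dim vector observation."""
    def step(self, action):
        return np.zeros(16, dtype=np.float32), 0.0, False, {}

# With the default key, the flat vector masquerades as an image:
obs, *_ = FromGymSketch(DummyVectorEnv(), obs_key='image').step(None)
assert 'image' in obs and obs['image'].shape == (16,)  # not image-shaped!

# With obs_key='vector', the key reflects what the data actually is:
obs, *_ = FromGymSketch(DummyVectorEnv(), obs_key='vector').step(None)
assert 'vector' in obs
```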
Hi, thanks for the reply. This is indeed a good point, and I changed it. However, it apparently did not resolve my issue. I think it is the following function call whose arguments do not have the same number of dimensions (see the full error log above):
```shell
/home/fabian/Desktop/dreamerv3/dreamerv3/agent.py:56 in policy

   53 │   obs = self.preprocess(obs)
   54 │   (prev_latent, prev_action), task_state, expl_state = state
   55 │   embed = self.wm.encoder(obs)
❱  56 │   latent, _ = self.wm.rssm.obs_step(
   57 │       prev_latent, prev_action, embed, obs['is_first'])
   58 │   self.expl_behavior.policy(latent, expl_state)
   59 │   task_outs, task_state = self.task_behavior.policy(latent, task_state)
```
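The immediate failure mode can be reproduced outside of JAX: `prior['deter']` was traced as `float16[1,1024]` while `embed` was `float16[1,1,1024]`, and arrays with a different number of dimensions cannot be concatenated. A minimal NumPy reproduction (shapes copied from the trace above; `jnp.concatenate` behaves the same way):

```python
import numpy as np

# Shapes copied from the printed traced arrays above:
deter = np.zeros((1, 1024), dtype=np.float16)     # prior['deter']
embed = np.zeros((1, 1, 1024), dtype=np.float16)  # embed has one extra axis

try:
    np.concatenate([deter, embed], axis=-1)
except ValueError as e:
    # numpy/JAX require all inputs to have the same number of dimensions
    print('concatenate failed:', e)
```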
Can you share your full script (or ideally a simplified version), please?
Sure. I've created a fork and committed my changes to it. It also includes the Env, which in this case is Unity's PushBlock example (using no visual observations). I had to minimally adapt your code to get this Env working; check the diff to see the changes. https://github.com/defrag-bambino/dreamerv3-fork
Thanks, but that's a bit too much custom code for me to have time to look into. Some ideas I can think of: the environment might not return the shapes it claims in its observation space, or it might again be trying to load an incompatible checkpoint.
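One way to rule out the first hypothesis is to step the environment once and compare each returned array against the declared space. A generic sketch under stated assumptions: `check_obs_matches_space` and the `space_shapes` mapping are illustrative stand-ins for iterating over a Gym `Dict` space's entries, not part of the repo.

```python
import numpy as np

def check_obs_matches_space(obs, space_shapes):
    """Compare a returned observation dict against declared shapes/dtypes.

    space_shapes maps key -> (shape, dtype); an illustrative stand-in for
    iterating over a Gym Dict observation space's entries.
    """
    problems = []
    for key, (shape, dtype) in space_shapes.items():
        arr = np.asarray(obs[key])
        if arr.shape != shape:
            problems.append(f'{key}: shape {arr.shape} != declared {shape}')
        if arr.dtype != dtype:
            problems.append(f'{key}: dtype {arr.dtype} != declared {dtype}')
    return problems

declared = {'vector': ((16,), np.float32)}
good = {'vector': np.zeros(16, dtype=np.float32)}
bad = {'vector': [np.zeros(16, dtype=np.float32)]}  # a one-element list!

assert check_obs_matches_space(good, declared) == []
assert check_obs_matches_space(bad, declared)  # flags shape (1, 16)
```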
OK, thanks. I will try to debug it further and report back if there is any news.
Hi Danijar, I have found the issue. It was indeed my own code. I had written a wrapper that allows using multiple envs. Although I had it set to only one env for trying DreamerV3, the returned reward and observations were still one-element lists instead of the actual values. So basically `[obs]` instead of `obs`.
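For anyone hitting the same symptom, the fix amounts to unwrapping the one-element lists before handing them to DreamerV3. A minimal sketch (`unwrap_single_env` is an illustrative name, not part of the repo or of the fork above):

```python
import numpy as np

def unwrap_single_env(obs, reward):
    """Unwrap the one-element lists a multi-env wrapper returns when it is
    configured for a single environment (illustrative sketch)."""
    if isinstance(obs, list) and len(obs) == 1:
        obs, reward = obs[0], reward[0]
    return obs, reward

# The buggy wrapper returned [obs] and [reward] instead of obs and reward:
wrapped_obs = [np.zeros(16, dtype=np.float32)]
wrapped_reward = [0.1]

obs, reward = unwrap_single_env(wrapped_obs, wrapped_reward)
assert obs.shape == (16,)   # the bare (16,) vector DreamerV3 expects
assert reward == 0.1
```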
Thanks again for creating this awesome algorithm.
Hello, first of all I would also like to thank you for publicly sharing your research code.
I am currently trying to run DreamerV3 on my custom environment, which was built using Unity3D's ML-Agents and wrapped as a Gym env. After some issues with the shapes of my action and observation spaces, which I think I have fixed now, I am still having issues with the dimensions of checkpoints. The issue occurs right at the beginning of training, when the agent prefills its train dataset and the first checkpoint is saved. The full output is here:
```shell
python3 example.py
[UnityMemory] Configuration Parameters - Can be set up in boot.config
    "memorysetup-bucket-allocator-granularity=16"
    "memorysetup-bucket-allocator-bucket-count=8"
    "memorysetup-bucket-allocator-block-size=4194304"
    "memorysetup-bucket-allocator-block-count=1"
    "memorysetup-main-allocator-block-size=16777216"
    "memorysetup-thread-allocator-block-size=16777216"
    "memorysetup-gfx-main-allocator-block-size=16777216"
    "memorysetup-gfx-thread-allocator-block-size=16777216"
    "memorysetup-cache-allocator-block-size=4194304"
    "memorysetup-typetree-allocator-block-size=2097152"
    "memorysetup-profiler-bucket-allocator-granularity=16"
    "memorysetup-profiler-bucket-allocator-bucket-count=8"
    "memorysetup-profiler-bucket-allocator-block-size=4194304"
    "memorysetup-profiler-bucket-allocator-block-count=1"
    "memorysetup-profiler-allocator-block-size=16777216"
    "memorysetup-profiler-editor-allocator-block-size=1048576"
    "memorysetup-temp-allocator-size-main=4194304"
    "memorysetup-job-temp-allocator-block-size=2097152"
    "memorysetup-job-temp-allocator-block-size-background=1048576"
    "memorysetup-job-temp-allocator-reduction-small-platforms=262144"
    "memorysetup-temp-allocator-size-background-worker=32768"
    "memorysetup-temp-allocator-size-job-worker=262144"
    "memorysetup-temp-allocator-size-preload-manager=262144"
    "memorysetup-temp-allocator-size-nav-mesh-worker=65536"
    "memorysetup-temp-allocator-size-audio-worker=65536"
    "memorysetup-temp-allocator-size-cloud-worker=32768"
    "memorysetup-temp-allocator-size-gfx=262144"
[WARNING] The environment contains multiple observations. You must define allow_multiple_obs=True to receive them all. Otherwise, only the first visual observation (or vector observation if there are no visual observations) will be provided in the observation.
/home/fabian/miniconda3/envs/dreamerv3/lib/python3.8/site-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(
Encoder CNN shapes: {}
Encoder MLP shapes: {'image': (16,)}
Decoder CNN shapes: {}
Decoder MLP shapes: {'image': (16,)}
JAX devices (1): [CpuDevice(id=0)]
Policy devices: TFRT_CPU_0
Train devices: TFRT_CPU_0
Tracing train function.
Optimizer model_opt has 16,451,344 variables.
Optimizer actor_opt has 1,052,676 variables.
Optimizer critic_opt has 1,181,439 variables.
Logdir logdir/run1
Observation space:
  image        Space(dtype=float32, shape=(16,), low=-inf, high=inf)
  reward       Space(dtype=float32, shape=(), low=-inf, high=inf)
  is_first     Space(dtype=bool, shape=(), low=False, high=True)
  is_last      Space(dtype=bool, shape=(), low=False, high=True)
  is_terminal  Space(dtype=bool, shape=(), low=False, high=True)
Action space:
  action       Space(dtype=float32, shape=(2,), low=-1.0, high=1.0)
  reset        Space(dtype=bool, shape=(), low=False, high=True)
Prefill train dataset.
Episode has 61 steps and return 0.1.
Episode has 55 steps and return 0.1.
Episode has 73 steps and return 0.1.
Episode has 55 steps and return 0.1.
Episode has 74 steps and return 0.1.
Episode has 96 steps and return 0.4.
Episode has 56 steps and return 0.1.
Episode has 50 steps and return 0.1.
Episode has 50 steps and return 0.0.
Episode has 45 steps and return 0.0.
Episode has 62 steps and return 0.1.
Episode has 76 steps and return 0.2.
Episode has 53 steps and return 0.1.
Episode has 57 steps and return 0.1.
Episode has 84 steps and return 0.2.
Saved chunk: 20230224T123638F338048-7fz2YQGpaWRhCvMc8sKBIS-4NtbeiuY5nlHsbS33AqTMd-1024.npz
Episode has 69 steps and return 0.1.
──────────────────────────── Step 1100 ────────────────────────────
episode/length 69 / episode/score 0.13 / episode/sum_abs_reward 0.13 / episode/reward_rate 0
Creating new TensorBoard event file writer.
Did not find any checkpoint.
Writing checkpoint: logdir/run1/checkpoint.ckpt
Start training loop.
Saved chunk: 20230224T123818F856929-4NtbeiuY5nlHsbS33AqTMd-0000000000000000000000-76.npz
Wrote checkpoint: logdir/run1/checkpoint.ckpt
Error writing summary: stats/policy_image
Tracing policy function.
╭─────────────── Traceback (most recent call last) ───────────────╮
│ /home/fabian/Desktop/dreamerv3/example.py:53 in
```

Thanks