Questions regarding mujoban environment

I have two questions regarding the mujoban environment:

Warnings and errors

Let us set up the environment as explained in the README and perform some random actions:

from dm_control import composer
from dm_control.locomotion import walkers
from physics_planning_games.mujoban.mujoban import Mujoban
from physics_planning_games.mujoban.mujoban_level import MujobanLevel
from physics_planning_games.mujoban.boxoban import boxoban_level_generator
import numpy as np

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

walker = walkers.JumpingBallWithHead(add_ears=True, camera_height=0.25)
maze = MujobanLevel(boxoban_level_generator)
task = Mujoban(walker=walker,
               maze=maze,
               control_timestep=0.1,
               top_camera_height=96,
               top_camera_width=96)
env = composer.Environment(time_limit=1000, task=task)
action_spec = env.action_spec()
for _ in range(200):  # episodes
    env.reset()
    for _ in range(50):  # random steps per episode
        action = np.random.uniform(low=action_spec.minimum, high=action_spec.maximum)
        env.step(action)

This will spew various warnings, but I am mostly concerned about the following two:

WARNING:absl:Pre-allocated contact buffer is full. Increase nconmax above 100. Time = 0.0000.
WARNING:absl:Physics state is invalid. Warning(s) raised: mjWARN_CONTACTFULL

Apparently, the contact buffer fills up and the simulation starts to misbehave. Occasionally, the walker passes through a wall and an exception is thrown here: https://github.com/deepmind/deepmind-research/blob/2c7c401024c42c4fb1aa20a8b0471d2e6b480906/physics_planning_games/mujoban/mujoban.py#L389

How am I supposed to handle this? Should I simply increase nconmax? How would I go about doing that? The environment is not really usable for me right now.

Resetting of auxiliary episodes

In "Physically Embedded Planning Problems: New Challenges for Reinforcement Learning" the authors state that

Auxiliary task episodes are reset when the agent reaches the target or when the time limit of the auxiliary episode is reached

How should I interpret this statement? Does it mean that, whenever the abstract state changes, the physical state is altered in such a way that the walker and boxes are centered in their respective fields? I have not found any method in the environment that would perform this action.

Thank you for your help!

Best, Markus

For anyone experiencing a similar issue: If we do

task._arena.mjcf_model.size.nconmax = 200

it seems to work. I doubt that this is the right way to do it, but it's a quick fix.
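A slightly less intrusive variant of the same workaround, as a minimal sketch: it goes through the task's public `root_entity` property instead of the private `_arena` attribute. This assumes that `root_entity` returns the arena for the Mujoban task, which I have not verified; `raise_contact_limit` is a hypothetical helper name, not part of dm_control.

```python
def raise_contact_limit(task, nconmax=200):
    """Raise the contact buffer size on the task's root MJCF model.

    Assumption: task.root_entity.mjcf_model is the model that gets compiled
    into the physics, as is usual for dm_control composer tasks.
    """
    model = task.root_entity.mjcf_model
    # Corresponds to <size nconmax="..."/> in the generated MJCF.
    model.size.nconmax = nconmax
    return model.size.nconmax
```

Calling `raise_contact_limit(task)` before constructing `composer.Environment` should be the safest ordering, since the model is then compiled with the larger buffer from the start.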

Hi Markus,

I am not able to reproduce your error. Could you please share which system you are running on and which versions of the dependent libraries you are using? Please also check that they match the requirements of each package.

Regarding your second question, this refers to how the agent handles the auxiliary task and has no effect on the environment itself. The "reset" refers to resetting the auxiliary task episode, which results in the agent receiving a new auxiliary target while the main underlying task remains the same. This is all done inside the agent by manipulating the observations and rewards; the environment itself is untouched. This repository contains only the environments; we have not included the agent code.
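To make the idea concrete, here is a minimal sketch of what such agent-side bookkeeping could look like. It is not the agent code from the paper: the class name, the distance-based "reached" test, the sparse reward, and the way targets are sampled are all assumptions made for illustration.

```python
import random


class AuxiliaryTaskTracker:
    """Agent-side auxiliary episode bookkeeping (hypothetical sketch).

    The environment is never reset here; only the auxiliary target and
    the auxiliary step counter change.
    """

    def __init__(self, candidate_targets, time_limit, tolerance=0.5, seed=0):
        self._targets = list(candidate_targets)
        self._time_limit = time_limit
        self._tolerance = tolerance
        self._rng = random.Random(seed)
        self._new_episode()

    def _new_episode(self):
        # Start a new auxiliary episode: pick a target, restart the clock.
        self.target = self._rng.choice(self._targets)
        self._steps = 0

    def update(self, walker_xy):
        """Call once per environment step; returns (aux_reward, was_reset)."""
        self._steps += 1
        dx = walker_xy[0] - self.target[0]
        dy = walker_xy[1] - self.target[1]
        reached = (dx * dx + dy * dy) ** 0.5 <= self._tolerance
        timed_out = self._steps >= self._time_limit
        reward = 1.0 if reached else 0.0
        if reached or timed_out:
            self._new_episode()
            return reward, True
        return reward, False
```

In this sketch the agent would append `tracker.target` to its observation and add the returned auxiliary reward to its training signal, while `env.step` and `env.reset` are called exactly as before.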

Thanks for the response! I have tried several different versions of dm_control, dm_env, labmaze etc., especially those around the publication date of the paper. Currently I am using more recent versions and I am still experiencing the issue.

I am running:

Ubuntu 20.04.2 LTS
dm-control==0.0.364896371
MuJoCo 200

The dependencies seem to be satisfied and pip check says there are no conflicts. The dependencies of dm_control:

Requirement already satisfied: dm-control==0.0.364896371 in ./env/lib/python3.8/site-packages (0.0.364896371)
Requirement already satisfied: protobuf>=3.15.6 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (3.17.0)
Requirement already satisfied: future in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (0.18.2)
Requirement already satisfied: requests in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (2.24.0)
Requirement already satisfied: setuptools!=50.0.0 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (44.0.0)
Requirement already satisfied: dm-env in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (1.5)
Requirement already satisfied: scipy in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (1.5.4)
Requirement already satisfied: lxml in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (4.6.3)
Requirement already satisfied: pyopengl>=3.1.4 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (3.1.5)
Requirement already satisfied: h5py in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (3.4.0)
Requirement already satisfied: tqdm in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (4.59.0)
Requirement already satisfied: absl-py>=0.7.0 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (0.14.0)
Requirement already satisfied: glfw in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (2.1.0)
Requirement already satisfied: pyparsing in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (2.4.7)
Requirement already satisfied: numpy>=1.9.0 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (1.19.5)
Requirement already satisfied: dm-tree!=0.1.2 in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (0.1.6)
Requirement already satisfied: labmaze in ./env/lib/python3.8/site-packages (from dm-control==0.0.364896371) (1.0.5)
Requirement already satisfied: six>=1.9 in ./env/lib/python3.8/site-packages (from protobuf>=3.15.6->dm-control==0.0.364896371) (1.16.0)
Requirement already satisfied: certifi>=2017.4.17 in ./env/lib/python3.8/site-packages (from requests->dm-control==0.0.364896371) (2020.12.5)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in ./env/lib/python3.8/site-packages (from requests->dm-control==0.0.364896371) (1.25.11)
Requirement already satisfied: chardet<4,>=3.0.2 in ./env/lib/python3.8/site-packages (from requests->dm-control==0.0.364896371) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in ./env/lib/python3.8/site-packages (from requests->dm-control==0.0.364896371) (2.10)

google-deepmind / deepmind-research

Questions regarding mujoban environment #307
