Receiving Error after about 2248 frames in 60fps - lab.observations()

Amir-Ramezani commented 7 years ago

Hello and thanks for your package.

I am trying to use your package with a DQN-like agent, but I receive the following error after about 2248 steps, when I call observations() to get the state. The simple modified demo code below produces a similar error:

Error:

2246
2247
2248
    obs = lab.observations()  # dict of Numpy arrays
RuntimeError: Environment in wrong status for call to observations()
ERROR: Non-zero return code '1' from command: Process exited with status 1.

Main code that produced the error:

import deepmind_lab
import numpy as np
import cv2
import time

lab = deepmind_lab.Lab(
    'nav_maze_random_goal_02', ['RGBD_INTERLACED'],
    config={
        'fps': str(60),
        'width': str(320),
        'height': str(240)
    })
lab.reset()

noobaction = np.zeros([7], dtype=np.intc)
forward_action = np.array([0, 0, 0, 1, 0, 0, 0], dtype=np.intc)
backward_action = -forward_action
look_left = np.array([20, 0, 0, 0, 0, 0, 0], dtype=np.intc)  # this actually rotates right
look_right = -look_left
strafe_left = np.array([0, 0, 1, 0, 0, 0, 0], dtype=np.intc)  # this actually slides to the right side
strafe_right = -strafe_left

for action_counter in range(0, 10000):

    if action_counter == 0:
        reward = lab.step(noobaction, num_steps=4)
    elif action_counter > 1 and action_counter < 90:
        reward = lab.step(look_left, num_steps=4)
    elif action_counter > 95 and action_counter < 100:
        reward = lab.step(forward_action, num_steps=4)
    else:
        reward = lab.step(noobaction, num_steps=4)

    #if not lab.is_running():
    #  print('Environment stopped early')
    #  lab.reset()
    obs = lab.observations()  # dict of Numpy arrays

    rgb_i = obs['RGBD_INTERLACED']
    assert rgb_i.shape == (240, 320, 4)
    #print(rgb_i.shape)
    rgb_image = np.array(rgb_i[0:240, 0:320, 0:3])
    depth_image = np.array(rgb_i[0:240, 0:320, 3])
    depth_image = 1 - depth_image
    #print(npx)
    cv2.imshow("RGB", rgb_image)
    cv2.imshow("Depth", depth_image)
    print(action_counter)
    cv2.waitKey(1)

Please note that I have commented out the reset() call. I am able to reset the environment as in your 'random agent' code, but I think I need the episode to continue rather than restart, possibly indefinitely.

Please tell me what the problem is and what I forgot to do to make the code work.

Regards,

jingweiz commented 7 years ago

Hey, I'm getting the same error at around 1199 steps. My code structure is quite similar to the one from @AmirCognitive so it's not necessary to paste it here. What might cause this problem? Thanks in advance!

jingweiz commented 7 years ago

The only clues are:

This warning that I got:

WARNING: Output base '/home/me/.cache/bazel/_bazel_me/2f9f084266361be2ea70187ad5c14f49' is on NFS. This may lead to surprising failures and undetermined behavior.

When I run ps aux | grep bazel, I get the following:

...    Ssl  bazel(deepnet) -XX:+HeapDumpOnOutOfMemoryError

If I call env.reset() every once in a while, the problem does not occur. In the code that currently produces the error, env.reset() is only called once, at the beginning.

bodgergely commented 7 years ago

If you look inside the file lab/python/dmlab_module.c, you can see that this error is raised when the environment is no longer running.

static PyObject* Lab_observations(LabObject* self) {
  PyObject* result = NULL;
  PyArrayObject* array = NULL;

  if (!is_running(self)) {
    PyErr_SetString(PyExc_RuntimeError,
                    "Environment in wrong status for call to observations()");
    return NULL;
  }

Also, if you dig deeper, you can see that is_running returns a value of type EnvCApi_EnvironmentStatus, which can be EnvCApi_EnvironmentStatus_Terminated, EnvCApi_EnvironmentStatus_Running, etc. My guess is that when the function

static EnvCApi_EnvironmentStatus dmlab_advance(void* context, int num_steps, double* reward)

was called on your 2248th iteration, it returned EnvCApi_EnvironmentStatus_Terminated because episode_ended was true:

episode_ended = ctx->hooks.has_episode_finished(
    ctx->userdata,
    gc->total_engine_time_msec / (kEngineTimePerExternalTime * 1000.0));

Please look into the file engine/code/deepmind/dmlab_connect.c.

My bet is that your environment timed out. The episode timeout can be set in the .lua files; otherwise it seems to fall back to the default in ./engine/context.cc:

constexpr double kDefaultEpisodeLengthSeconds = 5 * 30.0;
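As a rough sanity check of the timeout theory, the arithmetic below converts the default episode length into a number of step() calls. It assumes the 60 fps and num_steps=4 settings from the original post and the 5 * 30.0 second default quoted above; the helper name steps_until_timeout is made up for illustration and is not part of DeepMind Lab.

```python
def steps_until_timeout(episode_seconds, fps, frames_per_step):
    """Number of step() calls that fit into an episode of the given length."""
    total_frames = episode_seconds * fps
    return total_frames / frames_per_step


# Default episode length quoted from context.cc (150 seconds).
default_episode_seconds = 5 * 30.0

# Settings taken from the original post: 60 fps, num_steps=4 per call.
calls = steps_until_timeout(default_episode_seconds, fps=60, frames_per_step=4)
print(calls)  # 2250.0
```

Since the loop counter in the original post starts at 0, a failure right after printing 2248 would correspond to the 2250th call to step(), which appears consistent with a 150 second timeout.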

charlesbeattie commented 6 years ago

There is a default timeout. Please add the following function to the level script you are running:

function api:hasEpisodeFinished(timeSeconds)
  return false
end

Also, on every call to step, check that env.is_running() is true, and call reset() if it is false.
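A minimal sketch of that check is given here. So that it runs without DeepMind Lab installed, it uses a hypothetical stand-in class, FakeLab, which simply ends the episode after a fixed number of frames; only the method names step, is_running, reset and observations come from this thread. With the real package, the same loop would presumably be used with the deepmind_lab.Lab object instead.

```python
class FakeLab:
    """Hypothetical stand-in for deepmind_lab.Lab so the sketch runs anywhere.

    It ends the episode after a fixed number of frames, mimicking a timeout.
    """

    def __init__(self, frames_per_episode):
        self._frames_per_episode = frames_per_episode
        self._frames_left = 0  # not running until reset() is called

    def reset(self):
        self._frames_left = self._frames_per_episode

    def is_running(self):
        return self._frames_left > 0

    def step(self, action, num_steps=1):
        if not self.is_running():
            raise RuntimeError("episode is not running")
        self._frames_left -= num_steps
        return 0.0  # dummy reward

    def observations(self):
        if not self.is_running():
            raise RuntimeError(
                "Environment in wrong status for call to observations()")
        return {"RGBD_INTERLACED": None}  # placeholder observation


def run(env, total_steps, num_steps=4):
    """Step the environment, resetting whenever the episode has ended."""
    resets = 0
    env.reset()
    for _ in range(total_steps):
        env.step(None, num_steps=num_steps)
        if not env.is_running():
            # The episode ended (for example by timeout): start a new one
            # before asking for observations.
            env.reset()
            resets += 1
        obs = env.observations()
    return resets


# 9000 frames per episode corresponds to 150 seconds at 60 fps.
resets = run(FakeLab(frames_per_episode=9000), total_steps=10000)
print(resets)
```

Without the is_running() check, the observations() call in this sketch raises the same RuntimeError as in the original report once the episode has ended.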

tkoeppe commented 6 years ago

Please reopen if there are any further issues.
