A question about the step function

chaobiubiu commented 5 years ago

Assuming that the current state is s0, what should I do if I only want to obtain the subsequent states s1 and s2 without actually moving the agent into them? I think this operation is similar to the lookahead in Monte Carlo Tree Search as used in AlphaGo. However, I can't solve this problem myself, and I sincerely ask for your help. For example:

    import deepmind_lab
    env = deepmind_lab.Lab(level_name, observation_format, string_args, renderer)
    reward = env.step()

In fact, I can't find the implementation of the step function, so I can't try to add a new function that returns the next state.

sjtuytc commented 5 years ago

I guess the easiest thing is to use a workaround. For example: 1) create another environment and perform the actions ahead of time there; 2) adjust your algorithm so that it only uses the previous state and the state before that.
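A minimal sketch of the first workaround, in case it helps. It records every action taken in the real environment and, to look ahead, builds a second environment, replays the recorded actions, and then steps the candidate actions there. `ToyEnv` and `LookaheadWrapper` are hypothetical names made up for this illustration and are not part of deepmind_lab; `ToyEnv` is a deterministic stand-in so the example runs on its own. The approach only works if the environment replays identically from a reset, which is an assumption you would need to verify for a real DeepMind Lab level (same level, same config, same seed).

```python
class ToyEnv:
    """Hypothetical deterministic stand-in for a real environment."""

    def __init__(self, seed):
        self._seed = seed
        self._state = seed

    def step(self, action):
        # Deterministic transition: the next state depends only on the
        # current state and the action.
        self._state = (self._state * 31 + action) % 1009
        return float(self._state % 7)  # reward

    def observation(self):
        return self._state


class LookaheadWrapper:
    """Wraps a real environment and peeks ahead via a replayed shadow copy."""

    def __init__(self, make_env):
        self._make_env = make_env  # factory returning a freshly reset env
        self._env = make_env()
        self._history = []         # actions taken so far in the real env

    def step(self, action):
        self._history.append(action)
        return self._env.step(action)

    def observation(self):
        return self._env.observation()

    def peek(self, actions):
        """Return [(observation, reward), ...] for `actions` without
        advancing the real environment."""
        shadow = self._make_env()
        for past_action in self._history:  # catch up to the current state
            shadow.step(past_action)
        results = []
        for action in actions:
            reward = shadow.step(action)
            results.append((shadow.observation(), reward))
        return results


env = LookaheadWrapper(lambda: ToyEnv(seed=3))
env.step(1)                    # the agent really moves from s0
before = env.observation()
future = env.peek([2, 0])      # s1 and s2, computed in the shadow env
assert env.observation() == before  # the real env did not move
```

Replaying from the start costs time proportional to the episode length for every peek, so for deep searches it may be cheaper to keep one shadow environment per search branch alive instead of rebuilding it each time.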

tkoeppe commented 4 years ago

The step Python function is implemented in the Python extension here: https://github.com/deepmind/lab/blob/master/python/dmlab_module.c#L379-L431

However, I'm not sure if that helps; you're essentially asking for "save and rewind" semantics, which I don't think we can provide in any obvious way. That is, we can't "save the state of the world" and then resume from it later; at least I wouldn't know how to make ioq3 do that.

google-deepmind/lab, issue #143