Closed snailrowen1337 closed 3 years ago
After looking at https://github.com/deepmind/dm_control/issues/64, it seems like there might be issues with the warmstart buffer. So instead of manually stepping the environment, consider the following example, where I create two separate environments and render each of them twice:
import numpy as np
from dm_control import suite

def create():
    env = suite.load(domain_name='cartpole', task_name='swingup')
    state = np.array([1.3, 5.3, 0.1, 2.3])
    env.reset()
    phys = env.physics
    phys.set_state(state)
    # Render the same physics state twice.
    obs1 = phys.render()
    obs2 = phys.render()
    h = lambda img: hash(img.data.tobytes())
    print('>>>> should be equal', h(obs1), h(obs2))

# Two independent environments, set to the same state.
create()
create()
When doing this I get
>>>> should be equal -6963634081576593535 -6963634081576593535
>>>> should be equal -9021884202509515482 -9021884202509515482
So within a given environment, the rendering seems deterministic. But creating two environments and setting their states to the same value does not give the same rendering with OSMesa. I was hoping that this setup would eliminate issues with the warmstart buffer. Am I misunderstanding something here? Thanks!!
Sorry, this seems to be resolved once I set seeds properly. The original culprit seems to have been the warmstart buffer.
Ok, so when setting the seeds properly, the rendering is consistent within a process, but not between processes. Consider the following code:
import numpy as np
from dm_control import suite

def create():
    # Fix the task's random seed so both environments start identically.
    env = suite.load(domain_name='cartpole', task_name='swingup', task_kwargs={'random': 32})
    state = np.array([1.3, 5.3, 0.1, 2.3])
    action = np.array([0.3])
    phys = env.physics
    env.reset()
    phys.set_state(state)
    env.step(action)
    # Render the same physics state twice.
    obs1 = phys.render()
    obs2 = phys.render()
    h = lambda img: hash(img.data.tobytes())
    print('>>>> should be equal', h(obs1), h(obs2))

create()
create()
If I run this once, e.g. with python test.py, the results are
>>>> should be equal 8297706552909453184 8297706552909453184
>>>> should be equal 8297706552909453184 8297706552909453184
So far so good. But if I run it again, I get:
>>>> should be equal -922062236587031648 -922062236587031648
>>>> should be equal -922062236587031648 -922062236587031648
Any ideas what might be going on here? Thanks!
Python hashing does not seem to be consistent across processes. This is resolved!
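For anyone hitting the same thing: Python's built-in hash() on bytes is salted per interpreter process by default (unless PYTHONHASHSEED is fixed), so its values cannot be compared between runs. A minimal sketch of a process-independent alternative, assuming a content digest from hashlib is an acceptable replacement for the h lambda above; the name stable_digest is hypothetical and not part of dm_control:

```python
import hashlib


def stable_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of raw bytes.

    Unlike the built-in hash(), this depends only on the byte content,
    so the value is the same in every Python process.
    """
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    # For a rendered frame (a NumPy array), pass its raw bytes, e.g.
    #   stable_digest(obs1.tobytes())
    # A plain bytes object is used here so the sketch runs without dm_control.
    print(stable_digest(b"abc"))
```

With this in place of h, two runs of the script that render identical pixels should print identical digests, so any remaining difference between runs would point at the rendering rather than at the hashing.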
I am running DeepMind Control and rendering with OSMesa. When setting the states, the rendered results are not consistent. Consider the example below, where I manually set the state before and after taking an action. Since I set the state, I would expect the same output from the renderer:
However, the rendered images are not the same and I get:
When I comment out env.step(action), I get consistent results, so it's not an issue with the rendering itself. Is there an issue with my installation, or am I misunderstanding the semantics?