jurgisp / memory-maze

Evaluating long-term memory of reinforcement learning algorithms
MIT License

env step time #26

Closed Howuhh closed 1 year ago

Howuhh commented 1 year ago

Hi! Just want to double-check. I noticed that rollouts take a lot longer than I expected, so I wrote a simple test with random actions:

```python
import gym
import memory_maze
from tqdm.auto import trange

def rollout(env):
    done = False
    obs = env.reset()
    total_reward = 0.0
    while not done:
        obs, reward, done, _ = env.step(env.action_space.sample())
        total_reward += reward
    return total_reward

env = gym.make("MemoryMaze-9x9-v0")

for _ in trange(10):
    rollout(env)
```

On my M1 this takes almost 2 minutes for just 10 random rollouts. Is that normal? It seems very slow (~80 fps).

jurgisp commented 1 year ago

Yes, ~80 fps sounds reasonable. That comes down to the MuJoCo engine; see e.g. #openai/mujoco-py/issues/277

The way to speed it up for larger-scale RL training is to run multiple environment workers in separate processes.
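As a rough illustration of the multi-process idea, here is a minimal sketch using Python's standard `multiprocessing.Pool`. It is not the maintainers' training setup: the helper names `random_steps`, `worker`, and `parallel_fps` are hypothetical, and the code assumes the same 4-tuple `env.step` API used in the snippet above.

```python
import multiprocessing as mp
import time


def random_steps(env, n_steps):
    """Take n_steps random actions, resetting at episode ends; return the step count."""
    env.reset()
    taken = 0
    for _ in range(n_steps):
        _, _, done, _ = env.step(env.action_space.sample())
        taken += 1
        if done:
            env.reset()
    return taken


def worker(env_id, n_steps):
    # Import and build the environment inside the child process so that each
    # worker owns its own simulator and rendering context.
    import gym
    import memory_maze  # noqa: F401  (importing registers the MemoryMaze-* ids)

    env = gym.make(env_id)
    return random_steps(env, n_steps)


def parallel_fps(env_id="MemoryMaze-9x9-v0", n_workers=4, n_steps=1000):
    """Aggregate steps per second across n_workers processes.

    The wall-clock time includes process and environment start-up, so use a
    large n_steps for a meaningful number.
    """
    start = time.perf_counter()
    with mp.Pool(n_workers) as pool:
        counts = pool.starmap(worker, [(env_id, n_steps)] * n_workers)
    return sum(counts) / (time.perf_counter() - start)
```

To try it, call `parallel_fps(n_workers=4)` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork (macOS included). How well throughput scales with the number of workers depends on the available CPU cores and the rendering backend, so it is worth measuring rather than assuming.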

jurgisp commented 1 year ago

Also note that MuJoCo has several rendering backends, so you can try setting MUJOCO_GL to glfw, egl, or osmesa and see if it makes a difference. EGL is supposed to be the fastest where it is supported, and we default to it.
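One way to compare the backends is to time a fixed number of random steps under each setting. The sketch below is only an illustration: `measure_fps` is a hypothetical helper, and it assumes the 4-tuple `env.step` API from the original snippet. The variable has to be set before `memory_maze` (and the `dm_control` code it builds on) is imported, so each backend is best tested in a fresh Python process.

```python
import os
import time

# Respect a value passed on the command line, e.g. `MUJOCO_GL=osmesa python bench.py`,
# and fall back to egl otherwise. This must run before importing memory_maze.
os.environ.setdefault("MUJOCO_GL", "egl")


def measure_fps(env, n_steps=1000):
    """Return environment steps per second under random actions."""
    env.reset()
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, done, _ = env.step(env.action_space.sample())
        if done:
            env.reset()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return n_steps / elapsed


# Example usage (requires gym and memory_maze to be installed):
#   import gym
#   import memory_maze
#   env = gym.make("MemoryMaze-9x9-v0")
#   print(os.environ["MUJOCO_GL"], measure_fps(env))
```

Running the script once per backend and comparing the printed numbers shows whether the rendering setting matters on a given machine. Not every backend is available on every platform, so an import or context-creation error for a particular value is possible.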