isaac-sim / OmniIsaacGymEnvs

Reinforcement Learning Environments for Omniverse Isaac Gym

6x slower compared to IsaacGymEnvs #12

Open imoneoi opened 2 years ago

imoneoi commented 2 years ago

Why is the OmniIsaacGymEnvs (Omniverse Isaac Sim) version about 6x slower than the IsaacGymEnvs (Isaac Gym Preview) version on the same hardware?

Tested on Humanoid environment, headless, with 8192 environments (RTX 3080 Laptop)

IsaacGymEnvs TPS: ~260k
OmniIsaacGymEnvs TPS: ~40k

It is also observed that OmniIsaacGymEnvs reaches only ~5% GPU utilization, while IsaacGymEnvs reaches ~70%.

Test code for OmniIsaacGymEnvs

PYTHON_PATH scripts/random_policy.py task=Humanoid headless=True task.env.numEnvs=8192

(random_policy.py was modified to add a TPS counter like the one below.)

Test code for IsaacGymEnvs

import isaacgym
import isaacgymenvs
import torch

import time

device = "cuda:0"
env_num = 8192
total_steps = int(1e6)

envs = isaacgymenvs.make(
    seed=0, 
    task="Humanoid", 
    num_envs=env_num, 
    sim_device=device,
    rl_device=device,
    headless=True
)
print("Observation space is", envs.observation_space)
print("Action space is", envs.action_space)

# Simple TPS (transitions per second) counter: every 100 steps,
# report total env transitions divided by elapsed wall-clock time.
last_time = time.time()
last_steps = 0

obs = envs.reset()
for _ in range(total_steps):
    # Random actions sampled in a single batched draw on the GPU
    obs, reward, done, info = envs.step(torch.rand((env_num,) + envs.action_space.shape, device=device))

    last_steps += 1
    if last_steps % 100 == 0:
        cur_time = time.time()
        tps = last_steps * env_num / (cur_time - last_time)

        last_time = cur_time
        last_steps = 0

        print("TPS {:.1f}".format(tps))
kellyguo11 commented 2 years ago

Hi @imoneoi ,

Thanks for sharing the benchmarks. In random_policy.py, the actions are actually sampled on the CPU individually for each environment, which may be contributing to the slow performance. When we tested training on the Humanoid environment with rlgames_train.py and compared it with training in IsaacGymEnvs, we actually observed higher FPS in OmniIsaacGymEnvs.
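For illustration, the gap between per-environment CPU sampling and one batched GPU draw can be sketched like this (a standalone micro-benchmark, not code from random_policy.py; the shapes are assumptions — 8192 envs and a 21-dimensional Humanoid action space):

```python
import time
import torch

# Falls back to CPU if no GPU is available, so the sketch runs anywhere
device = "cuda:0" if torch.cuda.is_available() else "cpu"
num_envs, act_dim = 8192, 21

def per_env_sampling():
    # One small CPU tensor per environment, stacked, then moved to the device
    return torch.stack([torch.rand(act_dim) for _ in range(num_envs)]).to(device)

def batched_sampling():
    # A single batched draw allocated directly on the device
    return torch.rand((num_envs, act_dim), device=device)

for fn in (per_env_sampling, batched_sampling):
    t0 = time.perf_counter()
    actions = fn()
    print(f"{fn.__name__}: {time.perf_counter() - t0:.4f}s, shape {tuple(actions.shape)}")
```

The per-env version pays a Python-loop and allocation cost for every environment each step, while the batched version issues one kernel; the final tensors are identical in shape.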

romesco commented 1 year ago

As @kellyguo11 mentioned, if you replace line 62 with:

actions = torch.rand((cfg.task.env.numEnvs, env.action_space.shape[0]), device=task.rl_device)

you should get a TPS (by your definition in the example above) closer to ~500k.

gotta be careful with sampling on cpu :call_me_hand:
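Plugged into a TPS loop like the one in the first post, the batched fix looks like this (a runnable sketch: DummyEnv is a hypothetical stand-in for the vectorized environment so no Isaac Sim install is needed, and act_dim is again an assumption):

```python
import time
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"
num_envs, act_dim = 8192, 21  # act_dim assumed for Humanoid

class DummyEnv:
    """Hypothetical stand-in for the vectorized environment."""
    def step(self, actions):
        obs = torch.zeros((actions.shape[0], 1), device=actions.device)
        return obs, None, None, None

env = DummyEnv()
last_time, last_steps = time.time(), 0
for _ in range(200):
    # Single batched draw on the sim device, per the fix above
    actions = torch.rand((num_envs, act_dim), device=device)
    obs, reward, done, info = env.step(actions)
    last_steps += 1
    if last_steps == 100:
        cur_time = time.time()
        print("TPS {:.1f}".format(last_steps * num_envs / (cur_time - last_time)))
        last_time, last_steps = cur_time, 0
```

The only substantive change from the original loop is that actions are drawn in one call on the sim device instead of per environment on the CPU.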