Toni-SM / skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab
https://skrl.readthedocs.io/
MIT License

Wall clock time in Isaac Gym benchmarks? #41

Closed by ArthurAllshire 1 year ago

ArthurAllshire commented 1 year ago

NVIDIA Isaac Gym

| Environment | PPO |
| --- | --- |
| Allegro Hand [1] | 3942.69 |
| Ant | 5466.3 +/- 279.61 |
| Anymal | 61.86 +/- 1.81 |
| Anymal Terrain | 19.82 +/- 0.57 |
| Ball Balance | 288.07 +/- 25.54 |
| Cartpole [2] | 494.34 +/- 0.87 |
| Franka Cabinet | 3134.0 +/- 182.99 |
| Humanoid | 6474.34 +/- 696.27 |
| Ingenuity | 7066.82 +/- 488.97 |
| Quadcopter | 1237.75 +/- 127.05 |
| Shadow Hand | 7898.38 +/- 180.75 |

| Environment | AMP |
| --- | --- |
| Humanoid | 295.65 +/- 0.86 |

The following charts show the mean episode length in timesteps (left) and the mean total reward (right).

[Charts, one pair per task: Allegro Hand (PPO), Ant (PPO), Anymal (PPO), Anymal Terrain (PPO), Ball Balance (PPO), Cartpole (PPO), Franka Cabinet (PPO), Humanoid (PPO), Humanoid (AMP: imitate different pre-recorded human animations), Ingenuity (PPO), Quadcopter (PPO), Shadow Hand (PPO)]

Originally posted by @Toni-SM in https://github.com/Toni-SM/skrl/discussions/32#discussioncomment-3774815

Hi, I was looking through the Isaac Gym benchmark results above and was wondering whether you could also provide the wall clock time for them, or whether you have any information about how long each of them took to train. For Isaac Gym training, wall clock time is the critical variable for me to understand the performance, because the number of environments can be varied. Thanks :)

Toni-SM commented 1 year ago

Hi @ArthurAllshire

I still have to process the data... but in the following file you can find the time measurements for all the experiments shown in the charts. The experiments were performed on a Quadro RTX 6000 (if I remember correctly).

time.txt

The time is in seconds and is measured over the whole script execution (environment loading + learning).

For example:

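# ppo_cartpole: run the training script 10 times and log each run's duration
for i in {1..10}
do
    # Wall clock time in seconds since the epoch, before and after the run
    start=$(date +%s)
    python ppo_cartpole.py headless=True
    end=$(date +%s)
    # Append "<experiment> <elapsed seconds>" to time.txt
    echo "ppo_cartpole $((end-start))" >> time.txt
done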
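
Since the raw timings still need processing, here is a minimal sketch of how they could be summarized per experiment. It assumes every line of time.txt has the form `<experiment> <seconds>`, as written by the loop above; the attached file may be laid out differently. `summarize_times` is a hypothetical helper name and not part of skrl.

```shell
# Hypothetical helper (not part of skrl): summarize a timing file.
# Assumes each line is "<experiment> <seconds>", as produced by the loop above.
summarize_times() {
    awk '
        # Accumulate count, sum and sum of squares per experiment name
        NF == 2 { n[$1]++; sum[$1] += $2; sumsq[$1] += $2 * $2 }
        END {
            for (name in n) {
                mean = sum[name] / n[name]
                var = sumsq[name] / n[name] - mean * mean   # population variance
                if (var < 0) var = 0                        # guard against rounding error
                printf "%s %.2f +/- %.2f (n=%d)\n", name, mean, sqrt(var), n[name]
            }
        }
    ' "$1" | sort
}
```

Calling `summarize_times time.txt` would print one line per experiment with the mean and standard deviation of the run time in seconds. The standard deviation here is the population form (dividing by n); whether that matches the +/- values in the benchmark table is not stated in the thread.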