pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License

[Feature,Doc] `get_stateful_net` and document MARL initialization #2309

Closed: vmoens closed this 1 month ago

vmoens commented 1 month ago

Stack from ghstack (oldest at bottom):

pytorch-bot[bot] commented 1 month ago

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2309

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Pending, 3 Unrelated Failures

As of commit 02e9f05e01c7308fc32bce77fa7b06f7d5976559 with merge base 59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad:

NEW FAILURES - The following jobs have failed:

* [Examples Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2309#27794788029) ([gh](https://github.com/pytorch/rl/actions/runs/10056294444/job/27794788029)) `RuntimeError: Command docker exec -t e9f1e6e8aa955910c09ac75d4cfea668c9cc57c1eba0e8d2557655ced9ae188f /exec failed with exit code 1`
* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2309#27794787447) ([gh](https://github.com/pytorch/rl/actions/runs/10056294447/job/27794787447)) `RuntimeError: Command docker exec -t 4732eb21679facc86defe8ee21f37d3d7f1fbafe72c2f5e570994a3d934b505e /exec failed with exit code 139`

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2309#27794795017) ([gh](https://github.com/pytorch/rl/actions/runs/10056294469/job/27794795017)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753415500)) `AttributeError: module 'torch' has no attribute 'compiler'`
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2309#27794791475) ([gh](https://github.com/pytorch/rl/actions/runs/10056294419/job/27794791475)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753414102)) `AttributeError: module 'torch' has no attribute 'compiler'`
* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2309#27794787317) ([gh](https://github.com/pytorch/rl/actions/runs/10056294446/job/27794787317)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753409046)) `test/test_transforms.py::TestActionDiscretizer::test_trans_parallel_env_check[False]`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions[bot] commented 1 month ago

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 14. Worsened: 3.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 59.5294ms | 58.1638ms | 17.1928 Ops/s | 16.9926 Ops/s | +1.18% |
| test_sync | 39.7765ms | 33.1141ms | 30.1986 Ops/s | 31.8277 Ops/s | **-5.12%** |
| test_async | 52.5918ms | 30.2620ms | 33.0447 Ops/s | 32.7252 Ops/s | +0.98% |
| test_simple | 0.4986s | 0.4185s | 2.3895 Ops/s | 2.3951 Ops/s | -0.23% |
| test_transformed | 0.6401s | 0.5757s | 1.7370 Ops/s | 1.7179 Ops/s | +1.11% |
| test_serial | 1.3351s | 1.2666s | 0.7895 Ops/s | 0.7823 Ops/s | +0.92% |
| test_parallel | 1.1768s | 1.1094s | 0.9014 Ops/s | 0.8916 Ops/s | +1.10% |
| test_step_mdp_speed[True-True-True-True-True] | 0.2120ms | 25.4922μs | 39.2276 KOps/s | 39.1991 KOps/s | +0.07% |
| test_step_mdp_speed[True-True-True-True-False] | 69.0890μs | 14.8028μs | 67.5546 KOps/s | 66.6443 KOps/s | +1.37% |
| test_step_mdp_speed[True-True-True-False-True] | 59.0990μs | 14.7928μs | 67.6003 KOps/s | 67.9174 KOps/s | -0.47% |
| test_step_mdp_speed[True-True-True-False-False] | 56.4550μs | 8.6775μs | 115.2408 KOps/s | 115.6538 KOps/s | -0.36% |
| test_step_mdp_speed[True-True-False-True-True] | 63.6980μs | 27.3451μs | 36.5696 KOps/s | 36.6330 KOps/s | -0.17% |
| test_step_mdp_speed[True-True-False-True-False] | 56.5350μs | 16.4046μs | 60.9585 KOps/s | 60.6471 KOps/s | +0.51% |
| test_step_mdp_speed[True-True-False-False-True] | 59.9510μs | 16.4841μs | 60.6647 KOps/s | 61.2996 KOps/s | -1.04% |
| test_step_mdp_speed[True-True-False-False-False] | 37.9710μs | 10.2183μs | 97.8636 KOps/s | 97.6378 KOps/s | +0.23% |
| test_step_mdp_speed[True-False-True-True-True] | 81.3110μs | 29.0512μs | 34.4220 KOps/s | 34.5920 KOps/s | -0.49% |
| test_step_mdp_speed[True-False-True-True-False] | 65.5010μs | 18.0630μs | 55.3618 KOps/s | 54.9202 KOps/s | +0.80% |
| test_step_mdp_speed[True-False-True-False-True] | 94.0140μs | 16.4750μs | 60.6981 KOps/s | 61.3307 KOps/s | -1.03% |
| test_step_mdp_speed[True-False-True-False-False] | 36.1070μs | 10.1337μs | 98.6807 KOps/s | 98.5260 KOps/s | +0.16% |
| test_step_mdp_speed[True-False-False-True-True] | 87.7230μs | 30.6107μs | 32.6684 KOps/s | 33.1141 KOps/s | -1.35% |
| test_step_mdp_speed[True-False-False-True-False] | 77.7350μs | 19.4111μs | 51.5169 KOps/s | 51.1443 KOps/s | +0.73% |
| test_step_mdp_speed[True-False-False-False-True] | 52.6780μs | 17.9108μs | 55.8323 KOps/s | 56.2456 KOps/s | -0.73% |
| test_step_mdp_speed[True-False-False-False-False] | 37.8300μs | 11.6322μs | 85.9685 KOps/s | 85.9491 KOps/s | +0.02% |
| test_step_mdp_speed[False-True-True-True-True] | 79.1670μs | 28.8839μs | 34.6214 KOps/s | 34.4934 KOps/s | +0.37% |
| test_step_mdp_speed[False-True-True-True-False] | 65.6300μs | 18.0211μs | 55.4906 KOps/s | 54.7391 KOps/s | +1.37% |
| test_step_mdp_speed[False-True-True-False-True] | 59.7910μs | 18.9390μs | 52.8012 KOps/s | 53.7630 KOps/s | -1.79% |
| test_step_mdp_speed[False-True-True-False-False] | 51.7560μs | 11.4831μs | 87.0844 KOps/s | 87.5261 KOps/s | -0.50% |
| test_step_mdp_speed[False-True-False-True-True] | 88.9650μs | 30.5764μs | 32.7050 KOps/s | 32.9544 KOps/s | -0.76% |
| test_step_mdp_speed[False-True-False-True-False] | 51.3550μs | 19.4851μs | 51.3214 KOps/s | 51.0890 KOps/s | +0.45% |
| test_step_mdp_speed[False-True-False-False-True] | 70.4610μs | 20.3325μs | 49.1824 KOps/s | 49.4075 KOps/s | -0.46% |
| test_step_mdp_speed[False-True-False-False-False] | 40.0340μs | 12.9626μs | 77.1451 KOps/s | 78.2573 KOps/s | -1.42% |
| test_step_mdp_speed[False-False-True-True-True] | 3.4572ms | 32.1758μs | 31.0793 KOps/s | 31.3062 KOps/s | -0.72% |
| test_step_mdp_speed[False-False-True-True-False] | 62.0650μs | 21.2566μs | 47.0443 KOps/s | 47.3919 KOps/s | -0.73% |
| test_step_mdp_speed[False-False-True-False-True] | 51.5560μs | 20.4660μs | 48.8616 KOps/s | 49.7348 KOps/s | -1.76% |
| test_step_mdp_speed[False-False-True-False-False] | 53.9100μs | 12.9691μs | 77.1065 KOps/s | 77.1584 KOps/s | -0.07% |
| test_step_mdp_speed[False-False-False-True-True] | 81.5910μs | 33.5015μs | 29.8494 KOps/s | 30.0862 KOps/s | -0.79% |
| test_step_mdp_speed[False-False-False-True-False] | 59.8010μs | 22.3373μs | 44.7681 KOps/s | 44.5826 KOps/s | +0.42% |
| test_step_mdp_speed[False-False-False-False-True] | 72.1340μs | 21.6782μs | 46.1294 KOps/s | 46.6762 KOps/s | -1.17% |
| test_step_mdp_speed[False-False-False-False-False] | 61.6350μs | 14.1036μs | 70.9038 KOps/s | 70.8108 KOps/s | +0.13% |
| test_values[generalized_advantage_estimate-True-True] | 11.2332ms | 9.5139ms | 105.1098 Ops/s | 102.8143 Ops/s | +2.23% |
| test_values[vec_generalized_advantage_estimate-True-True] | 36.0750ms | 33.4151ms | 29.9266 Ops/s | 28.1273 Ops/s | **+6.40%** |
| test_values[td0_return_estimate-False-False] | 0.2050ms | 0.1692ms | 5.9116 KOps/s | 5.4090 KOps/s | **+9.29%** |
| test_values[td1_return_estimate-False-False] | 25.1818ms | 23.9696ms | 41.7195 Ops/s | 41.3708 Ops/s | +0.84% |
| test_values[vec_td1_return_estimate-False-False] | 35.9667ms | 33.4444ms | 29.9004 Ops/s | 28.1034 Ops/s | **+6.39%** |
| test_values[td_lambda_return_estimate-True-False] | 37.9214ms | 34.3927ms | 29.0759 Ops/s | 28.9837 Ops/s | +0.32% |
| test_values[vec_td_lambda_return_estimate-True-False] | 35.4718ms | 33.3750ms | 29.9625 Ops/s | 28.1391 Ops/s | **+6.48%** |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.3521ms | 8.1984ms | 121.9743 Ops/s | 120.3235 Ops/s | +1.37% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.0786ms | 1.8209ms | 549.1892 Ops/s | 496.1004 Ops/s | **+10.70%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6325ms | 0.3669ms | 2.7253 KOps/s | 2.8206 KOps/s | -3.38% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 45.4963ms | 42.9703ms | 23.2719 Ops/s | 21.0247 Ops/s | **+10.69%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.1636ms | 3.0385ms | 329.1077 Ops/s | 327.5368 Ops/s | +0.48% |
| test_dqn_speed | 1.9028ms | 1.3847ms | 722.1892 Ops/s | 712.7399 Ops/s | +1.33% |
| test_ddpg_speed | 4.1829ms | 3.0378ms | 329.1871 Ops/s | 339.2957 Ops/s | -2.98% |
| test_sac_speed | 8.9738ms | 8.5558ms | 116.8791 Ops/s | 116.9865 Ops/s | -0.09% |
| test_redq_speed | 14.5181ms | 13.3956ms | 74.6512 Ops/s | 70.2966 Ops/s | **+6.19%** |
| test_redq_deprec_speed | 14.7689ms | 13.3401ms | 74.9620 Ops/s | 72.7360 Ops/s | +3.06% |
| test_td3_speed | 8.8186ms | 8.4974ms | 117.6836 Ops/s | 115.2011 Ops/s | +2.15% |
| test_cql_speed | 38.8027ms | 37.1290ms | 26.9331 Ops/s | 27.0469 Ops/s | -0.42% |
| test_a2c_speed | 11.6008ms | 7.5553ms | 132.3583 Ops/s | 129.3056 Ops/s | +2.36% |
| test_ppo_speed | 9.4454ms | 7.7371ms | 129.2474 Ops/s | 123.5035 Ops/s | +4.65% |
| test_reinforce_speed | 7.5312ms | 6.6089ms | 151.3100 Ops/s | 146.8739 Ops/s | +3.02% |
| test_iql_speed | 34.6396ms | 32.7766ms | 30.5096 Ops/s | 30.1995 Ops/s | +1.03% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.7028ms | 4.8463ms | 206.3419 Ops/s | 194.2457 Ops/s | **+6.23%** |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7897ms | 0.4849ms | 2.0624 KOps/s | 2.0184 KOps/s | +2.18% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7115ms | 0.4563ms | 2.1917 KOps/s | 2.1226 KOps/s | +3.25% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.5333ms | 4.8522ms | 206.0932 Ops/s | 197.0655 Ops/s | +4.58% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7477ms | 0.4841ms | 2.0659 KOps/s | 2.0685 KOps/s | -0.13% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7147ms | 0.4557ms | 2.1945 KOps/s | 2.1746 KOps/s | +0.91% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.1816ms | 1.7129ms | 583.8073 Ops/s | 584.0282 Ops/s | -0.04% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 8.6099ms | 1.6325ms | 612.5433 Ops/s | 617.8404 Ops/s | -0.86% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.5946ms | 4.9912ms | 200.3511 Ops/s | 187.1058 Ops/s | **+7.08%** |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.9992ms | 0.6244ms | 1.6015 KOps/s | 1.5502 KOps/s | +3.31% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9356ms | 0.5970ms | 1.6751 KOps/s | 1.3860 KOps/s | **+20.86%** |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.4540ms | 5.0337ms | 198.6608 Ops/s | 191.7094 Ops/s | +3.63% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5871ms | 0.4941ms | 2.0237 KOps/s | 2.0160 KOps/s | +0.38% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 7.2613ms | 0.4808ms | 2.0797 KOps/s | 2.1451 KOps/s | -3.05% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.5181ms | 4.9836ms | 200.6574 Ops/s | 201.3924 Ops/s | -0.36% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8463ms | 0.4888ms | 2.0460 KOps/s | 2.0660 KOps/s | -0.97% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6010ms | 0.4573ms | 2.1870 KOps/s | 2.1450 KOps/s | +1.96% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.2496ms | 5.0877ms | 196.5539 Ops/s | 191.3946 Ops/s | +2.70% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.4467ms | 0.6219ms | 1.6080 KOps/s | 1.5972 KOps/s | +0.68% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1210s | 0.7629ms | 1.3107 KOps/s | 1.6664 KOps/s | **-21.34%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1101s | 6.0441ms | 165.4497 Ops/s | 154.1571 Ops/s | **+7.33%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.6072ms | 12.9067ms | 77.4793 Ops/s | 77.0901 Ops/s | +0.50% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.9624ms | 1.1864ms | 842.8817 Ops/s | 844.3264 Ops/s | -0.17% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1090s | 5.9524ms | 168.0001 Ops/s | 125.7774 Ops/s | **+33.57%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 16.7322ms | 12.8647ms | 77.7321 Ops/s | 76.3643 Ops/s | +1.79% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.5284ms | 1.2127ms | 824.5763 Ops/s | 867.5911 Ops/s | -4.96% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1092s | 8.1141ms | 123.2423 Ops/s | 165.9249 Ops/s | **-25.72%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 17.3747ms | 13.1101ms | 76.2770 Ops/s | 72.4835 Ops/s | **+5.23%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.0756ms | 1.2974ms | 770.7901 Ops/s | 713.7752 Ops/s | **+7.99%** |

github-actions[bot] commented 1 month ago

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 5. Worsened: 4.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1084s | 0.1083s | 9.2301 Ops/s | 8.5313 Ops/s | **+8.19%** |
| test_sync | 94.7664ms | 93.3604ms | 10.7112 Ops/s | 10.7182 Ops/s | -0.07% |
| test_async | 0.1760s | 90.0845ms | 11.1007 Ops/s | 11.3182 Ops/s | -1.92% |
| test_single_pixels | 0.1185s | 0.1183s | 8.4546 Ops/s | 8.5246 Ops/s | -0.82% |
| test_sync_pixels | 77.8953ms | 74.6696ms | 13.3923 Ops/s | 13.6277 Ops/s | -1.73% |
| test_async_pixels | 0.1436s | 69.2078ms | 14.4492 Ops/s | 14.2018 Ops/s | +1.74% |
| test_simple | 0.8746s | 0.7993s | 1.2511 Ops/s | 1.2517 Ops/s | -0.05% |
| test_transformed | 1.1104s | 1.0394s | 0.9621 Ops/s | 0.9897 Ops/s | -2.79% |
| test_serial | 2.3496s | 2.2769s | 0.4392 Ops/s | 0.4448 Ops/s | -1.26% |
| test_parallel | 2.0458s | 1.9717s | 0.5072 Ops/s | 0.5030 Ops/s | +0.82% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1087ms | 38.5444μs | 25.9441 KOps/s | 26.2344 KOps/s | -1.11% |
| test_step_mdp_speed[True-True-True-True-False] | 0.1614ms | 21.3388μs | 46.8630 KOps/s | 46.5921 KOps/s | +0.58% |
| test_step_mdp_speed[True-True-True-False-True] | 53.1910μs | 21.2172μs | 47.1315 KOps/s | 47.0230 KOps/s | +0.23% |
| test_step_mdp_speed[True-True-True-False-False] | 35.7110μs | 12.1264μs | 82.4645 KOps/s | 81.7270 KOps/s | +0.90% |
| test_step_mdp_speed[True-True-False-True-True] | 74.0210μs | 39.6764μs | 25.2039 KOps/s | 24.5521 KOps/s | +2.65% |
| test_step_mdp_speed[True-True-False-True-False] | 52.7400μs | 23.8175μs | 41.9859 KOps/s | 41.7802 KOps/s | +0.49% |
| test_step_mdp_speed[True-True-False-False-True] | 50.0000μs | 23.7905μs | 42.0336 KOps/s | 42.5878 KOps/s | -1.30% |
| test_step_mdp_speed[True-True-False-False-False] | 45.0500μs | 14.4057μs | 69.4167 KOps/s | 68.3639 KOps/s | +1.54% |
| test_step_mdp_speed[True-False-True-True-True] | 80.8410μs | 42.9861μs | 23.2634 KOps/s | 23.1478 KOps/s | +0.50% |
| test_step_mdp_speed[True-False-True-True-False] | 52.8910μs | 26.1889μs | 38.1841 KOps/s | 37.6354 KOps/s | +1.46% |
| test_step_mdp_speed[True-False-True-False-True] | 51.8210μs | 23.4452μs | 42.6526 KOps/s | 41.6087 KOps/s | +2.51% |
| test_step_mdp_speed[True-False-True-False-False] | 38.6610μs | 14.4401μs | 69.2515 KOps/s | 68.1900 KOps/s | +1.56% |
| test_step_mdp_speed[True-False-False-True-True] | 75.6320μs | 44.3370μs | 22.5545 KOps/s | 22.0654 KOps/s | +2.22% |
| test_step_mdp_speed[True-False-False-True-False] | 64.3610μs | 28.3151μs | 35.3169 KOps/s | 34.9334 KOps/s | +1.10% |
| test_step_mdp_speed[True-False-False-False-True] | 56.3310μs | 25.8767μs | 38.6448 KOps/s | 38.8980 KOps/s | -0.65% |
| test_step_mdp_speed[True-False-False-False-False] | 38.0600μs | 16.6904μs | 59.9148 KOps/s | 58.6114 KOps/s | +2.22% |
| test_step_mdp_speed[False-True-True-True-True] | 82.8020μs | 43.0206μs | 23.2447 KOps/s | 23.5679 KOps/s | -1.37% |
| test_step_mdp_speed[False-True-True-True-False] | 56.4310μs | 25.9856μs | 38.4829 KOps/s | 38.3129 KOps/s | +0.44% |
| test_step_mdp_speed[False-True-True-False-True] | 55.7810μs | 27.9721μs | 35.7499 KOps/s | 35.6087 KOps/s | +0.40% |
| test_step_mdp_speed[False-True-True-False-False] | 52.4600μs | 16.4052μs | 60.9562 KOps/s | 60.5576 KOps/s | +0.66% |
| test_step_mdp_speed[False-True-False-True-True] | 84.8810μs | 44.1322μs | 22.6592 KOps/s | 22.1275 KOps/s | +2.40% |
| test_step_mdp_speed[False-True-False-True-False] | 0.2088ms | 28.3285μs | 35.3001 KOps/s | 34.9831 KOps/s | +0.91% |
| test_step_mdp_speed[False-True-False-False-True] | 58.8510μs | 30.2995μs | 33.0038 KOps/s | 33.0221 KOps/s | -0.06% |
| test_step_mdp_speed[False-True-False-False-False] | 36.5710μs | 19.1462μs | 52.2297 KOps/s | 53.4965 KOps/s | -2.37% |
| test_step_mdp_speed[False-False-True-True-True] | 3.8724ms | 48.3868μs | 20.6668 KOps/s | 20.8989 KOps/s | -1.11% |
| test_step_mdp_speed[False-False-True-True-False] | 52.2510μs | 31.0303μs | 32.2266 KOps/s | 32.2489 KOps/s | -0.07% |
| test_step_mdp_speed[False-False-True-False-True] | 52.8610μs | 30.7993μs | 32.4682 KOps/s | 32.2742 KOps/s | +0.60% |
| test_step_mdp_speed[False-False-True-False-False] | 43.7500μs | 18.7234μs | 53.4091 KOps/s | 52.6713 KOps/s | +1.40% |
| test_step_mdp_speed[False-False-False-True-True] | 74.6010μs | 48.8072μs | 20.4888 KOps/s | 20.2467 KOps/s | +1.20% |
| test_step_mdp_speed[False-False-False-True-False] | 61.9510μs | 33.1913μs | 30.1284 KOps/s | 30.4322 KOps/s | -1.00% |
| test_step_mdp_speed[False-False-False-False-True] | 62.2910μs | 31.8105μs | 31.4361 KOps/s | 30.6498 KOps/s | +2.57% |
| test_step_mdp_speed[False-False-False-False-False] | 38.7310μs | 20.8086μs | 48.0570 KOps/s | 47.3246 KOps/s | +1.55% |
| test_values[generalized_advantage_estimate-True-True] | 25.5968ms | 24.9766ms | 40.0375 Ops/s | 38.8357 Ops/s | +3.09% |
| test_values[vec_generalized_advantage_estimate-True-True] | 89.7241ms | 2.6977ms | 370.6891 Ops/s | 366.4980 Ops/s | +1.14% |
| test_values[td0_return_estimate-False-False] | 92.0320μs | 67.0998μs | 14.9032 KOps/s | 14.9582 KOps/s | -0.37% |
| test_values[td1_return_estimate-False-False] | 56.6947ms | 55.8385ms | 17.9088 Ops/s | 17.6240 Ops/s | +1.62% |
| test_values[vec_td1_return_estimate-False-False] | 1.3165ms | 1.0922ms | 915.5466 Ops/s | 915.1097 Ops/s | +0.05% |
| test_values[td_lambda_return_estimate-True-False] | 88.5087ms | 88.1757ms | 11.3410 Ops/s | 10.9838 Ops/s | +3.25% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3177ms | 1.0879ms | 919.1849 Ops/s | 918.5374 Ops/s | +0.07% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.8786ms | 24.6384ms | 40.5871 Ops/s | 38.0187 Ops/s | **+6.76%** |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9511ms | 0.7277ms | 1.3742 KOps/s | 1.3521 KOps/s | +1.64% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8216ms | 0.6797ms | 1.4713 KOps/s | 1.4335 KOps/s | +2.64% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5054ms | 1.4735ms | 678.6782 Ops/s | 680.0301 Ops/s | -0.20% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7519ms | 0.6932ms | 1.4426 KOps/s | 1.4534 KOps/s | -0.74% |
| test_dqn_speed | 7.1858ms | 1.4878ms | 672.1401 Ops/s | 682.9593 Ops/s | -1.58% |
| test_ddpg_speed | 3.2989ms | 3.0448ms | 328.4307 Ops/s | 334.1900 Ops/s | -1.72% |
| test_sac_speed | 0.1015s | 9.3013ms | 107.5123 Ops/s | 118.6346 Ops/s | **-9.38%** |
| test_redq_speed | 12.0727ms | 11.1159ms | 89.9611 Ops/s | 89.7425 Ops/s | +0.24% |
| test_redq_deprec_speed | 12.6294ms | 11.9205ms | 83.8894 Ops/s | 86.6072 Ops/s | -3.14% |
| test_td3_speed | 8.7753ms | 8.5501ms | 116.9571 Ops/s | 118.4727 Ops/s | -1.28% |
| test_cql_speed | 27.8613ms | 26.7232ms | 37.4207 Ops/s | 34.6200 Ops/s | **+8.09%** |
| test_a2c_speed | 6.7014ms | 5.9537ms | 167.9627 Ops/s | 176.7448 Ops/s | -4.97% |
| test_ppo_speed | 7.0572ms | 6.2834ms | 159.1505 Ops/s | 165.0143 Ops/s | -3.55% |
| test_reinforce_speed | 5.6862ms | 4.7931ms | 208.6349 Ops/s | 213.6810 Ops/s | -2.36% |
| test_iql_speed | 21.6348ms | 20.6640ms | 48.3933 Ops/s | 49.0638 Ops/s | -1.37% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.8929ms | 6.7319ms | 148.5463 Ops/s | 148.9328 Ops/s | -0.26% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1123s | 0.6105ms | 1.6379 KOps/s | 1.9169 KOps/s | **-14.55%** |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6625ms | 0.5061ms | 1.9760 KOps/s | 1.9692 KOps/s | +0.35% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.9647ms | 6.6299ms | 150.8322 Ops/s | 151.6343 Ops/s | -0.53% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7964ms | 0.5190ms | 1.9269 KOps/s | 1.9265 KOps/s | +0.02% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6624ms | 0.4999ms | 2.0003 KOps/s | 2.0045 KOps/s | -0.21% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.8176ms | 2.0457ms | 488.8287 Ops/s | 499.8710 Ops/s | -2.21% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.2490ms | 1.9670ms | 508.3897 Ops/s | 523.5430 Ops/s | -2.89% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.0002ms | 6.8511ms | 145.9624 Ops/s | 146.5347 Ops/s | -0.39% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1289s | 0.7877ms | 1.2696 KOps/s | 1.5001 KOps/s | **-15.37%** |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8530ms | 0.6576ms | 1.5207 KOps/s | 1.5438 KOps/s | -1.50% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.8906ms | 6.7271ms | 148.6524 Ops/s | 149.3827 Ops/s | -0.49% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.6003ms | 0.5226ms | 1.9134 KOps/s | 1.9071 KOps/s | +0.33% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6645ms | 0.5063ms | 1.9751 KOps/s | 1.9650 KOps/s | +0.51% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.9150ms | 6.6231ms | 150.9872 Ops/s | 150.3484 Ops/s | +0.42% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8948ms | 0.5189ms | 1.9272 KOps/s | 1.9320 KOps/s | -0.25% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6475ms | 0.5055ms | 1.9783 KOps/s | 2.0276 KOps/s | -2.43% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.0798ms | 6.8788ms | 145.3737 Ops/s | 145.4581 Ops/s | -0.06% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.5212ms | 0.6789ms | 1.4729 KOps/s | 1.4931 KOps/s | -1.35% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7980ms | 0.6589ms | 1.5177 KOps/s | 1.5188 KOps/s | -0.07% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1470s | 8.0759ms | 123.8253 Ops/s | 126.1147 Ops/s | -1.82% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 19.4481ms | 16.4339ms | 60.8500 Ops/s | 61.5219 Ops/s | -1.09% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 2.4376ms | 1.2710ms | 786.8115 Ops/s | 745.9826 Ops/s | **+5.47%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1236s | 7.6808ms | 130.1944 Ops/s | 131.4223 Ops/s | -0.93% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1379s | 18.7847ms | 53.2349 Ops/s | 61.6346 Ops/s | **-13.63%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.3618ms | 1.2733ms | 785.3306 Ops/s | 735.4987 Ops/s | **+6.78%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1230s | 7.7912ms | 128.3505 Ops/s | 128.0652 Ops/s | +0.22% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 19.5250ms | 16.7084ms | 59.8500 Ops/s | 61.1821 Ops/s | -2.18% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.6036ms | 1.4381ms | 695.3840 Ops/s | 718.4847 Ops/s | -3.22% |
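
For readers checking the benchmark numbers by hand: the Change column appears to be the relative difference between the Ops column and the Ops on Repo `HEAD` column, and the entries the bot emphasizes appear to be those exceeding 5% in magnitude, which is consistent with the Improved and Worsened counts in both summaries. The sketch below reproduces that arithmetic. The function names and the 5% threshold are assumptions inferred from the tables, not the benchmark bot's actual code.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to the repo HEAD.

    Both arguments must use the same unit (Ops/s or KOps/s), as they do
    within each row of the tables.
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred from the tables (assumption)."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU benchmark table.
rows = {
    "test_single": (17.1928, 16.9926),
    "test_sync": (30.1986, 31.8277),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (168.0001, 125.7774),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rounded to two decimals, these three rows give the same +1.18%, -5.12% and +33.57% reported in the CPU table.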