pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License
2.24k stars, 296 forks

[WIP] AlphaZero #2246

Open vmoens opened 3 months ago

pytorch-bot[bot] commented 3 months ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2246

Note: Links to docs will display an error until the docs builds have been completed.

:x: 4 New Failures, 1 Unrelated Failure

As of commit 3f4c3920932a6cee0feb6cf3892f743a74e3c88f with merge base 00b7c2e8f38730747f9484e7fa3b763e509cc914:

NEW FAILURES - The following jobs have failed:

* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2246#26615074189) ([gh](https://github.com/pytorch/rl/actions/runs/9650113684/job/26615074189)) `RuntimeError: Command docker exec -t d254bac06cc5f23c8a246785770cc7029a288e71553eeea05223c8aa93f5ae75 /exec failed with exit code 139`
* [Lint / python-source-and-configs / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2246#26615074919) ([gh](https://github.com/pytorch/rl/actions/runs/9650113685/job/26615074919)) `sota-implementations/MCTS/AlphaZero/mcts_policy.py:8:1: F401 'typing.Optional' imported but unused`
* [Unit-tests on Linux / tests-cpu (3.8) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2246#26615076023) ([gh](https://github.com/pytorch/rl/actions/runs/9650113691/job/26615076023)) `RuntimeError: Command docker exec -t 4fb1e7fcae83bda05528f6c6bdc539d921f282c04c35bd9736970ac0d81e5dcc /exec failed with exit code 1`
* [Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2246#26615077933) ([gh](https://github.com/pytorch/rl/actions/runs/9650113691/job/26615077933)) `RuntimeError: Command docker exec -t 1a6e8cc39a192dd5694ea261af62b773ba66717f13159095f4ebdd44e7dcb55c /exec failed with exit code 1`
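
For readers unfamiliar with the lint failure above: F401 is the flake8/pyflakes code for a name that is imported but never referenced. The following is only a minimal sketch of that idea using the standard-library `ast` module; it is not how flake8 is implemented, the `unused_imports` helper and the sample source are hypothetical, and the real contents of `mcts_policy.py` are not shown on this page.

```python
import ast


def unused_imports(source):
    """Return the sorted names that are imported but never referenced.

    Simplified illustration of an F401-style check; real linters also
    handle __all__, re-exports, string annotations, and scoping.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    # Every bare identifier that appears anywhere in the module
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


# Hypothetical module source, for illustration only
sample = (
    "from typing import Optional, Tuple\n"
    "\n"
    "def first(pair: Tuple[int, int]) -> int:\n"
    "    return pair[0]\n"
)
print(unused_imports(sample))
```

In a case like this the usual fix is to delete the unused name from the import statement.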

FLAKY - The following job failed but was likely due to flakiness present on trunk:

* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2246#26615078632) ([gh](https://github.com/pytorch/rl/actions/runs/9650113690/job/26615078632)) (detected as infra flaky with no runner)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions[bot] commented 3 months ago

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 3. Worsened: 6.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1033s | 58.1083ms | 17.2092 Ops/s | 18.2327 Ops/s | **-5.61%** |
| test_sync | 31.3035ms | 30.6002ms | 32.6795 Ops/s | 30.1018 Ops/s | **+8.56%** |
| test_async | 52.9567ms | 28.0451ms | 35.6568 Ops/s | 34.4302 Ops/s | +3.56% |
| test_simple | 0.3786s | 0.3743s | 2.6718 Ops/s | 2.6502 Ops/s | +0.82% |
| test_transformed | 0.5310s | 0.5277s | 1.8950 Ops/s | 1.8909 Ops/s | +0.22% |
| test_serial | 1.3119s | 1.2516s | 0.7990 Ops/s | 0.7906 Ops/s | +1.06% |
| test_parallel | 1.1371s | 1.0620s | 0.9416 Ops/s | 0.9246 Ops/s | +1.84% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1149ms | 22.9454μs | 43.5817 KOps/s | 44.1616 KOps/s | -1.31% |
| test_step_mdp_speed[True-True-True-True-False] | 47.4180μs | 13.4816μs | 74.1750 KOps/s | 75.3940 KOps/s | -1.62% |
| test_step_mdp_speed[True-True-True-False-True] | 66.4030μs | 13.3670μs | 74.8108 KOps/s | 76.6434 KOps/s | -2.39% |
| test_step_mdp_speed[True-True-True-False-False] | 43.6820μs | 8.0023μs | 124.9638 KOps/s | 131.7417 KOps/s | **-5.14%** |
| test_step_mdp_speed[True-True-False-True-True] | 78.2580μs | 23.9832μs | 41.6958 KOps/s | 41.5026 KOps/s | +0.47% |
| test_step_mdp_speed[True-True-False-True-False] | 46.3370μs | 14.5544μs | 68.7078 KOps/s | 68.2954 KOps/s | +0.60% |
| test_step_mdp_speed[True-True-False-False-True] | 59.8210μs | 14.5271μs | 68.8371 KOps/s | 69.8555 KOps/s | -1.46% |
| test_step_mdp_speed[True-True-False-False-False] | 52.9510μs | 9.1487μs | 109.3047 KOps/s | 112.1812 KOps/s | -2.56% |
| test_step_mdp_speed[True-False-True-True-True] | 52.1780μs | 25.2661μs | 39.5787 KOps/s | 39.3199 KOps/s | +0.66% |
| test_step_mdp_speed[True-False-True-True-False] | 0.1653ms | 16.1269μs | 62.0083 KOps/s | 62.3846 KOps/s | -0.60% |
| test_step_mdp_speed[True-False-True-False-True] | 49.3330μs | 14.5069μs | 68.9326 KOps/s | 69.3058 KOps/s | -0.54% |
| test_step_mdp_speed[True-False-True-False-False] | 33.2420μs | 9.0390μs | 110.6314 KOps/s | 111.3067 KOps/s | -0.61% |
| test_step_mdp_speed[True-False-False-True-True] | 53.7910μs | 26.5443μs | 37.6728 KOps/s | 37.3144 KOps/s | +0.96% |
| test_step_mdp_speed[True-False-False-True-False] | 53.1700μs | 17.1671μs | 58.2511 KOps/s | 57.7705 KOps/s | +0.83% |
| test_step_mdp_speed[True-False-False-False-True] | 63.9700μs | 15.7688μs | 63.4165 KOps/s | 63.4465 KOps/s | -0.05% |
| test_step_mdp_speed[True-False-False-False-False] | 35.7270μs | 10.2881μs | 97.1994 KOps/s | 97.4038 KOps/s | -0.21% |
| test_step_mdp_speed[False-True-True-True-True] | 55.9150μs | 25.6327μs | 39.0126 KOps/s | 39.0183 KOps/s | -0.01% |
| test_step_mdp_speed[False-True-True-True-False] | 41.3390μs | 16.1959μs | 61.7438 KOps/s | 61.9235 KOps/s | -0.29% |
| test_step_mdp_speed[False-True-True-False-True] | 42.9120μs | 16.8530μs | 59.3366 KOps/s | 59.5686 KOps/s | -0.39% |
| test_step_mdp_speed[False-True-True-False-False] | 36.0390μs | 10.4273μs | 95.9017 KOps/s | 97.4792 KOps/s | -1.62% |
| test_step_mdp_speed[False-True-False-True-True] | 81.4360μs | 26.5649μs | 37.6437 KOps/s | 37.1606 KOps/s | +1.30% |
| test_step_mdp_speed[False-True-False-True-False] | 44.5050μs | 17.3489μs | 57.6407 KOps/s | 57.6277 KOps/s | +0.02% |
| test_step_mdp_speed[False-True-False-False-True] | 60.4120μs | 17.8574μs | 55.9991 KOps/s | 55.2864 KOps/s | +1.29% |
| test_step_mdp_speed[False-True-False-False-False] | 40.1660μs | 11.5735μs | 86.4043 KOps/s | 86.5530 KOps/s | -0.17% |
| test_step_mdp_speed[False-False-True-True-True] | 63.8390μs | 27.5853μs | 36.2512 KOps/s | 35.3301 KOps/s | +2.61% |
| test_step_mdp_speed[False-False-True-True-False] | 46.2670μs | 18.8032μs | 53.1823 KOps/s | 53.0916 KOps/s | +0.17% |
| test_step_mdp_speed[False-False-True-False-True] | 55.0140μs | 17.9943μs | 55.5732 KOps/s | 55.0043 KOps/s | +1.03% |
| test_step_mdp_speed[False-False-True-False-False] | 40.0460μs | 11.5345μs | 86.6962 KOps/s | 86.5535 KOps/s | +0.16% |
| test_step_mdp_speed[False-False-False-True-True] | 55.2930μs | 29.8318μs | 33.5213 KOps/s | 33.1561 KOps/s | +1.10% |
| test_step_mdp_speed[False-False-False-True-False] | 58.6400μs | 19.8796μs | 50.3029 KOps/s | 50.1460 KOps/s | +0.31% |
| test_step_mdp_speed[False-False-False-False-True] | 46.7470μs | 19.1362μs | 52.2570 KOps/s | 51.7968 KOps/s | +0.89% |
| test_step_mdp_speed[False-False-False-False-False] | 33.2830μs | 12.6786μs | 78.8730 KOps/s | 78.8794 KOps/s | -0.01% |
| test_values[generalized_advantage_estimate-True-True] | 11.1227ms | 9.7801ms | 102.2485 Ops/s | 105.7981 Ops/s | -3.36% |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.1289ms | 34.9225ms | 28.6348 Ops/s | 28.4632 Ops/s | +0.60% |
| test_values[td0_return_estimate-False-False] | 0.2491ms | 0.1691ms | 5.9153 KOps/s | 5.9316 KOps/s | -0.27% |
| test_values[td1_return_estimate-False-False] | 24.9853ms | 24.2000ms | 41.3222 Ops/s | 42.1699 Ops/s | -2.01% |
| test_values[vec_td1_return_estimate-False-False] | 36.5362ms | 35.3775ms | 28.2665 Ops/s | 28.0258 Ops/s | +0.86% |
| test_values[td_lambda_return_estimate-True-False] | 37.6007ms | 34.7167ms | 28.8045 Ops/s | 29.3762 Ops/s | -1.95% |
| test_values[vec_td_lambda_return_estimate-True-False] | 38.5404ms | 35.2380ms | 28.3784 Ops/s | 28.3025 Ops/s | +0.27% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.1852ms | 8.5234ms | 117.3245 Ops/s | 120.8823 Ops/s | -2.94% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2685ms | 1.9847ms | 503.8545 Ops/s | 535.5823 Ops/s | **-5.92%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5096ms | 0.3572ms | 2.7995 KOps/s | 2.7809 KOps/s | +0.67% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 48.7996ms | 45.8057ms | 21.8314 Ops/s | 21.9269 Ops/s | -0.44% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.9747ms | 3.0436ms | 328.5612 Ops/s | 327.4388 Ops/s | +0.34% |
| test_dqn_speed | 7.6655ms | 1.3744ms | 727.5893 Ops/s | 741.8888 Ops/s | -1.93% |
| test_ddpg_speed | 3.2296ms | 2.8861ms | 346.4824 Ops/s | 350.6194 Ops/s | -1.18% |
| test_sac_speed | 9.8699ms | 8.5703ms | 116.6824 Ops/s | 117.5430 Ops/s | -0.73% |
| test_redq_speed | 91.4962ms | 14.4739ms | 69.0900 Ops/s | 73.0954 Ops/s | **-5.48%** |
| test_redq_deprec_speed | 15.3137ms | 13.7811ms | 72.5630 Ops/s | 71.6445 Ops/s | +1.28% |
| test_td3_speed | 8.8942ms | 8.5107ms | 117.4985 Ops/s | 117.3009 Ops/s | +0.17% |
| test_cql_speed | 38.3875ms | 37.1452ms | 26.9214 Ops/s | 27.1535 Ops/s | -0.85% |
| test_a2c_speed | 8.1892ms | 7.5061ms | 133.2254 Ops/s | 134.3919 Ops/s | -0.87% |
| test_ppo_speed | 9.0099ms | 7.7452ms | 129.1128 Ops/s | 129.9047 Ops/s | -0.61% |
| test_reinforce_speed | 7.5812ms | 6.6829ms | 149.6346 Ops/s | 150.4238 Ops/s | -0.52% |
| test_iql_speed | 34.3988ms | 32.8374ms | 30.4531 Ops/s | 30.4853 Ops/s | -0.11% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3186ms | 3.4830ms | 287.1050 Ops/s | 283.9056 Ops/s | +1.13% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7613ms | 0.4990ms | 2.0040 KOps/s | 2.0218 KOps/s | -0.88% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.6063ms | 0.4789ms | 2.0883 KOps/s | 2.1230 KOps/s | -1.63% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.7025ms | 3.4385ms | 290.8242 Ops/s | 286.1574 Ops/s | +1.63% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9317ms | 0.4921ms | 2.0323 KOps/s | 2.0377 KOps/s | -0.27% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7143ms | 0.4728ms | 2.1151 KOps/s | 2.0997 KOps/s | +0.73% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.4616ms | 1.7363ms | 575.9347 Ops/s | 578.5790 Ops/s | -0.46% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.2609ms | 1.6430ms | 608.6474 Ops/s | 610.8461 Ops/s | -0.36% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.6499ms | 3.6236ms | 275.9649 Ops/s | 269.9134 Ops/s | +2.24% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3693ms | 0.6376ms | 1.5683 KOps/s | 1.3946 KOps/s | **+12.45%** |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7626ms | 0.6091ms | 1.6417 KOps/s | 1.6338 KOps/s | +0.49% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3059ms | 3.4981ms | 285.8700 Ops/s | 274.0453 Ops/s | +4.31% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0237ms | 0.5073ms | 1.9711 KOps/s | 2.0010 KOps/s | -1.50% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9804ms | 0.4906ms | 2.0383 KOps/s | 2.1203 KOps/s | -3.87% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.1730ms | 3.4670ms | 288.4362 Ops/s | 287.2764 Ops/s | +0.40% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5990ms | 0.4940ms | 2.0242 KOps/s | 2.0226 KOps/s | +0.08% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.5824ms | 0.4775ms | 2.0944 KOps/s | 2.0420 KOps/s | +2.57% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.2356ms | 3.6450ms | 274.3477 Ops/s | 270.7691 Ops/s | +1.32% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2489ms | 0.6385ms | 1.5663 KOps/s | 1.5830 KOps/s | -1.06% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9466ms | 0.6149ms | 1.6262 KOps/s | 1.6404 KOps/s | -0.86% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1152s | 5.9378ms | 168.4129 Ops/s | 167.2652 Ops/s | +0.69% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1098s | 14.5052ms | 68.9410 Ops/s | 79.6795 Ops/s | **-13.48%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.1869ms | 1.0426ms | 959.1495 Ops/s | 959.1796 Ops/s | -0.00% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 98.7886ms | 5.6322ms | 177.5489 Ops/s | 129.9883 Ops/s | **+36.59%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 14.8416ms | 12.4923ms | 80.0492 Ops/s | 79.6282 Ops/s | +0.53% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.4433ms | 1.1329ms | 882.7173 Ops/s | 904.5528 Ops/s | -2.41% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1020s | 5.8186ms | 171.8623 Ops/s | 172.3580 Ops/s | -0.29% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.3498ms | 12.7444ms | 78.4658 Ops/s | 79.2719 Ops/s | -1.02% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 3.8209ms | 1.2711ms | 786.7101 Ops/s | 837.5379 Ops/s | **-6.07%** |

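
The Change column in the results above is consistent with the relative difference between the PR's Ops figure and the Ops on `HEAD`, and the Improved/Worsened counts match the rows whose change exceeds roughly 5% in either direction. Both the formula and the 5% threshold are inferred from the reported numbers, not taken from the benchmark tooling, so the sketch below is an assumption; `relative_change` and `classify` are hypothetical helper names.

```python
def relative_change(ops_pr, ops_head):
    """Percent change of PR throughput relative to HEAD throughput."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct, threshold_pct=5.0):
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on HEAD) pairs copied from the first three CPU rows
rows = {
    "test_single": (17.2092, 18.2327),
    "test_sync": (32.6795, 30.1018),
    "test_async": (35.6568, 34.4302),
}
for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

With these inputs the computed percentages agree with the reported values for the three rows (-5.61%, +8.56%, +3.56%).
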
github-actions[bot] commented 3 months ago

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 2. Worsened: 6.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1766s | 0.1245s | 8.0319 Ops/s | 8.2416 Ops/s | -2.54% |
| test_sync | 0.1051s | 0.1025s | 9.7536 Ops/s | 9.6586 Ops/s | +0.98% |
| test_async | 0.2031s | 81.4717ms | 12.2742 Ops/s | 10.1306 Ops/s | **+21.16%** |
| test_single_pixels | 0.1292s | 0.1290s | 7.7506 Ops/s | 7.7934 Ops/s | -0.55% |
| test_sync_pixels | 85.7438ms | 84.0180ms | 11.9022 Ops/s | 12.1540 Ops/s | -2.07% |
| test_async_pixels | 0.1607s | 69.3815ms | 14.4131 Ops/s | 14.3301 Ops/s | +0.58% |
| test_simple | 0.8089s | 0.8078s | 1.2379 Ops/s | 1.2317 Ops/s | +0.50% |
| test_transformed | 1.0816s | 1.0776s | 0.9280 Ops/s | 0.9341 Ops/s | -0.65% |
| test_serial | 2.5595s | 2.5014s | 0.3998 Ops/s | 0.3963 Ops/s | +0.88% |
| test_parallel | 2.4317s | 2.3757s | 0.4209 Ops/s | 0.4218 Ops/s | -0.21% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1380ms | 34.2730μs | 29.1775 KOps/s | 29.6452 KOps/s | -1.58% |
| test_step_mdp_speed[True-True-True-True-False] | 0.1081ms | 19.5967μs | 51.0289 KOps/s | 51.2381 KOps/s | -0.41% |
| test_step_mdp_speed[True-True-True-False-True] | 0.1355ms | 19.2687μs | 51.8975 KOps/s | 52.2426 KOps/s | -0.66% |
| test_step_mdp_speed[True-True-True-False-False] | 27.2210μs | 11.3246μs | 88.3033 KOps/s | 88.6286 KOps/s | -0.37% |
| test_step_mdp_speed[True-True-False-True-True] | 57.7010μs | 36.4881μs | 27.4062 KOps/s | 28.0003 KOps/s | -2.12% |
| test_step_mdp_speed[True-True-False-True-False] | 47.8410μs | 21.5493μs | 46.4051 KOps/s | 46.1308 KOps/s | +0.59% |
| test_step_mdp_speed[True-True-False-False-True] | 39.8110μs | 21.3198μs | 46.9048 KOps/s | 47.0448 KOps/s | -0.30% |
| test_step_mdp_speed[True-True-False-False-False] | 30.9000μs | 13.2316μs | 75.5765 KOps/s | 74.6290 KOps/s | +1.27% |
| test_step_mdp_speed[True-False-True-True-True] | 67.2910μs | 37.9888μs | 26.3236 KOps/s | 26.3865 KOps/s | -0.24% |
| test_step_mdp_speed[True-False-True-True-False] | 44.1810μs | 23.2903μs | 42.9363 KOps/s | 41.5856 KOps/s | +3.25% |
| test_step_mdp_speed[True-False-True-False-True] | 46.1710μs | 21.1627μs | 47.2530 KOps/s | 46.9247 KOps/s | +0.70% |
| test_step_mdp_speed[True-False-True-False-False] | 29.8600μs | 13.1162μs | 76.2414 KOps/s | 74.7627 KOps/s | +1.98% |
| test_step_mdp_speed[True-False-False-True-True] | 0.1654ms | 39.7105μs | 25.1823 KOps/s | 25.1685 KOps/s | +0.05% |
| test_step_mdp_speed[True-False-False-True-False] | 63.3210μs | 25.1021μs | 39.8373 KOps/s | 38.5453 KOps/s | +3.35% |
| test_step_mdp_speed[True-False-False-False-True] | 50.6110μs | 22.8287μs | 43.8045 KOps/s | 43.7416 KOps/s | +0.14% |
| test_step_mdp_speed[True-False-False-False-False] | 36.7300μs | 14.5805μs | 68.5849 KOps/s | 65.9964 KOps/s | +3.92% |
| test_step_mdp_speed[False-True-True-True-True] | 0.2099ms | 37.1472μs | 26.9199 KOps/s | 26.6761 KOps/s | +0.91% |
| test_step_mdp_speed[False-True-True-True-False] | 94.4720μs | 23.3441μs | 42.8373 KOps/s | 42.3656 KOps/s | +1.11% |
| test_step_mdp_speed[False-True-True-False-True] | 62.2010μs | 25.1303μs | 39.7927 KOps/s | 40.1007 KOps/s | -0.77% |
| test_step_mdp_speed[False-True-True-False-False] | 0.2072ms | 14.6407μs | 68.3030 KOps/s | 66.8349 KOps/s | +2.20% |
| test_step_mdp_speed[False-True-False-True-True] | 0.2218ms | 39.7371μs | 25.1654 KOps/s | 25.2152 KOps/s | -0.20% |
| test_step_mdp_speed[False-True-False-True-False] | 0.2154ms | 25.0084μs | 39.9865 KOps/s | 39.5839 KOps/s | +1.02% |
| test_step_mdp_speed[False-True-False-False-True] | 46.3210μs | 27.1258μs | 36.8653 KOps/s | 37.5983 KOps/s | -1.95% |
| test_step_mdp_speed[False-True-False-False-False] | 85.9910μs | 16.5653μs | 60.3672 KOps/s | 59.1288 KOps/s | +2.09% |
| test_step_mdp_speed[False-False-True-True-True] | 78.2220μs | 41.2108μs | 24.2655 KOps/s | 24.0168 KOps/s | +1.04% |
| test_step_mdp_speed[False-False-True-True-False] | 81.6110μs | 27.0484μs | 36.9708 KOps/s | 36.1795 KOps/s | +2.19% |
| test_step_mdp_speed[False-False-True-False-True] | 50.7810μs | 26.7733μs | 37.3507 KOps/s | 38.1146 KOps/s | -2.00% |
| test_step_mdp_speed[False-False-True-False-False] | 42.2710μs | 16.3701μs | 61.0870 KOps/s | 59.0097 KOps/s | +3.52% |
| test_step_mdp_speed[False-False-False-True-True] | 79.0020μs | 43.7251μs | 22.8702 KOps/s | 22.8522 KOps/s | +0.08% |
| test_step_mdp_speed[False-False-False-True-False] | 69.7210μs | 29.0989μs | 34.3655 KOps/s | 33.9767 KOps/s | +1.14% |
| test_step_mdp_speed[False-False-False-False-True] | 53.7910μs | 28.7245μs | 34.8134 KOps/s | 35.0721 KOps/s | -0.74% |
| test_step_mdp_speed[False-False-False-False-False] | 39.8910μs | 18.1132μs | 55.2084 KOps/s | 53.5823 KOps/s | +3.03% |
| test_values[generalized_advantage_estimate-True-True] | 25.9685ms | 24.7950ms | 40.3307 Ops/s | 40.3121 Ops/s | +0.05% |
| test_values[vec_generalized_advantage_estimate-True-True] | 97.2613ms | 2.8480ms | 351.1264 Ops/s | 369.8227 Ops/s | **-5.06%** |
| test_values[td0_return_estimate-False-False] | 93.0920μs | 65.5982μs | 15.2443 KOps/s | 15.0629 KOps/s | +1.20% |
| test_values[td1_return_estimate-False-False] | 58.1686ms | 57.7145ms | 17.3267 Ops/s | 17.7837 Ops/s | -2.57% |
| test_values[vec_td1_return_estimate-False-False] | 1.3415ms | 1.1024ms | 907.1528 Ops/s | 912.7900 Ops/s | -0.62% |
| test_values[td_lambda_return_estimate-True-False] | 91.8175ms | 88.1757ms | 11.3410 Ops/s | 10.9367 Ops/s | +3.70% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2362ms | 1.0810ms | 925.1116 Ops/s | 919.2964 Ops/s | +0.63% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.4873ms | 24.1528ms | 41.4030 Ops/s | 38.7070 Ops/s | **+6.97%** |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9562ms | 0.7150ms | 1.3985 KOps/s | 1.3834 KOps/s | +1.09% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8119ms | 0.6660ms | 1.5014 KOps/s | 1.4751 KOps/s | +1.78% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6480ms | 1.4719ms | 679.3850 Ops/s | 678.4224 Ops/s | +0.14% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8508ms | 0.6822ms | 1.4658 KOps/s | 1.4446 KOps/s | +1.47% |
| test_dqn_speed | 80.7603ms | 1.6192ms | 617.5832 Ops/s | 667.2107 Ops/s | **-7.44%** |
| test_ddpg_speed | 3.3372ms | 3.0409ms | 328.8531 Ops/s | 326.3578 Ops/s | +0.76% |
| test_sac_speed | 9.6497ms | 8.7401ms | 114.4149 Ops/s | 114.4309 Ops/s | -0.01% |
| test_redq_speed | 12.5090ms | 10.9290ms | 91.4997 Ops/s | 91.4232 Ops/s | +0.08% |
| test_redq_deprec_speed | 0.1076s | 13.0964ms | 76.3567 Ops/s | 84.6795 Ops/s | **-9.83%** |
| test_td3_speed | 8.7574ms | 8.5979ms | 116.3070 Ops/s | 114.9975 Ops/s | +1.14% |
| test_cql_speed | 27.3499ms | 26.3722ms | 37.9187 Ops/s | 37.9943 Ops/s | -0.20% |
| test_a2c_speed | 6.5456ms | 5.7685ms | 173.3539 Ops/s | 170.8525 Ops/s | +1.46% |
| test_ppo_speed | 6.5521ms | 6.1016ms | 163.8920 Ops/s | 162.4529 Ops/s | +0.89% |
| test_reinforce_speed | 5.1236ms | 4.7635ms | 209.9310 Ops/s | 210.5828 Ops/s | -0.31% |
| test_iql_speed | 20.9052ms | 20.0349ms | 49.9128 Ops/s | 49.8790 Ops/s | +0.07% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.9005ms | 4.6577ms | 214.6961 Ops/s | 217.1166 Ops/s | -1.11% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8029ms | 0.6018ms | 1.6617 KOps/s | 1.6395 KOps/s | +1.36% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7753ms | 0.5826ms | 1.7164 KOps/s | 1.7028 KOps/s | +0.80% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1310ms | 4.6199ms | 216.4551 Ops/s | 217.7050 Ops/s | -0.57% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8492ms | 0.5986ms | 1.6706 KOps/s | 1.6703 KOps/s | +0.02% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.4101ms | 0.5794ms | 1.7260 KOps/s | 1.7133 KOps/s | +0.74% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.4128ms | 2.1662ms | 461.6347 Ops/s | 455.3366 Ops/s | +1.38% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 5.8309ms | 2.0765ms | 481.5808 Ops/s | 481.3594 Ops/s | +0.05% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1239ms | 4.7338ms | 211.2486 Ops/s | 211.0884 Ops/s | +0.08% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1221s | 0.8759ms | 1.1417 KOps/s | 1.3060 KOps/s | **-12.58%** |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9888ms | 0.7329ms | 1.3644 KOps/s | 1.3547 KOps/s | +0.71% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.8929ms | 4.6285ms | 216.0530 Ops/s | 216.1457 Ops/s | -0.04% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.3150ms | 0.6080ms | 1.6447 KOps/s | 1.6377 KOps/s | +0.43% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8277ms | 0.5859ms | 1.7066 KOps/s | 1.6934 KOps/s | +0.78% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.9625ms | 4.6033ms | 217.2333 Ops/s | 217.4444 Ops/s | -0.10% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7522ms | 0.5934ms | 1.6852 KOps/s | 1.6574 KOps/s | +1.68% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7408ms | 0.5742ms | 1.7414 KOps/s | 1.7138 KOps/s | +1.61% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.0785ms | 4.7943ms | 208.5791 Ops/s | 208.0165 Ops/s | +0.27% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.5070ms | 0.7586ms | 1.3181 KOps/s | 1.3079 KOps/s | +0.78% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9733ms | 0.7349ms | 1.3607 KOps/s | 1.3509 KOps/s | +0.72% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1461s | 7.8137ms | 127.9811 Ops/s | 132.6559 Ops/s | -3.52% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.8200ms | 16.1088ms | 62.0780 Ops/s | 62.5911 Ops/s | -0.82% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.3021ms | 1.5228ms | 656.6701 Ops/s | 732.4597 Ops/s | **-10.35%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1254s | 7.3898ms | 135.3218 Ops/s | 135.0869 Ops/s | +0.17% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 18.4446ms | 16.0418ms | 62.3371 Ops/s | 63.4517 Ops/s | -1.76% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.9812ms | 1.4579ms | 685.8998 Ops/s | 735.9413 Ops/s | **-6.80%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1260s | 9.9931ms | 100.0695 Ops/s | 98.9311 Ops/s | +1.15% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.8427ms | 16.1621ms | 61.8733 Ops/s | 60.1489 Ops/s | +2.87% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.6646ms | 1.5370ms | 650.6179 Ops/s | 659.2348 Ops/s | -1.31% |