pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License
2.19k stars, 289 forks

[Feature,Doc] `get_stateful_net` and document loss initialization #2310

Status: Closed (closed by vmoens 1 month ago)

vmoens commented 1 month ago

Stack from ghstack (oldest at bottom):

pytorch-bot[bot] commented 1 month ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2310

Note: Links to docs will display an error until the docs builds have been completed.

:x: 1 New Failure, 1 Pending, 3 Unrelated Failures

As of commit d9b9eb8888ce42cdfac8f6fed116ec17b29838c9 with merge base 59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad (image):

NEW FAILURE - The following job has failed:

* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2310#27796316041) ([gh](https://github.com/pytorch/rl/actions/runs/10056771244/job/27796316041)) `RuntimeError: Command docker exec -t da11bf67a86ce6a28fa8474e674f538114a1b987a884a3873a33565814b4e71e /exec failed with exit code 139`

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2310#27796323247) ([gh](https://github.com/pytorch/rl/actions/runs/10056771240/job/27796323247)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753415500)) `AttributeError: module 'torch' has no attribute 'compiler'`
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2310#27796319942) ([gh](https://github.com/pytorch/rl/actions/runs/10056771245/job/27796319942)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753414102)) `AttributeError: module 'torch' has no attribute 'compiler'`
* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2310#27796315095) ([gh](https://github.com/pytorch/rl/actions/runs/10056771242/job/27796315095)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/59c3374162efb9f3436ec1b8e9b2c76a03b2a7ad#27753409046)) `test/test_transforms.py::TestActionDiscretizer::test_trans_parallel_env_check[False]`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions[bot] commented 1 month ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}2$.
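The figures in the detailed results appear consistent with a simple relative-throughput comparison against the repository `HEAD`. The sketch below reproduces a few `Change` values from the reported `Ops` columns. The function names are hypothetical, and the 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are shown in bold, not something the benchmark bot states.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the HEAD baseline."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not documented."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on HEAD) pairs copied from the CPU results, all in Ops/s.
rows = {
    "test_single": (17.0348, 16.8897),
    "test_values[vec_generalized_advantage_estimate-True-True]": (29.9989, 28.1436),
    "test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]": (1293.8, 1652.4),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed threshold, the first row counts as unchanged, the second as improved, and the third as worsened, which matches how those rows are highlighted in the report.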

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 58.9870ms | 58.7032ms | 17.0348 Ops/s | 16.8897 Ops/s | $\color{#35bf28}+0.86\\%$ |
| test_sync | 33.7244ms | 31.6287ms | 31.6169 Ops/s | 31.6129 Ops/s | $\color{#35bf28}+0.01\\%$ |
| test_async | 58.7626ms | 30.0335ms | 33.2961 Ops/s | 32.6839 Ops/s | $\color{#35bf28}+1.87\\%$ |
| test_simple | 0.4977s | 0.4210s | 2.3753 Ops/s | 2.3766 Ops/s | $\color{#d91a1a}-0.06\\%$ |
| test_transformed | 0.6511s | 0.5833s | 1.7143 Ops/s | 1.6991 Ops/s | $\color{#35bf28}+0.90\\%$ |
| test_serial | 1.3520s | 1.2838s | 0.7789 Ops/s | 0.7815 Ops/s | $\color{#d91a1a}-0.33\\%$ |
| test_parallel | 1.1702s | 1.1178s | 0.8946 Ops/s | 0.8907 Ops/s | $\color{#35bf28}+0.44\\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1977ms | 25.1245μs | 39.8018 KOps/s | 40.2501 KOps/s | $\color{#d91a1a}-1.11\\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 41.4370μs | 14.5937μs | 68.5228 KOps/s | 68.0858 KOps/s | $\color{#35bf28}+0.64\\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 48.3100μs | 14.5483μs | 68.7364 KOps/s | 68.5193 KOps/s | $\color{#35bf28}+0.32\\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 29.4450μs | 8.4239μs | 118.7102 KOps/s | 117.7133 KOps/s | $\color{#35bf28}+0.85\\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 87.1600μs | 26.7230μs | 37.4209 KOps/s | 37.2515 KOps/s | $\color{#35bf28}+0.45\\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 66.2500μs | 16.1703μs | 61.8419 KOps/s | 62.1883 KOps/s | $\color{#d91a1a}-0.56\\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 54.8120μs | 16.1234μs | 62.0215 KOps/s | 62.5786 KOps/s | $\color{#d91a1a}-0.89\\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 59.1600μs | 9.9970μs | 100.0297 KOps/s | 100.0405 KOps/s | $\color{#d91a1a}-0.01\\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 65.0210μs | 28.5675μs | 35.0049 KOps/s | 34.9398 KOps/s | $\color{#35bf28}+0.19\\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 62.9770μs | 17.9135μs | 55.8237 KOps/s | 55.9989 KOps/s | $\color{#d91a1a}-0.31\\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 86.6920μs | 15.9926μs | 62.5291 KOps/s | 61.9301 KOps/s | $\color{#35bf28}+0.97\\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 64.9010μs | 9.8923μs | 101.0882 KOps/s | 98.0179 KOps/s | $\color{#35bf28}+3.13\\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 79.8590μs | 29.9928μs | 33.3413 KOps/s | 33.3585 KOps/s | $\color{#d91a1a}-0.05\\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 42.1180μs | 19.4285μs | 51.4707 KOps/s | 51.8462 KOps/s | $\color{#d91a1a}-0.72\\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 60.4530μs | 17.5126μs | 57.1018 KOps/s | 56.4606 KOps/s | $\color{#35bf28}+1.14\\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 43.7110μs | 11.5615μs | 86.4940 KOps/s | 87.0492 KOps/s | $\color{#d91a1a}-0.64\\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.1310ms | 29.8094μs | 33.5464 KOps/s | 34.5531 KOps/s | $\color{#d91a1a}-2.91\\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 68.3270μs | 17.9000μs | 55.8660 KOps/s | 56.2448 KOps/s | $\color{#d91a1a}-0.67\\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 54.6620μs | 18.9258μs | 52.8378 KOps/s | 51.9841 KOps/s | $\color{#35bf28}+1.64\\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 76.1950μs | 11.3061μs | 88.4479 KOps/s | 87.8572 KOps/s | $\color{#35bf28}+0.67\\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 59.8320μs | 29.8378μs | 33.5145 KOps/s | 33.0009 KOps/s | $\color{#35bf28}+1.56\\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 73.6200μs | 19.1847μs | 52.1248 KOps/s | 51.5122 KOps/s | $\color{#35bf28}+1.19\\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 51.9170μs | 19.9391μs | 50.1528 KOps/s | 49.4202 KOps/s | $\color{#35bf28}+1.48\\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 60.5700μs | 12.6708μs | 78.9216 KOps/s | 78.4702 KOps/s | $\color{#35bf28}+0.58\\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 3.4033ms | 31.6112μs | 31.6343 KOps/s | 30.7020 KOps/s | $\color{#35bf28}+3.04\\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 69.8700μs | 20.8117μs | 48.0499 KOps/s | 47.7908 KOps/s | $\color{#35bf28}+0.54\\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 0.2625ms | 19.9711μs | 50.0723 KOps/s | 49.1437 KOps/s | $\color{#35bf28}+1.89\\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 68.4380μs | 12.7678μs | 78.3221 KOps/s | 78.1683 KOps/s | $\color{#35bf28}+0.20\\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 81.3220μs | 32.9959μs | 30.3068 KOps/s | 30.5112 KOps/s | $\color{#d91a1a}-0.67\\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 67.7660μs | 22.3640μs | 44.7146 KOps/s | 44.7024 KOps/s | $\color{#35bf28}+0.03\\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 65.3210μs | 21.2902μs | 46.9699 KOps/s | 46.5452 KOps/s | $\color{#35bf28}+0.91\\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 62.7170μs | 14.0630μs | 71.1084 KOps/s | 70.3416 KOps/s | $\color{#35bf28}+1.09\\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.4996ms | 9.5123ms | 105.1266 Ops/s | 102.4254 Ops/s | $\color{#35bf28}+2.64\\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 35.3173ms | 33.3346ms | 29.9989 Ops/s | 28.1436 Ops/s | $\textbf{\color{#35bf28}+6.59\\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2228ms | 0.1755ms | 5.6970 KOps/s | 5.3948 KOps/s | $\textbf{\color{#35bf28}+5.60\\%}$ |
| test_values[td1_return_estimate-False-False] | 23.8409ms | 23.4260ms | 42.6876 Ops/s | 41.4633 Ops/s | $\color{#35bf28}+2.95\\%$ |
| test_values[vec_td1_return_estimate-False-False] | 35.3235ms | 33.3899ms | 29.9491 Ops/s | 28.0073 Ops/s | $\textbf{\color{#35bf28}+6.93\\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 34.6124ms | 33.8791ms | 29.5167 Ops/s | 28.7441 Ops/s | $\color{#35bf28}+2.69\\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 35.8376ms | 33.4501ms | 29.8953 Ops/s | 27.5886 Ops/s | $\textbf{\color{#35bf28}+8.36\\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 12.4003ms | 8.4171ms | 118.8060 Ops/s | 118.4238 Ops/s | $\color{#35bf28}+0.32\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3902ms | 2.0507ms | 487.6493 Ops/s | 488.6995 Ops/s | $\color{#d91a1a}-0.21\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6784ms | 0.3570ms | 2.8014 KOps/s | 2.5855 KOps/s | $\textbf{\color{#35bf28}+8.35\\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 48.5414ms | 46.4355ms | 21.5352 Ops/s | 21.9552 Ops/s | $\color{#d91a1a}-1.91\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.0173ms | 3.0622ms | 326.5602 Ops/s | 325.3175 Ops/s | $\color{#35bf28}+0.38\\%$ |
| test_dqn_speed | 2.1236ms | 1.3863ms | 721.3563 Ops/s | 718.1318 Ops/s | $\color{#35bf28}+0.45\\%$ |
| test_ddpg_speed | 3.8496ms | 2.9211ms | 342.3352 Ops/s | 338.0473 Ops/s | $\color{#35bf28}+1.27\\%$ |
| test_sac_speed | 8.8195ms | 8.4179ms | 118.7944 Ops/s | 117.2248 Ops/s | $\color{#35bf28}+1.34\\%$ |
| test_redq_speed | 15.6365ms | 13.6187ms | 73.4283 Ops/s | 72.1501 Ops/s | $\color{#35bf28}+1.77\\%$ |
| test_redq_deprec_speed | 14.5426ms | 13.2535ms | 75.4518 Ops/s | 73.1821 Ops/s | $\color{#35bf28}+3.10\\%$ |
| test_td3_speed | 8.8657ms | 8.3927ms | 119.1515 Ops/s | 115.7634 Ops/s | $\color{#35bf28}+2.93\\%$ |
| test_cql_speed | 42.4076ms | 37.0760ms | 26.9716 Ops/s | 26.0741 Ops/s | $\color{#35bf28}+3.44\\%$ |
| test_a2c_speed | 8.6959ms | 7.4836ms | 133.6257 Ops/s | 131.5380 Ops/s | $\color{#35bf28}+1.59\\%$ |
| test_ppo_speed | 11.3729ms | 7.9058ms | 126.4900 Ops/s | 127.2007 Ops/s | $\color{#d91a1a}-0.56\\%$ |
| test_reinforce_speed | 9.6252ms | 6.6842ms | 149.6061 Ops/s | 149.4101 Ops/s | $\color{#35bf28}+0.13\\%$ |
| test_iql_speed | 33.6140ms | 32.5453ms | 30.7264 Ops/s | 30.6105 Ops/s | $\color{#35bf28}+0.38\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1898ms | 4.8532ms | 206.0498 Ops/s | 197.4673 Ops/s | $\color{#35bf28}+4.35\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.7010ms | 0.4791ms | 2.0873 KOps/s | 2.0631 KOps/s | $\color{#35bf28}+1.17\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6864ms | 0.4768ms | 2.0974 KOps/s | 2.0618 KOps/s | $\color{#35bf28}+1.73\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.5404ms | 4.8805ms | 204.8969 Ops/s | 199.4281 Ops/s | $\color{#35bf28}+2.74\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2927ms | 0.4717ms | 2.1199 KOps/s | 2.0556 KOps/s | $\color{#35bf28}+3.13\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8362ms | 0.4562ms | 2.1922 KOps/s | 2.1565 KOps/s | $\color{#35bf28}+1.66\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.9138ms | 1.6982ms | 588.8503 Ops/s | 587.8674 Ops/s | $\color{#35bf28}+0.17\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.9021ms | 1.6053ms | 622.9438 Ops/s | 623.0532 Ops/s | $\color{#d91a1a}-0.02\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.1434ms | 5.1834ms | 192.9218 Ops/s | 193.1168 Ops/s | $\color{#d91a1a}-0.10\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0566ms | 0.6189ms | 1.6158 KOps/s | 1.5912 KOps/s | $\color{#35bf28}+1.55\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8204ms | 0.5950ms | 1.6808 KOps/s | 1.6567 KOps/s | $\color{#35bf28}+1.45\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8268ms | 5.0115ms | 199.5426 Ops/s | 197.5122 Ops/s | $\color{#35bf28}+1.03\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.4259ms | 0.4901ms | 2.0405 KOps/s | 2.0599 KOps/s | $\color{#d91a1a}-0.94\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6407ms | 0.4786ms | 2.0896 KOps/s | 2.1069 KOps/s | $\color{#d91a1a}-0.82\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.4252ms | 5.0580ms | 197.7081 Ops/s | 201.9278 Ops/s | $\color{#d91a1a}-2.09\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7822ms | 0.4886ms | 2.0467 KOps/s | 2.0558 KOps/s | $\color{#d91a1a}-0.44\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6104ms | 0.4646ms | 2.1525 KOps/s | 2.1131 KOps/s | $\color{#35bf28}+1.86\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.5953ms | 5.3766ms | 185.9929 Ops/s | 192.5434 Ops/s | $\color{#d91a1a}-3.40\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.4794ms | 0.6344ms | 1.5763 KOps/s | 1.5838 KOps/s | $\color{#d91a1a}-0.48\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1311s | 0.7729ms | 1.2938 KOps/s | 1.6524 KOps/s | $\textbf{\color{#d91a1a}-21.70\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1214s | 6.2000ms | 161.2890 Ops/s | 163.5240 Ops/s | $\color{#d91a1a}-1.37\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 17.5479ms | 13.0872ms | 76.4107 Ops/s | 76.3372 Ops/s | $\color{#35bf28}+0.10\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.6892ms | 1.1219ms | 891.3247 Ops/s | 904.8503 Ops/s | $\color{#d91a1a}-1.49\\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1223s | 6.2129ms | 160.9551 Ops/s | 119.3276 Ops/s | $\textbf{\color{#35bf28}+34.89\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 17.0046ms | 12.8716ms | 77.6903 Ops/s | 76.8184 Ops/s | $\color{#35bf28}+1.14\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.6728ms | 1.1235ms | 890.1095 Ops/s | 837.5353 Ops/s | $\textbf{\color{#35bf28}+6.28\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1233s | 8.6629ms | 115.4352 Ops/s | 161.4628 Ops/s | $\textbf{\color{#d91a1a}-28.51\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 24.1269ms | 13.1121ms | 76.2654 Ops/s | 75.0450 Ops/s | $\color{#35bf28}+1.63\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.7428ms | 1.2439ms | 803.9115 Ops/s | 721.8118 Ops/s | $\textbf{\color{#35bf28}+11.37\\%}$ |

github-actions[bot] commented 1 month ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}3$.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1061s | 0.1059s | 9.4467 Ops/s | 8.8320 Ops/s | $\textbf{\color{#35bf28}+6.96\\%}$ |
| test_sync | 95.4503ms | 94.3731ms | 10.5962 Ops/s | 10.5410 Ops/s | $\color{#35bf28}+0.52\\%$ |
| test_async | 0.1835s | 86.7606ms | 11.5260 Ops/s | 11.0528 Ops/s | $\color{#35bf28}+4.28\\%$ |
| test_single_pixels | 0.1165s | 0.1163s | 8.5988 Ops/s | 8.7025 Ops/s | $\color{#d91a1a}-1.19\\%$ |
| test_sync_pixels | 75.7561ms | 73.3038ms | 13.6419 Ops/s | 13.4884 Ops/s | $\color{#35bf28}+1.14\\%$ |
| test_async_pixels | 0.1410s | 69.8600ms | 14.3143 Ops/s | 14.3770 Ops/s | $\color{#d91a1a}-0.44\\%$ |
| test_simple | 0.8603s | 0.7828s | 1.2774 Ops/s | 1.2948 Ops/s | $\color{#d91a1a}-1.34\\%$ |
| test_transformed | 1.0796s | 1.0082s | 0.9919 Ops/s | 1.0203 Ops/s | $\color{#d91a1a}-2.79\\%$ |
| test_serial | 2.2906s | 2.2186s | 0.4507 Ops/s | 0.4558 Ops/s | $\color{#d91a1a}-1.12\\%$ |
| test_parallel | 2.1101s | 1.9779s | 0.5056 Ops/s | 0.5146 Ops/s | $\color{#d91a1a}-1.75\\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 99.1120μs | 38.2417μs | 26.1495 KOps/s | 26.7826 KOps/s | $\color{#d91a1a}-2.36\\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 0.1961ms | 21.5133μs | 46.4830 KOps/s | 47.3159 KOps/s | $\color{#d91a1a}-1.76\\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 46.9910μs | 21.4153μs | 46.6955 KOps/s | 46.9902 KOps/s | $\color{#d91a1a}-0.63\\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 0.1664ms | 12.1567μs | 82.2595 KOps/s | 83.1106 KOps/s | $\color{#d91a1a}-1.02\\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 59.5710μs | 40.3890μs | 24.7592 KOps/s | 25.1032 KOps/s | $\color{#d91a1a}-1.37\\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 45.7510μs | 23.4063μs | 42.7235 KOps/s | 42.7080 KOps/s | $\color{#35bf28}+0.04\\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 49.9610μs | 23.7261μs | 42.1478 KOps/s | 42.5114 KOps/s | $\color{#d91a1a}-0.86\\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 38.6010μs | 14.3921μs | 69.4826 KOps/s | 69.6374 KOps/s | $\color{#d91a1a}-0.22\\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 87.6620μs | 42.3673μs | 23.6031 KOps/s | 24.1053 KOps/s | $\color{#d91a1a}-2.08\\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 49.4110μs | 25.8005μs | 38.7589 KOps/s | 38.3262 KOps/s | $\color{#35bf28}+1.13\\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 59.6210μs | 23.6729μs | 42.2424 KOps/s | 42.4593 KOps/s | $\color{#d91a1a}-0.51\\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 36.6010μs | 14.5476μs | 68.7401 KOps/s | 70.0753 KOps/s | $\color{#d91a1a}-1.91\\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 72.4520μs | 44.6023μs | 22.4204 KOps/s | 22.4548 KOps/s | $\color{#d91a1a}-0.15\\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 0.2264ms | 28.1762μs | 35.4909 KOps/s | 35.4695 KOps/s | $\color{#35bf28}+0.06\\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 51.3620μs | 26.2028μs | 38.1639 KOps/s | 38.7914 KOps/s | $\color{#d91a1a}-1.62\\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 0.2158ms | 16.7908μs | 59.5564 KOps/s | 60.2276 KOps/s | $\color{#d91a1a}-1.11\\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.1084ms | 42.5778μs | 23.4864 KOps/s | 24.1619 KOps/s | $\color{#d91a1a}-2.80\\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 52.8910μs | 26.1858μs | 38.1886 KOps/s | 39.2022 KOps/s | $\color{#d91a1a}-2.59\\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 47.7610μs | 28.4175μs | 35.1896 KOps/s | 36.0186 KOps/s | $\color{#d91a1a}-2.30\\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 36.0110μs | 16.7345μs | 59.7568 KOps/s | 61.2024 KOps/s | $\color{#d91a1a}-2.36\\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 79.5120μs | 44.6462μs | 22.3983 KOps/s | 22.9067 KOps/s | $\color{#d91a1a}-2.22\\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 50.9810μs | 27.9488μs | 35.7798 KOps/s | 36.1724 KOps/s | $\color{#d91a1a}-1.09\\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 98.0220μs | 30.4644μs | 32.8252 KOps/s | 33.6465 KOps/s | $\color{#d91a1a}-2.44\\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 0.1428ms | 18.9671μs | 52.7229 KOps/s | 54.0127 KOps/s | $\color{#d91a1a}-2.39\\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 3.9079ms | 47.4526μs | 21.0737 KOps/s | 21.1845 KOps/s | $\color{#d91a1a}-0.52\\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 0.1384ms | 30.8002μs | 32.4674 KOps/s | 32.6584 KOps/s | $\color{#d91a1a}-0.59\\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 49.5610μs | 30.5925μs | 32.6878 KOps/s | 33.4323 KOps/s | $\color{#d91a1a}-2.23\\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 46.7610μs | 18.9243μs | 52.8421 KOps/s | 53.7841 KOps/s | $\color{#d91a1a}-1.75\\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 74.5010μs | 48.4225μs | 20.6516 KOps/s | 20.8394 KOps/s | $\color{#d91a1a}-0.90\\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 60.7820μs | 33.3313μs | 30.0018 KOps/s | 30.8190 KOps/s | $\color{#d91a1a}-2.65\\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 56.8110μs | 32.2118μs | 31.0445 KOps/s | 31.4537 KOps/s | $\color{#d91a1a}-1.30\\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 56.1910μs | 20.5420μs | 48.6807 KOps/s | 48.5674 KOps/s | $\color{#35bf28}+0.23\\%$ |
| test_values[generalized_advantage_estimate-True-True] | 23.9540ms | 23.5183ms | 42.5200 Ops/s | 43.4266 Ops/s | $\color{#d91a1a}-2.09\\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 87.8442ms | 2.6417ms | 378.5477 Ops/s | 372.5225 Ops/s | $\color{#35bf28}+1.62\\%$ |
| test_values[td0_return_estimate-False-False] | 89.2820μs | 64.9005μs | 15.4082 KOps/s | 15.8063 KOps/s | $\color{#d91a1a}-2.52\\%$ |
| test_values[td1_return_estimate-False-False] | 53.8063ms | 53.1333ms | 18.8206 Ops/s | 18.9959 Ops/s | $\color{#d91a1a}-0.92\\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.3791ms | 1.0673ms | 936.9239 Ops/s | 938.6507 Ops/s | $\color{#d91a1a}-0.18\\%$ |
| test_values[td_lambda_return_estimate-True-False] | 85.0601ms | 84.0606ms | 11.8962 Ops/s | 12.0184 Ops/s | $\color{#d91a1a}-1.02\\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4163ms | 1.0670ms | 937.2325 Ops/s | 942.9253 Ops/s | $\color{#d91a1a}-0.60\\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.6165ms | 23.3742ms | 42.7822 Ops/s | 43.2547 Ops/s | $\color{#d91a1a}-1.09\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9435ms | 0.6977ms | 1.4333 KOps/s | 1.4405 KOps/s | $\color{#d91a1a}-0.51\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7263ms | 0.6487ms | 1.5414 KOps/s | 1.5506 KOps/s | $\color{#d91a1a}-0.59\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6354ms | 1.4508ms | 689.2857 Ops/s | 690.6693 Ops/s | $\color{#d91a1a}-0.20\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8421ms | 0.6671ms | 1.4990 KOps/s | 1.4700 KOps/s | $\color{#35bf28}+1.98\\%$ |
| test_dqn_speed | 7.1659ms | 1.4510ms | 689.1892 Ops/s | 706.9389 Ops/s | $\color{#d91a1a}-2.51\\%$ |
| test_ddpg_speed | 3.2130ms | 2.9667ms | 337.0722 Ops/s | 341.3187 Ops/s | $\color{#d91a1a}-1.24\\%$ |
| test_sac_speed | 0.1021s | 9.1669ms | 109.0882 Ops/s | 120.6396 Ops/s | $\textbf{\color{#d91a1a}-9.58\\%}$ |
| test_redq_speed | 12.7997ms | 11.0868ms | 90.1972 Ops/s | 92.1440 Ops/s | $\color{#d91a1a}-2.11\\%$ |
| test_redq_deprec_speed | 12.0820ms | 11.4872ms | 87.0534 Ops/s | 88.3236 Ops/s | $\color{#d91a1a}-1.44\\%$ |
| test_td3_speed | 8.6270ms | 8.3113ms | 120.3180 Ops/s | 121.2000 Ops/s | $\color{#d91a1a}-0.73\\%$ |
| test_cql_speed | 26.5973ms | 26.0062ms | 38.4524 Ops/s | 35.1101 Ops/s | $\textbf{\color{#35bf28}+9.52\\%}$ |
| test_a2c_speed | 5.9275ms | 5.7212ms | 174.7887 Ops/s | 175.8109 Ops/s | $\color{#d91a1a}-0.58\\%$ |
| test_ppo_speed | 6.2204ms | 6.0312ms | 165.8040 Ops/s | 166.7160 Ops/s | $\color{#d91a1a}-0.55\\%$ |
| test_reinforce_speed | 5.5170ms | 4.6598ms | 214.6005 Ops/s | 217.2618 Ops/s | $\color{#d91a1a}-1.22\\%$ |
| test_iql_speed | 20.9150ms | 20.2151ms | 49.4679 Ops/s | 49.6912 Ops/s | $\color{#d91a1a}-0.45\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.9453ms | 6.7574ms | 147.9852 Ops/s | 152.6293 Ops/s | $\color{#d91a1a}-3.04\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9264ms | 0.5108ms | 1.9577 KOps/s | 1.9475 KOps/s | $\color{#35bf28}+0.52\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6656ms | 0.4915ms | 2.0346 KOps/s | 2.0309 KOps/s | $\color{#35bf28}+0.19\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.9746ms | 6.6503ms | 150.3691 Ops/s | 154.7429 Ops/s | $\color{#d91a1a}-2.83\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7575ms | 0.5055ms | 1.9783 KOps/s | 1.9684 KOps/s | $\color{#35bf28}+0.50\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6692ms | 0.4860ms | 2.0575 KOps/s | 2.0611 KOps/s | $\color{#d91a1a}-0.18\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.4578ms | 1.9722ms | 507.0467 Ops/s | 508.6511 Ops/s | $\color{#d91a1a}-0.32\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.1164ms | 1.8469ms | 541.4483 Ops/s | 539.1507 Ops/s | $\color{#35bf28}+0.43\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.0661ms | 6.8519ms | 145.9443 Ops/s | 149.3623 Ops/s | $\color{#d91a1a}-2.29\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1396s | 0.7811ms | 1.2803 KOps/s | 1.5194 KOps/s | $\textbf{\color{#d91a1a}-15.74\\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8698ms | 0.6394ms | 1.5640 KOps/s | 1.5010 KOps/s | $\color{#35bf28}+4.20\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.9956ms | 6.7733ms | 147.6396 Ops/s | 152.4828 Ops/s | $\color{#d91a1a}-3.18\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.4622ms | 0.5145ms | 1.9437 KOps/s | 1.9168 KOps/s | $\color{#35bf28}+1.41\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7873ms | 0.4931ms | 2.0282 KOps/s | 2.0317 KOps/s | $\color{#d91a1a}-0.17\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.9901ms | 6.6346ms | 150.7251 Ops/s | 153.7894 Ops/s | $\color{#d91a1a}-1.99\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6089ms | 0.5079ms | 1.9690 KOps/s | 1.9791 KOps/s | $\color{#d91a1a}-0.51\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.2210ms | 0.4935ms | 2.0265 KOps/s | 2.0587 KOps/s | $\color{#d91a1a}-1.56\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.1556ms | 6.9198ms | 144.5136 Ops/s | 150.0431 Ops/s | $\color{#d91a1a}-3.69\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1479ms | 0.6651ms | 1.5036 KOps/s | 1.5055 KOps/s | $\color{#d91a1a}-0.12\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9315ms | 0.6412ms | 1.5595 KOps/s | 1.5685 KOps/s | $\color{#d91a1a}-0.58\\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1417s | 7.9402ms | 125.9409 Ops/s | 126.6111 Ops/s | $\color{#d91a1a}-0.53\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.1582ms | 15.8901ms | 62.9321 Ops/s | 63.9968 Ops/s | $\color{#d91a1a}-1.66\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 2.2435ms | 1.2318ms | 811.8116 Ops/s | 774.8465 Ops/s | $\color{#35bf28}+4.77\\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1265s | 7.7268ms | 129.4192 Ops/s | 133.5663 Ops/s | $\color{#d91a1a}-3.10\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1355s | 18.3177ms | 54.5919 Ops/s | 64.2263 Ops/s | $\textbf{\color{#d91a1a}-15.00\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.2103ms | 1.2288ms | 813.7827 Ops/s | 774.0544 Ops/s | $\textbf{\color{#35bf28}+5.13\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1244s | 7.7733ms | 128.6454 Ops/s | 129.4710 Ops/s | $\color{#d91a1a}-0.64\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.3245ms | 16.1035ms | 62.0982 Ops/s | 64.3036 Ops/s | $\color{#d91a1a}-3.43\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.3823ms | 1.3944ms | 717.1388 Ops/s | 720.4749 Ops/s | $\color{#d91a1a}-0.46\\%$ |