pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License

[Feature] assigning values to RB storage #2224

Closed vmoens closed 4 weeks ago

vmoens commented 4 weeks ago

This PR enables assigning values to a replay buffer (RB) storage.

This is slightly BC-breaking: previously, `storage[index] = foo` was exactly equivalent to `storage.set(index, foo)`. Now only the second updates the cursor position. This lets us assign values without telling the buffer that we have moved the cursor.
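
To make the distinction concrete, here is a minimal sketch of the two write paths. `ToyStorage` and its `cursor` attribute are hypothetical names made up for illustration; this is a pure-Python toy model of the behaviour described above, not TorchRL's storage implementation.

```python
class ToyStorage:
    """Hypothetical fixed-size storage (illustration only, not TorchRL code)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._data = [None] * max_size
        self.cursor = 0  # assumed: next write position tracked by the buffer

    def set(self, index, value):
        # Regular write path: store the value and move the cursor past it.
        self._data[index] = value
        self.cursor = (index + 1) % self.max_size

    def __setitem__(self, index, value):
        # In-place assignment: overwrite the slot, leave the cursor untouched.
        self._data[index] = value

    def __getitem__(self, index):
        return self._data[index]


storage = ToyStorage(max_size=4)
storage.set(0, "a")
storage.set(1, "b")
cursor_before = storage.cursor

storage[0] = "patched"  # edits an existing entry without moving the cursor
cursor_after = storage.cursor
```

In this toy model, patching slot 0 leaves the cursor where the last `set` call put it, so the next regular write would still land in slot 2.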

TODO:

cc @wertyuilife2

pytorch-bot[bot] commented 4 weeks ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2224

Note: Links to docs will display an error until the docs builds have been completed.

:heavy_exclamation_mark: 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

:x: 2 New Failures, 2 Unrelated Failures

As of commit 71381a2e5db2382e3cc2ff0b0c8c070ec0635618 with merge base 47a1005d27244cc984df88cdd36940cbc19f5fa3:

NEW FAILURES - The following jobs have failed:

* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2224#26121038411) ([gh](https://github.com/pytorch/rl/actions/runs/9480417128/job/26121038411)) `RuntimeError: Command docker exec -t dc88aa1564b633062060298143548141d25a151ad5dfc4f6155f4c296b9a79a8 /exec failed with exit code 139`
* [Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2224#26121065556) ([gh](https://github.com/pytorch/rl/actions/runs/9480417130/job/26121065556)) `RuntimeError: Command docker exec -t 432f93a480dd7b5f7c000744d0b7654c713754f01787faae4667755b709ac72b /exec failed with exit code 1`

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2224#26121060811) ([gh](https://github.com/pytorch/rl/actions/runs/9480417138/job/26121060811)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/47a1005d27244cc984df88cdd36940cbc19f5fa3#26119114864)) `##[error]The operation was canceled.`
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2224#26121066951) ([gh](https://github.com/pytorch/rl/actions/runs/9480417130/job/26121066951)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/47a1005d27244cc984df88cdd36940cbc19f5fa3#26119114192)) `##[error]The operation was canceled.`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions[bot] commented 4 weeks ago

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 2. Worsened: 7.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1109s | 58.1882ms | 17.1856 Ops/s | 17.9137 Ops/s | -4.06% |
| test_sync | 42.2688ms | 36.7726ms | 27.1941 Ops/s | 32.5879 Ops/s | **-16.55%** |
| test_async | 56.1220ms | 28.8487ms | 34.6636 Ops/s | 34.6184 Ops/s | +0.13% |
| test_simple | 0.4420s | 0.3872s | 2.5828 Ops/s | 2.6089 Ops/s | -1.00% |
| test_transformed | 0.5815s | 0.5357s | 1.8669 Ops/s | 1.8589 Ops/s | +0.43% |
| test_serial | 1.3122s | 1.2565s | 0.7958 Ops/s | 0.7762 Ops/s | +2.53% |
| test_parallel | 1.1291s | 1.0685s | 0.9359 Ops/s | 0.9392 Ops/s | -0.35% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1897ms | 21.6572μs | 46.1741 KOps/s | 46.8467 KOps/s | -1.44% |
| test_step_mdp_speed[True-True-True-True-False] | 47.9300μs | 12.9335μs | 77.3188 KOps/s | 75.6651 KOps/s | +2.19% |
| test_step_mdp_speed[True-True-True-False-True] | 37.6610μs | 12.6259μs | 79.2021 KOps/s | 78.8373 KOps/s | +0.46% |
| test_step_mdp_speed[True-True-True-False-False] | 0.1673ms | 7.6201μs | 131.2318 KOps/s | 128.6271 KOps/s | +2.03% |
| test_step_mdp_speed[True-True-False-True-True] | 70.4520μs | 22.5320μs | 44.3812 KOps/s | 43.7586 KOps/s | +1.42% |
| test_step_mdp_speed[True-True-False-True-False] | 35.7270μs | 14.0030μs | 71.4133 KOps/s | 69.5546 KOps/s | +2.67% |
| test_step_mdp_speed[True-True-False-False-True] | 64.7110μs | 13.8363μs | 72.2736 KOps/s | 72.1625 KOps/s | +0.15% |
| test_step_mdp_speed[True-True-False-False-False] | 33.2430μs | 8.8349μs | 113.1876 KOps/s | 111.3807 KOps/s | +1.62% |
| test_step_mdp_speed[True-False-True-True-True] | 73.3470μs | 24.2596μs | 41.2208 KOps/s | 41.3464 KOps/s | -0.30% |
| test_step_mdp_speed[True-False-True-True-False] | 41.8880μs | 15.5938μs | 64.1281 KOps/s | 64.3867 KOps/s | -0.40% |
| test_step_mdp_speed[True-False-True-False-True] | 60.5130μs | 13.7719μs | 72.6117 KOps/s | 72.2223 KOps/s | +0.54% |
| test_step_mdp_speed[True-False-True-False-False] | 49.7040μs | 8.8032μs | 113.5946 KOps/s | 111.5252 KOps/s | +1.86% |
| test_step_mdp_speed[True-False-False-True-True] | 78.8180μs | 25.2598μs | 39.5886 KOps/s | 39.5685 KOps/s | +0.05% |
| test_step_mdp_speed[True-False-False-True-False] | 68.0380μs | 16.6967μs | 59.8922 KOps/s | 59.3848 KOps/s | +0.85% |
| test_step_mdp_speed[True-False-False-False-True] | 59.7120μs | 14.8691μs | 67.2533 KOps/s | 66.9062 KOps/s | +0.52% |
| test_step_mdp_speed[True-False-False-False-False] | 50.8250μs | 10.0709μs | 99.2957 KOps/s | 99.7531 KOps/s | -0.46% |
| test_step_mdp_speed[False-True-True-True-True] | 50.0730μs | 24.0166μs | 41.6379 KOps/s | 41.5815 KOps/s | +0.14% |
| test_step_mdp_speed[False-True-True-True-False] | 60.2130μs | 15.6141μs | 64.0445 KOps/s | 63.6114 KOps/s | +0.68% |
| test_step_mdp_speed[False-True-True-False-True] | 41.2470μs | 16.0076μs | 62.4703 KOps/s | 61.9615 KOps/s | +0.82% |
| test_step_mdp_speed[False-True-True-False-False] | 58.5200μs | 10.0388μs | 99.6137 KOps/s | 98.3155 KOps/s | +1.32% |
| test_step_mdp_speed[False-True-False-True-True] | 49.4930μs | 25.0709μs | 39.8869 KOps/s | 39.4894 KOps/s | +1.01% |
| test_step_mdp_speed[False-True-False-True-False] | 69.3400μs | 16.8013μs | 59.5192 KOps/s | 59.9690 KOps/s | -0.75% |
| test_step_mdp_speed[False-True-False-False-True] | 58.4490μs | 17.2122μs | 58.0983 KOps/s | 57.9159 KOps/s | +0.31% |
| test_step_mdp_speed[False-True-False-False-False] | 70.2410μs | 11.2632μs | 88.7849 KOps/s | 88.8773 KOps/s | -0.10% |
| test_step_mdp_speed[False-False-True-True-True] | 62.5570μs | 26.5221μs | 37.7044 KOps/s | 37.4246 KOps/s | +0.75% |
| test_step_mdp_speed[False-False-True-True-False] | 71.2840μs | 18.1467μs | 55.1065 KOps/s | 55.2705 KOps/s | -0.30% |
| test_step_mdp_speed[False-False-True-False-True] | 76.6140μs | 17.1084μs | 58.4510 KOps/s | 57.9430 KOps/s | +0.88% |
| test_step_mdp_speed[False-False-True-False-False] | 36.7590μs | 11.2618μs | 88.7960 KOps/s | 88.8538 KOps/s | -0.07% |
| test_step_mdp_speed[False-False-False-True-True] | 40.6560μs | 27.6700μs | 36.1402 KOps/s | 30.7263 KOps/s | **+17.62%** |
| test_step_mdp_speed[False-False-False-True-False] | 67.9470μs | 19.1553μs | 52.2048 KOps/s | 53.0900 KOps/s | -1.67% |
| test_step_mdp_speed[False-False-False-False-True] | 48.1710μs | 17.9715μs | 55.6435 KOps/s | 55.4633 KOps/s | +0.32% |
| test_step_mdp_speed[False-False-False-False-False] | 63.9290μs | 12.1581μs | 82.2494 KOps/s | 82.3127 KOps/s | -0.08% |
| test_values[generalized_advantage_estimate-True-True] | 9.5934ms | 9.3919ms | 106.4750 Ops/s | 107.1365 Ops/s | -0.62% |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.6554ms | 35.0496ms | 28.5310 Ops/s | 30.0954 Ops/s | **-5.20%** |
| test_values[td0_return_estimate-False-False] | 0.2200ms | 0.1716ms | 5.8266 KOps/s | 5.9913 KOps/s | -2.75% |
| test_values[td1_return_estimate-False-False] | 26.4200ms | 23.3375ms | 42.8495 Ops/s | 42.8169 Ops/s | +0.08% |
| test_values[vec_td1_return_estimate-False-False] | 38.0133ms | 35.3003ms | 28.3284 Ops/s | 29.9546 Ops/s | **-5.43%** |
| test_values[td_lambda_return_estimate-True-False] | 36.3389ms | 33.4710ms | 29.8766 Ops/s | 29.7355 Ops/s | +0.47% |
| test_values[vec_td_lambda_return_estimate-True-False] | 37.4933ms | 35.3501ms | 28.2885 Ops/s | 30.0445 Ops/s | **-5.84%** |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.4296ms | 8.2567ms | 121.1140 Ops/s | 120.1513 Ops/s | +0.80% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.8866ms | 2.0104ms | 497.4182 Ops/s | 552.4983 Ops/s | **-9.97%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4833ms | 0.3569ms | 2.8017 KOps/s | 2.8422 KOps/s | -1.42% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 53.4614ms | 47.0585ms | 21.2501 Ops/s | 23.1524 Ops/s | **-8.22%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.6550ms | 3.0303ms | 330.0010 Ops/s | 332.2410 Ops/s | -0.67% |
| test_dqn_speed | 1.5344ms | 1.3269ms | 753.6518 Ops/s | 730.5476 Ops/s | +3.16% |
| test_ddpg_speed | 3.5327ms | 2.8599ms | 349.6630 Ops/s | 345.4219 Ops/s | +1.23% |
| test_sac_speed | 9.3233ms | 8.5194ms | 117.3793 Ops/s | 117.0273 Ops/s | +0.30% |
| test_redq_speed | 14.6156ms | 13.4285ms | 74.4685 Ops/s | 72.7578 Ops/s | +2.35% |
| test_redq_deprec_speed | 14.7063ms | 13.6276ms | 73.3803 Ops/s | 71.8917 Ops/s | +2.07% |
| test_td3_speed | 16.6222ms | 8.4936ms | 117.7361 Ops/s | 116.4366 Ops/s | +1.12% |
| test_cql_speed | 38.5194ms | 37.0748ms | 26.9725 Ops/s | 26.9791 Ops/s | -0.02% |
| test_a2c_speed | 10.0948ms | 7.7421ms | 129.1642 Ops/s | 132.1702 Ops/s | -2.27% |
| test_ppo_speed | 8.9853ms | 7.8189ms | 127.8959 Ops/s | 128.4550 Ops/s | -0.44% |
| test_reinforce_speed | 7.8736ms | 6.6535ms | 150.2960 Ops/s | 149.0582 Ops/s | +0.83% |
| test_iql_speed | 34.6413ms | 33.1049ms | 30.2070 Ops/s | 29.9386 Ops/s | +0.90% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3194ms | 3.5601ms | 280.8932 Ops/s | 276.6948 Ops/s | +1.52% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9630ms | 0.4992ms | 2.0032 KOps/s | 1.9663 KOps/s | +1.88% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6350ms | 0.4735ms | 2.1119 KOps/s | 2.0334 KOps/s | +3.86% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.4021ms | 3.5674ms | 280.3200 Ops/s | 276.3138 Ops/s | +1.45% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.7637ms | 0.5152ms | 1.9411 KOps/s | 2.0028 KOps/s | -3.08% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6976ms | 0.4722ms | 2.1177 KOps/s | 2.0976 KOps/s | +0.96% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.2195ms | 1.7043ms | 586.7524 Ops/s | 589.2376 Ops/s | -0.42% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 4.5091ms | 1.6244ms | 615.6298 Ops/s | 621.8632 Ops/s | -1.00% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.2613ms | 3.6937ms | 270.7330 Ops/s | 273.6307 Ops/s | -1.06% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1084s | 0.7041ms | 1.4203 KOps/s | 1.3919 KOps/s | +2.04% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7532ms | 0.5857ms | 1.7073 KOps/s | 1.6777 KOps/s | +1.76% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.2186ms | 3.5992ms | 277.8370 Ops/s | 278.2373 Ops/s | -0.14% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0994ms | 0.5018ms | 1.9929 KOps/s | 1.9393 KOps/s | +2.76% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6218ms | 0.4745ms | 2.1076 KOps/s | 2.0538 KOps/s | +2.62% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.2557ms | 3.5001ms | 285.7030 Ops/s | 272.2156 Ops/s | +4.95% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6000ms | 0.4958ms | 2.0169 KOps/s | 1.9708 KOps/s | +2.34% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.6638ms | 0.4826ms | 2.0721 KOps/s | 2.0500 KOps/s | +1.08% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.5835ms | 3.6599ms | 273.2316 Ops/s | 272.8304 Ops/s | +0.15% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7212ms | 0.6159ms | 1.6235 KOps/s | 1.5946 KOps/s | +1.81% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8287ms | 0.5959ms | 1.6781 KOps/s | 1.6518 KOps/s | +1.59% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1147s | 6.0128ms | 166.3132 Ops/s | 168.6338 Ops/s | -1.38% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1165s | 14.6508ms | 68.2555 Ops/s | 67.2764 Ops/s | +1.46% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.5397ms | 1.0327ms | 968.3299 Ops/s | 965.6560 Ops/s | +0.28% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1025s | 5.7222ms | 174.7564 Ops/s | 175.3172 Ops/s | -0.32% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 14.5782ms | 12.5111ms | 79.9292 Ops/s | 79.5834 Ops/s | +0.43% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 3.1603ms | 1.3552ms | 737.9097 Ops/s | 963.2117 Ops/s | **-23.39%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1044s | 5.8688ms | 170.3935 Ops/s | 168.3958 Ops/s | +1.19% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.4933ms | 12.6279ms | 79.1895 Ops/s | 67.5917 Ops/s | **+17.16%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.2584ms | 1.2676ms | 788.8630 Ops/s | 828.1399 Ops/s | -4.74% |

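
For readers checking the numbers: the Change column appears to be the relative difference between the Ops value for this PR and the Ops value on repo `HEAD`. That is an inference from the reported figures, not something the benchmark bot states. A minimal sketch, with `relative_change` as a hypothetical helper name, reproduces the `test_sync` row:

```python
def relative_change(ops_pr, ops_head):
    """Relative throughput change in percent (assumed formula, inferred from the table)."""
    return (ops_pr - ops_head) / ops_head * 100.0


# test_sync row of the CPU table: 27.1941 Ops/s on the PR vs 32.5879 Ops/s on HEAD
change = relative_change(27.1941, 32.5879)
print(f"{change:+.2f}%")  # prints -16.55%
```

A negative value means fewer operations per second than `HEAD`, i.e. a slowdown.
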
github-actions[bot] commented 4 weeks ago

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 5. Worsened: 2.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1175s | 0.1167s | 8.5670 Ops/s | 8.6324 Ops/s | -0.76% |
| test_sync | 0.1031s | 0.1014s | 9.8587 Ops/s | 9.5309 Ops/s | +3.44% |
| test_async | 0.2109s | 84.2739ms | 11.8661 Ops/s | 10.3871 Ops/s | **+14.24%** |
| test_single_pixels | 0.1278s | 0.1275s | 7.8439 Ops/s | 7.8986 Ops/s | -0.69% |
| test_sync_pixels | 85.0236ms | 80.2607ms | 12.4594 Ops/s | 12.4018 Ops/s | +0.46% |
| test_async_pixels | 0.1548s | 68.2505ms | 14.6519 Ops/s | 14.4284 Ops/s | +1.55% |
| test_simple | 0.8008s | 0.7999s | 1.2502 Ops/s | 1.2264 Ops/s | +1.94% |
| test_transformed | 1.0596s | 1.0571s | 0.9460 Ops/s | 0.9385 Ops/s | +0.79% |
| test_serial | 2.5346s | 2.4786s | 0.4035 Ops/s | 0.4046 Ops/s | -0.29% |
| test_parallel | 2.4600s | 2.3741s | 0.4212 Ops/s | 0.4256 Ops/s | -1.04% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1010ms | 33.4555μs | 29.8905 KOps/s | 29.2896 KOps/s | +2.05% |
| test_step_mdp_speed[True-True-True-True-False] | 40.1330μs | 19.9878μs | 50.0306 KOps/s | 50.1769 KOps/s | -0.29% |
| test_step_mdp_speed[True-True-True-False-True] | 0.1349ms | 19.1519μs | 52.2142 KOps/s | 51.3149 KOps/s | +1.75% |
| test_step_mdp_speed[True-True-True-False-False] | 29.4620μs | 11.5176μs | 86.8235 KOps/s | 87.2700 KOps/s | -0.51% |
| test_step_mdp_speed[True-True-False-True-True] | 61.8440μs | 35.1827μs | 28.4231 KOps/s | 27.9736 KOps/s | +1.61% |
| test_step_mdp_speed[True-True-False-True-False] | 46.7030μs | 22.0893μs | 45.2709 KOps/s | 45.8459 KOps/s | -1.25% |
| test_step_mdp_speed[True-True-False-False-True] | 0.1023ms | 21.0765μs | 47.4461 KOps/s | 47.7042 KOps/s | -0.54% |
| test_step_mdp_speed[True-True-False-False-False] | 31.3720μs | 13.5148μs | 73.9927 KOps/s | 74.3783 KOps/s | -0.52% |
| test_step_mdp_speed[True-False-True-True-True] | 64.1340μs | 37.7706μs | 26.4756 KOps/s | 26.3027 KOps/s | +0.66% |
| test_step_mdp_speed[True-False-True-True-False] | 0.1514ms | 24.1030μs | 41.4886 KOps/s | 41.5306 KOps/s | -0.10% |
| test_step_mdp_speed[True-False-True-False-True] | 44.3830μs | 21.1255μs | 47.3361 KOps/s | 47.1854 KOps/s | +0.32% |
| test_step_mdp_speed[True-False-True-False-False] | 83.8140μs | 13.3618μs | 74.8404 KOps/s | 74.6633 KOps/s | +0.24% |
| test_step_mdp_speed[True-False-False-True-True] | 68.0930μs | 39.0876μs | 25.5835 KOps/s | 25.3998 KOps/s | +0.72% |
| test_step_mdp_speed[True-False-False-True-False] | 49.2430μs | 25.5100μs | 39.2002 KOps/s | 39.2992 KOps/s | -0.25% |
| test_step_mdp_speed[True-False-False-False-True] | 47.6820μs | 22.8635μs | 43.7379 KOps/s | 43.3924 KOps/s | +0.80% |
| test_step_mdp_speed[True-False-False-False-False] | 31.8810μs | 15.1842μs | 65.8580 KOps/s | 65.5647 KOps/s | +0.45% |
| test_step_mdp_speed[False-True-True-True-True] | 56.8930μs | 37.4347μs | 26.7132 KOps/s | 26.4089 KOps/s | +1.15% |
| test_step_mdp_speed[False-True-True-True-False] | 40.6120μs | 23.7425μs | 42.1185 KOps/s | 41.8205 KOps/s | +0.71% |
| test_step_mdp_speed[False-True-True-False-True] | 51.8130μs | 24.7354μs | 40.4279 KOps/s | 39.3069 KOps/s | +2.85% |
| test_step_mdp_speed[False-True-True-False-False] | 45.6530μs | 15.1354μs | 66.0703 KOps/s | 65.2509 KOps/s | +1.26% |
| test_step_mdp_speed[False-True-False-True-True] | 63.9130μs | 39.2388μs | 25.4850 KOps/s | 25.2994 KOps/s | +0.73% |
| test_step_mdp_speed[False-True-False-True-False] | 42.3020μs | 25.6981μs | 38.9134 KOps/s | 38.5835 KOps/s | +0.85% |
| test_step_mdp_speed[False-True-False-False-True] | 54.9430μs | 27.1051μs | 36.8934 KOps/s | 37.0126 KOps/s | -0.32% |
| test_step_mdp_speed[False-True-False-False-False] | 67.7840μs | 17.4767μs | 57.2191 KOps/s | 58.8556 KOps/s | -2.78% |
| test_step_mdp_speed[False-False-True-True-True] | 71.1940μs | 41.2004μs | 24.2716 KOps/s | 24.1365 KOps/s | +0.56% |
| test_step_mdp_speed[False-False-True-True-False] | 55.9930μs | 27.9558μs | 35.7708 KOps/s | 35.9098 KOps/s | -0.39% |
| test_step_mdp_speed[False-False-True-False-True] | 46.1230μs | 26.6698μs | 37.4955 KOps/s | 36.4989 KOps/s | +2.73% |
| test_step_mdp_speed[False-False-True-False-False] | 40.4020μs | 17.1529μs | 58.2993 KOps/s | 58.8581 KOps/s | -0.95% |
| test_step_mdp_speed[False-False-False-True-True] | 55.1730μs | 42.8731μs | 23.3246 KOps/s | 23.0284 KOps/s | +1.29% |
| test_step_mdp_speed[False-False-False-True-False] | 52.2220μs | 29.9898μs | 33.3447 KOps/s | 33.8118 KOps/s | -1.38% |
| test_step_mdp_speed[False-False-False-False-True] | 48.2230μs | 28.3736μs | 35.2440 KOps/s | 34.5587 KOps/s | +1.98% |
| test_step_mdp_speed[False-False-False-False-False] | 44.0220μs | 18.8127μs | 53.1556 KOps/s | 53.1100 KOps/s | +0.09% |
| test_values[generalized_advantage_estimate-True-True] | 25.5266ms | 24.9726ms | 40.0439 Ops/s | 41.4161 Ops/s | -3.31% |
| test_values[vec_generalized_advantage_estimate-True-True] | 96.9234ms | 2.8332ms | 352.9531 Ops/s | 368.5639 Ops/s | -4.24% |
| test_values[td0_return_estimate-False-False] | 97.7350μs | 65.2003μs | 15.3373 KOps/s | 15.4697 KOps/s | -0.86% |
| test_values[td1_return_estimate-False-False] | 55.6692ms | 54.0821ms | 18.4904 Ops/s | 18.5996 Ops/s | -0.59% |
| test_values[vec_td1_return_estimate-False-False] | 1.3971ms | 1.0782ms | 927.4375 Ops/s | 929.3813 Ops/s | -0.21% |
| test_values[td_lambda_return_estimate-True-False] | 89.4433ms | 85.9742ms | 11.6314 Ops/s | 11.7161 Ops/s | -0.72% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4416ms | 1.0775ms | 928.1105 Ops/s | 931.3892 Ops/s | -0.35% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.7953ms | 24.6078ms | 40.6375 Ops/s | 40.8434 Ops/s | -0.50% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9260ms | 0.7093ms | 1.4098 KOps/s | 1.4134 KOps/s | -0.25% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8209ms | 0.6862ms | 1.4574 KOps/s | 1.5176 KOps/s | -3.97% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6444ms | 1.4625ms | 683.7705 Ops/s | 686.2009 Ops/s | -0.35% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7058ms | 0.6789ms | 1.4731 KOps/s | 1.4833 KOps/s | -0.69% |
| test_dqn_speed | 1.8205ms | 1.4842ms | 673.7522 Ops/s | 679.9730 Ops/s | -0.91% |
| test_ddpg_speed | 3.3291ms | 2.9895ms | 334.5052 Ops/s | 337.6228 Ops/s | -0.92% |
| test_sac_speed | 8.9559ms | 8.6360ms | 115.7946 Ops/s | 118.0422 Ops/s | -1.90% |
| test_redq_speed | 11.0362ms | 10.5798ms | 94.5194 Ops/s | 94.7365 Ops/s | -0.23% |
| test_redq_deprec_speed | 11.8976ms | 11.4893ms | 87.0372 Ops/s | 84.8070 Ops/s | +2.63% |
| test_td3_speed | 8.6529ms | 8.4446ms | 118.4185 Ops/s | 119.2979 Ops/s | -0.74% |
| test_cql_speed | 27.4719ms | 26.2154ms | 38.1455 Ops/s | 38.5513 Ops/s | -1.05% |
| test_a2c_speed | 6.0864ms | 5.7445ms | 174.0794 Ops/s | 176.4175 Ops/s | -1.33% |
| test_ppo_speed | 6.4652ms | 6.0835ms | 164.3783 Ops/s | 167.0778 Ops/s | -1.62% |
| test_reinforce_speed | 5.5984ms | 4.6991ms | 212.8047 Ops/s | 213.6308 Ops/s | -0.39% |
| test_iql_speed | 20.4532ms | 19.8515ms | 50.3741 Ops/s | 46.5865 Ops/s | **+8.13%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.0815ms | 4.9164ms | 203.4015 Ops/s | 204.0458 Ops/s | -0.32% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.1493ms | 0.5945ms | 1.6820 KOps/s | 1.6868 KOps/s | -0.28% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7089ms | 0.5694ms | 1.7563 KOps/s | 1.7679 KOps/s | -0.66% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1553ms | 4.8886ms | 204.5592 Ops/s | 206.5637 Ops/s | -0.97% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7709ms | 0.5882ms | 1.7002 KOps/s | 1.7071 KOps/s | -0.41% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.4896ms | 0.5698ms | 1.7549 KOps/s | 1.7879 KOps/s | -1.85% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.3155ms | 2.1247ms | 470.6603 Ops/s | 477.2347 Ops/s | -1.38% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 5.6731ms | 2.0619ms | 484.9931 Ops/s | 502.2892 Ops/s | -3.44% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1884ms | 5.0512ms | 197.9739 Ops/s | 200.7512 Ops/s | -1.38% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9061ms | 0.7221ms | 1.3849 KOps/s | 1.3939 KOps/s | -0.65% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8764ms | 0.6931ms | 1.4428 KOps/s | 1.4475 KOps/s | -0.32% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3193ms | 4.8727ms | 205.2248 Ops/s | 205.4396 Ops/s | -0.10% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.4084ms | 0.5950ms | 1.6808 KOps/s | 1.6765 KOps/s | +0.25% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7129ms | 0.5681ms | 1.7602 KOps/s | 1.7394 KOps/s | +1.20% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1271ms | 4.8705ms | 205.3197 Ops/s | 206.4564 Ops/s | -0.55% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6933ms | 0.5875ms | 1.7021 KOps/s | 1.6979 KOps/s | +0.25% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.3662ms | 0.5693ms | 1.7567 KOps/s | 1.7775 KOps/s | -1.17% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1835ms | 5.0427ms | 198.3067 Ops/s | 199.6355 Ops/s | -0.67% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8499ms | 0.7202ms | 1.3886 KOps/s | 1.3897 KOps/s | -0.08% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8733ms | 0.6968ms | 1.4352 KOps/s | 1.4291 KOps/s | +0.43% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1174s | 9.4566ms | 105.7463 Ops/s | 137.1759 Ops/s | **-22.91%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.9550ms | 16.5080ms | 60.5767 Ops/s | 60.8251 Ops/s | -0.41% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.4428ms | 1.2777ms | 782.6304 Ops/s | 742.9536 Ops/s | **+5.34%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1042s | 7.2463ms | 138.0018 Ops/s | 109.0255 Ops/s | **+26.58%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 18.7389ms | 16.3940ms | 60.9981 Ops/s | 60.7670 Ops/s | +0.38% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.3102ms | 1.3005ms | 768.9318 Ops/s | 740.6415 Ops/s | +3.82% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1069s | 9.4676ms | 105.6230 Ops/s | 135.0658 Ops/s | **-21.80%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.7894ms | 16.5339ms | 60.4817 Ops/s | 53.5989 Ops/s | **+12.84%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.4975ms | 1.4692ms | 680.6510 Ops/s | 663.0095 Ops/s | +2.66% |