pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License

[BugFix] Fix OOB sampling in PrioritizedSliceSampler #2239

Closed vmoens closed 2 weeks ago

vmoens commented 2 weeks ago

Closes #2230

Unfortunately, I'm not sure how to test this. I guess we should save a tree structure and mass somewhere so that we can replicate the issue (and possibly fix it directly in the C++ code; @xiaomengy, can you help with this?).
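As a rough illustration of the kind of regression fixture suggested above, the sketch below stores a small priority array together with the sampled mass values, reloads them, and checks that the prefix-sum lookup stays in bounds. It is not TorchRL's sum-tree code: NumPy's `cumsum`/`searchsorted` stand in for the C++ segment tree, the function names `prefix_sum_lookup` and `safe_prefix_sum_lookup` are hypothetical, and clamping the index is only one possible fix, assumed here for illustration.

```python
import io

import numpy as np


def prefix_sum_lookup(priorities: np.ndarray, mass: np.ndarray) -> np.ndarray:
    """For each mass value, return the first index whose cumulative priority exceeds it."""
    cumsum = np.cumsum(priorities)
    return np.searchsorted(cumsum, mass, side="right")


def safe_prefix_sum_lookup(priorities: np.ndarray, mass: np.ndarray) -> np.ndarray:
    """Same lookup, with the index clamped to the valid range (assumed fix)."""
    idx = prefix_sum_lookup(priorities, mass)
    return np.clip(idx, 0, len(priorities) - 1)


# A tiny "tree" (flat priorities) and a mass vector whose last entry equals
# the total priority, which is the edge case that walks past the last leaf.
priorities = np.array([0.5, 0.25, 0.25])
mass = np.array([0.0, 0.6, priorities.sum()])

# Save the fixture (an in-memory buffer is used here instead of a file on disk).
buffer = io.BytesIO()
np.savez(buffer, priorities=priorities, mass=mass)
buffer.seek(0)

# Reload the fixture and replay the lookup.
fixture = np.load(buffer)
raw = prefix_sum_lookup(fixture["priorities"], fixture["mass"])
safe = safe_prefix_sum_lookup(fixture["priorities"], fixture["mass"])

print("raw indices: ", raw.tolist())
print("safe indices:", safe.tolist())
```

In this toy case the unclamped lookup returns an index equal to the number of leaves for the last mass value, while the clamped version stays within range. A real test would need a tree and mass captured from an actual failing run.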

pytorch-bot[bot] commented 2 weeks ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2239

Note: Links to docs will display an error until the docs builds have been completed.

:x: 10 New Failures, 1 Unrelated Failure

As of commit 8438770c8d44addacbd003dbfc46c5619f0db164 with merge base c44a52144d2f2f8e802f460046edafdea2bc24a0:

NEW FAILURES - The following jobs have failed:

* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466421943) ([gh](https://github.com/pytorch/rl/actions/runs/9597414900/job/26466421943)) `curl: (22) The requested URL returned error:`
* [Lint / c-source / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466422527) ([gh](https://github.com/pytorch/rl/actions/runs/9597414892/job/26466422527)) `curl: (22) The requested URL returned error:`
* [RLHF Tests on Linux / unittests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466422776) ([gh](https://github.com/pytorch/rl/actions/runs/9597414913/job/26466422776)) `curl: (22) The requested URL returned error:`
* [Unit-tests on Linux / tests-cpu (3.10) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466440241) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466440241)) `curl: (22) The requested URL returned error:`
* [Unit-tests on Linux / tests-cpu (3.11) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466440811) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466440811)) `curl: (22) The requested URL returned error:`
* [Unit-tests on Linux / tests-gpu (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466439486) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466439486)) `RuntimeError: Command docker exec -t f26a4cc8dd373d5749029b5237e7cd05b88c9a2c4e02e5afbc917057507fadc1 /exec failed with exit code 1`
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466441753) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466441753)) `curl: (22) The requested URL returned error:`
* [Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466439858) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466439858)) `RuntimeError: Command docker exec -t 34b3f0b0792d14ed7fa688b3416a85678eef60d652ff4b37d1b82005546672cb /exec failed with exit code 1`
* [Unit-tests on Linux / tests-stable-gpu (3.10, 11.8) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466442113) ([gh](https://github.com/pytorch/rl/actions/runs/9597414907/job/26466442113)) `curl: (22) The requested URL returned error:`
* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466421742) ([gh](https://github.com/pytorch/rl/actions/runs/9597414911/job/26466421742)) `The process 'C:\Program Files\Git\cmd\git.exe' failed with exit code 128`

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2239#26466447015) ([gh](https://github.com/pytorch/rl/actions/runs/9597414922/job/26466447015)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/c44a52144d2f2f8e802f460046edafdea2bc24a0#26430824129)) `##[error]The operation was canceled.`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions[bot] commented 2 weeks ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}38$. Worsened: $\large\color{#d91a1a}15$.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1318s | 61.3574ms | 16.2980 Ops/s | 16.9326 Ops/s | $\color{#d91a1a}-3.75\\%$ |
| test_sync | 35.0064ms | 32.7731ms | 30.5128 Ops/s | 29.2820 Ops/s | $\color{#35bf28}+4.20\\%$ |
| test_async | 59.4381ms | 29.6312ms | 33.7482 Ops/s | 32.8915 Ops/s | $\color{#35bf28}+2.60\\%$ |
| test_simple | 0.3799s | 0.3745s | 2.6702 Ops/s | 2.5908 Ops/s | $\color{#35bf28}+3.06\\%$ |
| test_transformed | 0.5281s | 0.5236s | 1.9099 Ops/s | 1.8310 Ops/s | $\color{#35bf28}+4.31\\%$ |
| test_serial | 1.3106s | 1.2545s | 0.7971 Ops/s | 0.7757 Ops/s | $\color{#35bf28}+2.77\\%$ |
| test_parallel | 1.1837s | 1.1210s | 0.8920 Ops/s | 0.9273 Ops/s | $\color{#d91a1a}-3.80\\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2099ms | 22.5320μs | 44.3812 KOps/s | 41.5494 KOps/s | $\textbf{\color{#35bf28}+6.82\\%}$ |
| test_step_mdp_speed[True-True-True-True-False] | 72.1450μs | 13.2236μs | 75.6224 KOps/s | 70.1065 KOps/s | $\textbf{\color{#35bf28}+7.87\\%}$ |
| test_step_mdp_speed[True-True-True-False-True] | 43.3210μs | 13.1551μs | 76.0160 KOps/s | 71.1943 KOps/s | $\textbf{\color{#35bf28}+6.77\\%}$ |
| test_step_mdp_speed[True-True-True-False-False] | 47.7790μs | 7.6469μs | 130.7720 KOps/s | 120.5472 KOps/s | $\textbf{\color{#35bf28}+8.48\\%}$ |
| test_step_mdp_speed[True-True-False-True-True] | 61.3650μs | 23.8461μs | 41.9356 KOps/s | 39.2643 KOps/s | $\textbf{\color{#35bf28}+6.80\\%}$ |
| test_step_mdp_speed[True-True-False-True-False] | 39.6440μs | 14.5283μs | 68.8310 KOps/s | 63.8122 KOps/s | $\textbf{\color{#35bf28}+7.86\\%}$ |
| test_step_mdp_speed[True-True-False-False-True] | 40.3950μs | 14.3965μs | 69.4615 KOps/s | 64.8086 KOps/s | $\textbf{\color{#35bf28}+7.18\\%}$ |
| test_step_mdp_speed[True-True-False-False-False] | 32.8810μs | 8.9931μs | 111.1958 KOps/s | 102.8123 KOps/s | $\textbf{\color{#35bf28}+8.15\\%}$ |
| test_step_mdp_speed[True-False-True-True-True] | 90.6260μs | 25.3084μs | 39.5126 KOps/s | 37.4335 KOps/s | $\textbf{\color{#35bf28}+5.55\\%}$ |
| test_step_mdp_speed[True-False-True-True-False] | 67.9970μs | 16.0123μs | 62.4520 KOps/s | 59.2163 KOps/s | $\textbf{\color{#35bf28}+5.46\\%}$ |
| test_step_mdp_speed[True-False-True-False-True] | 83.4960μs | 14.4870μs | 69.0276 KOps/s | 65.9064 KOps/s | $\color{#35bf28}+4.74\\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 44.4130μs | 9.0052μs | 111.0466 KOps/s | 105.4339 KOps/s | $\textbf{\color{#35bf28}+5.32\\%}$ |
| test_step_mdp_speed[True-False-False-True-True] | 0.1061ms | 26.5565μs | 37.6556 KOps/s | 35.8027 KOps/s | $\textbf{\color{#35bf28}+5.18\\%}$ |
| test_step_mdp_speed[True-False-False-True-False] | 80.7010μs | 17.1559μs | 58.2891 KOps/s | 54.3908 KOps/s | $\textbf{\color{#35bf28}+7.17\\%}$ |
| test_step_mdp_speed[True-False-False-False-True] | 81.8830μs | 15.9025μs | 62.8833 KOps/s | 60.1860 KOps/s | $\color{#35bf28}+4.48\\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 71.6940μs | 10.5006μs | 95.2325 KOps/s | 90.3042 KOps/s | $\textbf{\color{#35bf28}+5.46\\%}$ |
| test_step_mdp_speed[False-True-True-True-True] | 63.2090μs | 25.4845μs | 39.2396 KOps/s | 36.5983 KOps/s | $\textbf{\color{#35bf28}+7.22\\%}$ |
| test_step_mdp_speed[False-True-True-True-False] | 77.0740μs | 16.0042μs | 62.4836 KOps/s | 58.4742 KOps/s | $\textbf{\color{#35bf28}+6.86\\%}$ |
| test_step_mdp_speed[False-True-True-False-True] | 92.7540μs | 16.9160μs | 59.1158 KOps/s | 55.4775 KOps/s | $\textbf{\color{#35bf28}+6.56\\%}$ |
| test_step_mdp_speed[False-True-True-False-False] | 47.3980μs | 10.2800μs | 97.2762 KOps/s | 90.5890 KOps/s | $\textbf{\color{#35bf28}+7.38\\%}$ |
| test_step_mdp_speed[False-True-False-True-True] | 85.4300μs | 26.6831μs | 37.4768 KOps/s | 35.1872 KOps/s | $\textbf{\color{#35bf28}+6.51\\%}$ |
| test_step_mdp_speed[False-True-False-True-False] | 54.4320μs | 17.1072μs | 58.4549 KOps/s | 53.3465 KOps/s | $\textbf{\color{#35bf28}+9.58\\%}$ |
| test_step_mdp_speed[False-True-False-False-True] | 94.3960μs | 18.1019μs | 55.2427 KOps/s | 52.6609 KOps/s | $\color{#35bf28}+4.90\\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 48.6310μs | 11.5554μs | 86.5398 KOps/s | 81.0884 KOps/s | $\textbf{\color{#35bf28}+6.72\\%}$ |
| test_step_mdp_speed[False-False-True-True-True] | 0.1072ms | 27.6719μs | 36.1378 KOps/s | 33.9041 KOps/s | $\textbf{\color{#35bf28}+6.59\\%}$ |
| test_step_mdp_speed[False-False-True-True-False] | 61.1740μs | 18.5491μs | 53.9111 KOps/s | 50.2889 KOps/s | $\textbf{\color{#35bf28}+7.20\\%}$ |
| test_step_mdp_speed[False-False-True-False-True] | 89.3670μs | 17.9660μs | 55.6607 KOps/s | 52.6724 KOps/s | $\textbf{\color{#35bf28}+5.67\\%}$ |
| test_step_mdp_speed[False-False-True-False-False] | 63.1780μs | 11.5454μs | 86.6142 KOps/s | 81.4575 KOps/s | $\textbf{\color{#35bf28}+6.33\\%}$ |
| test_step_mdp_speed[False-False-False-True-True] | 69.4500μs | 29.3055μs | 34.1233 KOps/s | 32.2546 KOps/s | $\textbf{\color{#35bf28}+5.79\\%}$ |
| test_step_mdp_speed[False-False-False-True-False] | 85.2900μs | 19.5685μs | 51.1026 KOps/s | 47.9100 KOps/s | $\textbf{\color{#35bf28}+6.66\\%}$ |
| test_step_mdp_speed[False-False-False-False-True] | 76.2460μs | 19.4593μs | 51.3893 KOps/s | 50.1599 KOps/s | $\color{#35bf28}+2.45\\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 47.9490μs | 12.6781μs | 78.8763 KOps/s | 74.5862 KOps/s | $\textbf{\color{#35bf28}+5.75\\%}$ |
| test_values[generalized_advantage_estimate-True-True] | 10.3525ms | 10.0144ms | 99.8561 Ops/s | 105.7614 Ops/s | $\textbf{\color{#d91a1a}-5.58\\%}$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 39.8261ms | 36.5262ms | 27.3776 Ops/s | 29.9462 Ops/s | $\textbf{\color{#d91a1a}-8.58\\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2618ms | 0.1773ms | 5.6404 KOps/s | 6.0439 KOps/s | $\textbf{\color{#d91a1a}-6.68\\%}$ |
| test_values[td1_return_estimate-False-False] | 26.6015ms | 23.5810ms | 42.4070 Ops/s | 41.3536 Ops/s | $\color{#35bf28}+2.55\\%$ |
| test_values[vec_td1_return_estimate-False-False] | 38.8931ms | 35.7592ms | 27.9648 Ops/s | 29.5976 Ops/s | $\textbf{\color{#d91a1a}-5.52\\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 36.6848ms | 33.6384ms | 29.7279 Ops/s | 29.1394 Ops/s | $\color{#35bf28}+2.02\\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.8390ms | 35.7156ms | 27.9990 Ops/s | 29.6182 Ops/s | $\textbf{\color{#d91a1a}-5.47\\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.3397ms | 8.2961ms | 120.5386 Ops/s | 120.1403 Ops/s | $\color{#35bf28}+0.33\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.7841ms | 2.0030ms | 499.2595 Ops/s | 555.0875 Ops/s | $\textbf{\color{#d91a1a}-10.06\\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4689ms | 0.3578ms | 2.7946 KOps/s | 2.7795 KOps/s | $\color{#35bf28}+0.55\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 46.2936ms | 44.7260ms | 22.3584 Ops/s | 21.9657 Ops/s | $\color{#35bf28}+1.79\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.6245ms | 3.0302ms | 330.0086 Ops/s | 330.5686 Ops/s | $\color{#d91a1a}-0.17\\%$ |
| test_dqn_speed | 6.0949ms | 1.3592ms | 735.7082 Ops/s | 725.5594 Ops/s | $\color{#35bf28}+1.40\\%$ |
| test_ddpg_speed | 3.1883ms | 2.8725ms | 348.1230 Ops/s | 344.2989 Ops/s | $\color{#35bf28}+1.11\\%$ |
| test_sac_speed | 9.6952ms | 8.4812ms | 117.9079 Ops/s | 111.5963 Ops/s | $\textbf{\color{#35bf28}+5.66\\%}$ |
| test_redq_speed | 14.9188ms | 13.3452ms | 74.9333 Ops/s | 71.0143 Ops/s | $\textbf{\color{#35bf28}+5.52\\%}$ |
| test_redq_deprec_speed | 97.8080ms | 14.5723ms | 68.6233 Ops/s | 70.6847 Ops/s | $\color{#d91a1a}-2.92\\%$ |
| test_td3_speed | 9.4790ms | 8.4525ms | 118.3079 Ops/s | 110.1768 Ops/s | $\textbf{\color{#35bf28}+7.38\\%}$ |
| test_cql_speed | 37.7974ms | 36.8489ms | 27.1379 Ops/s | 26.1088 Ops/s | $\color{#35bf28}+3.94\\%$ |
| test_a2c_speed | 16.3268ms | 7.6418ms | 130.8588 Ops/s | 100.8662 Ops/s | $\textbf{\color{#35bf28}+29.74\\%}$ |
| test_ppo_speed | 8.2877ms | 7.7053ms | 129.7815 Ops/s | 104.6806 Ops/s | $\textbf{\color{#35bf28}+23.98\\%}$ |
| test_reinforce_speed | 10.1887ms | 7.6603ms | 130.5437 Ops/s | 140.1846 Ops/s | $\textbf{\color{#d91a1a}-6.88\\%}$ |
| test_iql_speed | 36.9033ms | 34.2525ms | 29.1949 Ops/s | 28.4949 Ops/s | $\color{#35bf28}+2.46\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4699ms | 4.4973ms | 222.3573 Ops/s | 240.1035 Ops/s | $\textbf{\color{#d91a1a}-7.39\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.2462ms | 0.5456ms | 1.8329 KOps/s | 1.8397 KOps/s | $\color{#d91a1a}-0.37\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 1.1224ms | 0.5082ms | 1.9679 KOps/s | 1.9101 KOps/s | $\color{#35bf28}+3.02\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.8123ms | 4.4207ms | 226.2107 Ops/s | 242.5904 Ops/s | $\textbf{\color{#d91a1a}-6.75\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0394ms | 0.5171ms | 1.9340 KOps/s | 1.8601 KOps/s | $\color{#35bf28}+3.97\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6572ms | 0.5049ms | 1.9808 KOps/s | 1.9919 KOps/s | $\color{#d91a1a}-0.56\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.5390ms | 1.8478ms | 541.1796 Ops/s | 559.5705 Ops/s | $\color{#d91a1a}-3.29\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.4120ms | 1.7763ms | 562.9802 Ops/s | 591.2421 Ops/s | $\color{#d91a1a}-4.78\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.3627ms | 4.6374ms | 215.6358 Ops/s | 241.7700 Ops/s | $\textbf{\color{#d91a1a}-10.81\\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2522ms | 0.6551ms | 1.5265 KOps/s | 1.2964 KOps/s | $\textbf{\color{#35bf28}+17.75\\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0219ms | 0.6247ms | 1.6007 KOps/s | 1.7071 KOps/s | $\textbf{\color{#d91a1a}-6.23\\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.7352ms | 4.1706ms | 239.7747 Ops/s | 287.3745 Ops/s | $\textbf{\color{#d91a1a}-16.56\\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7467ms | 0.5339ms | 1.8730 KOps/s | 1.9841 KOps/s | $\textbf{\color{#d91a1a}-5.60\\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.0364ms | 0.5114ms | 1.9554 KOps/s | 2.1072 KOps/s | $\textbf{\color{#d91a1a}-7.20\\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.2310ms | 3.9335ms | 254.2257 Ops/s | 251.0100 Ops/s | $\color{#35bf28}+1.28\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6261ms | 0.5165ms | 1.9360 KOps/s | 1.8515 KOps/s | $\color{#35bf28}+4.56\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6813ms | 0.4827ms | 2.0715 KOps/s | 1.9645 KOps/s | $\textbf{\color{#35bf28}+5.44\\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.7724ms | 3.9754ms | 251.5481 Ops/s | 239.8997 Ops/s | $\color{#35bf28}+4.86\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1717ms | 0.6706ms | 1.4913 KOps/s | 1.5413 KOps/s | $\color{#d91a1a}-3.24\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9364ms | 0.6367ms | 1.5706 KOps/s | 1.5952 KOps/s | $\color{#d91a1a}-1.54\\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1319s | 6.5644ms | 152.3363 Ops/s | 148.1597 Ops/s | $\color{#35bf28}+2.82\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 15.5556ms | 12.9011ms | 77.5131 Ops/s | 76.5619 Ops/s | $\color{#35bf28}+1.24\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1087s | 3.2951ms | 303.4839 Ops/s | 927.0188 Ops/s | $\textbf{\color{#d91a1a}-67.26\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1118s | 6.0473ms | 165.3626 Ops/s | 111.6263 Ops/s | $\textbf{\color{#35bf28}+48.14\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 17.2366ms | 12.9756ms | 77.0676 Ops/s | 75.7753 Ops/s | $\color{#35bf28}+1.71\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 3.6774ms | 1.1533ms | 867.0482 Ops/s | 851.1034 Ops/s | $\color{#35bf28}+1.87\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1102s | 6.1304ms | 163.1206 Ops/s | 148.5676 Ops/s | $\textbf{\color{#35bf28}+9.80\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.1564ms | 12.6009ms | 79.3597 Ops/s | 74.7673 Ops/s | $\textbf{\color{#35bf28}+6.14\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.9217ms | 1.2391ms | 807.0114 Ops/s | 776.8816 Ops/s | $\color{#35bf28}+3.88\\%$ |

github-actions[bot] commented 2 weeks ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}16$.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1146s | 0.1141s | 8.7609 Ops/s | 8.8598 Ops/s | $\color{#d91a1a}-1.12\\%$ |
| test_sync | 0.1050s | 0.1024s | 9.7638 Ops/s | 9.6400 Ops/s | $\color{#35bf28}+1.28\\%$ |
| test_async | 0.1954s | 78.8844ms | 12.6768 Ops/s | 10.1433 Ops/s | $\textbf{\color{#35bf28}+24.98\\%}$ |
| test_single_pixels | 0.1237s | 0.1225s | 8.1666 Ops/s | 8.1653 Ops/s | $\color{#35bf28}+0.02\\%$ |
| test_sync_pixels | 83.0279ms | 80.8358ms | 12.3708 Ops/s | 12.8136 Ops/s | $\color{#d91a1a}-3.46\\%$ |
| test_async_pixels | 0.1537s | 66.9532ms | 14.9358 Ops/s | 14.9429 Ops/s | $\color{#d91a1a}-0.05\\%$ |
| test_simple | 0.8145s | 0.8040s | 1.2438 Ops/s | 1.2811 Ops/s | $\color{#d91a1a}-2.91\\%$ |
| test_transformed | 1.0541s | 1.0418s | 0.9599 Ops/s | 0.9665 Ops/s | $\color{#d91a1a}-0.68\\%$ |
| test_serial | 2.5016s | 2.4550s | 0.4073 Ops/s | 0.4172 Ops/s | $\color{#d91a1a}-2.36\\%$ |
| test_parallel | 2.4788s | 2.3616s | 0.4234 Ops/s | 0.4189 Ops/s | $\color{#35bf28}+1.09\\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2148ms | 31.5189μs | 31.7270 KOps/s | 32.8317 KOps/s | $\color{#d91a1a}-3.36\\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 0.1362ms | 18.4105μs | 54.3169 KOps/s | 56.1455 KOps/s | $\color{#d91a1a}-3.26\\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 45.7910μs | 17.7642μs | 56.2930 KOps/s | 56.5903 KOps/s | $\color{#d91a1a}-0.53\\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 43.6210μs | 10.4045μs | 96.1127 KOps/s | 97.9790 KOps/s | $\color{#d91a1a}-1.90\\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 66.6210μs | 33.2382μs | 30.0859 KOps/s | 31.2370 KOps/s | $\color{#d91a1a}-3.69\\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 49.8300μs | 19.8973μs | 50.2580 KOps/s | 52.3626 KOps/s | $\color{#d91a1a}-4.02\\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 52.7310μs | 19.3006μs | 51.8119 KOps/s | 52.1085 KOps/s | $\color{#d91a1a}-0.57\\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 39.7210μs | 12.1641μs | 82.2089 KOps/s | 85.6134 KOps/s | $\color{#d91a1a}-3.98\\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 65.3110μs | 35.0052μs | 28.5672 KOps/s | 30.1076 KOps/s | $\textbf{\color{#d91a1a}-5.12\\%}$ |
| test_step_mdp_speed[True-False-True-True-False] | 55.1010μs | 21.6957μs | 46.0921 KOps/s | 47.5605 KOps/s | $\color{#d91a1a}-3.09\\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 45.1810μs | 19.4560μs | 51.3980 KOps/s | 51.8150 KOps/s | $\color{#d91a1a}-0.80\\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 37.0210μs | 12.0570μs | 82.9396 KOps/s | 85.1394 KOps/s | $\color{#d91a1a}-2.58\\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 0.1645ms | 36.0397μs | 27.7472 KOps/s | 28.1881 KOps/s | $\color{#d91a1a}-1.56\\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 60.8320μs | 23.1867μs | 43.1282 KOps/s | 44.7624 KOps/s | $\color{#d91a1a}-3.65\\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 68.4010μs | 20.9347μs | 47.7676 KOps/s | 48.8914 KOps/s | $\color{#d91a1a}-2.30\\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 42.9600μs | 13.7024μs | 72.9800 KOps/s | 78.3969 KOps/s | $\textbf{\color{#d91a1a}-6.91\\%}$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.2301ms | 34.7105μs | 28.8097 KOps/s | 29.3608 KOps/s | $\color{#d91a1a}-1.88\\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 0.2034ms | 21.2543μs | 47.0494 KOps/s | 48.4139 KOps/s | $\color{#d91a1a}-2.82\\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 0.2192ms | 22.7094μs | 44.0347 KOps/s | 45.1929 KOps/s | $\color{#d91a1a}-2.56\\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 90.4820μs | 13.8338μs | 72.2865 KOps/s | 73.8764 KOps/s | $\color{#d91a1a}-2.15\\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 72.7220μs | 35.8647μs | 27.8826 KOps/s | 27.8742 KOps/s | $\color{#35bf28}+0.03\\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 64.2710μs | 22.8982μs | 43.6715 KOps/s | 44.1502 KOps/s | $\color{#d91a1a}-1.08\\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 56.3910μs | 24.1067μs | 41.4822 KOps/s | 41.3019 KOps/s | $\color{#35bf28}+0.44\\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 42.5210μs | 15.3523μs | 65.1368 KOps/s | 66.0612 KOps/s | $\color{#d91a1a}-1.40\\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 68.7420μs | 37.4364μs | 26.7120 KOps/s | 27.5805 KOps/s | $\color{#d91a1a}-3.15\\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 0.1264ms | 24.6004μs | 40.6497 KOps/s | 41.6306 KOps/s | $\color{#d91a1a}-2.36\\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 53.5110μs | 23.7107μs | 42.1750 KOps/s | 42.0483 KOps/s | $\color{#35bf28}+0.30\\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 41.3310μs | 15.3868μs | 64.9907 KOps/s | 66.7393 KOps/s | $\color{#d91a1a}-2.62\\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 55.9310μs | 39.5743μs | 25.2689 KOps/s | 25.0057 KOps/s | $\color{#35bf28}+1.05\\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 75.0720μs | 26.4817μs | 37.7619 KOps/s | 38.3279 KOps/s | $\color{#d91a1a}-1.48\\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 0.1359ms | 25.4587μs | 39.2792 KOps/s | 39.5849 KOps/s | $\color{#d91a1a}-0.77\\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 41.2710μs | 16.5601μs | 60.3861 KOps/s | 60.6911 KOps/s | $\color{#d91a1a}-0.50\\%$ |
| test_values[generalized_advantage_estimate-True-True] | 27.2530ms | 26.6300ms | 37.5516 Ops/s | 35.7470 Ops/s | $\textbf{\color{#35bf28}+5.05\\%}$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1090s | 3.0995ms | 322.6284 Ops/s | 340.3571 Ops/s | $\textbf{\color{#d91a1a}-5.21\\%}$ |
| test_values[td0_return_estimate-False-False] | 93.3910μs | 66.4304μs | 15.0533 KOps/s | 14.8726 KOps/s | $\color{#35bf28}+1.22\\%$ |
| test_values[td1_return_estimate-False-False] | 67.1181ms | 58.7469ms | 17.0222 Ops/s | 16.3834 Ops/s | $\color{#35bf28}+3.90\\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.4245ms | 1.1122ms | 899.1033 Ops/s | 894.3683 Ops/s | $\color{#35bf28}+0.53\\%$ |
| test_values[td_lambda_return_estimate-True-False] | 96.4680ms | 92.9338ms | 10.7603 Ops/s | 10.2169 Ops/s | $\textbf{\color{#35bf28}+5.32\\%}$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4414ms | 1.1072ms | 903.1894 Ops/s | 883.9709 Ops/s | $\color{#35bf28}+2.17\\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 28.3139ms | 27.8611ms | 35.8923 Ops/s | 35.8777 Ops/s | $\color{#35bf28}+0.04\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9927ms | 0.7767ms | 1.2874 KOps/s | 1.3154 KOps/s | $\color{#d91a1a}-2.13\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8801ms | 0.7149ms | 1.3988 KOps/s | 1.4317 KOps/s | $\color{#d91a1a}-2.30\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6870ms | 1.5199ms | 657.9285 Ops/s | 660.8648 Ops/s | $\color{#d91a1a}-0.44\\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9274ms | 0.7345ms | 1.3614 KOps/s | 1.3527 KOps/s | $\color{#35bf28}+0.65\\%$ |
| test_dqn_speed | 80.1057ms | 1.5703ms | 636.8400 Ops/s | 695.7407 Ops/s | $\textbf{\color{#d91a1a}-8.47\\%}$ |
| test_ddpg_speed | 3.2956ms | 2.9262ms | 341.7376 Ops/s | 347.7255 Ops/s | $\color{#d91a1a}-1.72\\%$ |
| test_sac_speed | 8.9392ms | 8.4231ms | 118.7217 Ops/s | 120.6895 Ops/s | $\color{#d91a1a}-1.63\\%$ |
| test_redq_speed | 17.6406ms | 11.0625ms | 90.3952 Ops/s | 93.5945 Ops/s | $\color{#d91a1a}-3.42\\%$ |
| test_redq_deprec_speed | 12.1435ms | 11.6558ms | 85.7943 Ops/s | 86.5440 Ops/s | $\color{#d91a1a}-0.87\\%$ |
| test_td3_speed | 18.0573ms | 8.4058ms | 118.9659 Ops/s | 120.2432 Ops/s | $\color{#d91a1a}-1.06\\%$ |
| test_cql_speed | 29.5751ms | 26.3069ms | 38.0129 Ops/s | 38.2363 Ops/s | $\color{#d91a1a}-0.58\\%$ |
| test_a2c_speed | 6.0594ms | 5.7284ms | 174.5683 Ops/s | 175.5867 Ops/s | $\color{#d91a1a}-0.58\\%$ |
| test_ppo_speed | 6.7681ms | 6.0861ms | 164.3078 Ops/s | 164.8776 Ops/s | $\color{#d91a1a}-0.35\\%$ |
| test_reinforce_speed | 5.0840ms | 4.6373ms | 215.6425 Ops/s | 215.1587 Ops/s | $\color{#35bf28}+0.22\\%$ |
| test_iql_speed | 20.7466ms | 20.1474ms | 49.6341 Ops/s | 49.4397 Ops/s | $\color{#35bf28}+0.39\\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.6603ms | 4.4533ms | 224.5544 Ops/s | 221.4917 Ops/s | $\color{#35bf28}+1.38\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.3406ms | 0.4111ms | 2.4325 KOps/s | 3.1797 KOps/s | $\textbf{\color{#d91a1a}-23.50\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6138ms | 0.3891ms | 2.5701 KOps/s | 3.4534 KOps/s | $\textbf{\color{#d91a1a}-25.58\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.7274ms | 4.4613ms | 224.1509 Ops/s | 223.4503 Ops/s | $\color{#35bf28}+0.31\\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6114ms | 0.4054ms | 2.4670 KOps/s | 3.2356 KOps/s | $\textbf{\color{#d91a1a}-23.76\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 9.4202ms | 0.3823ms | 2.6159 KOps/s | 3.5111 KOps/s | $\textbf{\color{#d91a1a}-25.50\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.9007ms | 1.7300ms | 578.0181 Ops/s | 644.5667 Ops/s | $\textbf{\color{#d91a1a}-10.32\\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.8988ms | 1.6361ms | 611.1935 Ops/s | 686.8405 Ops/s | $\textbf{\color{#d91a1a}-11.01\\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.7567ms | 4.5912ms | 217.8066 Ops/s | 217.1305 Ops/s | $\color{#35bf28}+0.31\\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7066ms | 0.5377ms | 1.8597 KOps/s | 2.0697 KOps/s | $\textbf{\color{#d91a1a}-10.14\\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 9.6423ms | 0.4530ms | 2.2073 KOps/s | 2.1376 KOps/s | $\color{#35bf28}+3.26\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.0556ms | 4.4922ms | 222.6071 Ops/s | 223.1150 Ops/s | $\color{#d91a1a}-0.23\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.4986ms | 0.3205ms | 3.1203 KOps/s | 3.1719 KOps/s | $\color{#d91a1a}-1.63\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 9.4910ms | 0.3232ms | 3.0943 KOps/s | 3.4296 KOps/s | $\textbf{\color{#d91a1a}-9.78\\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.7095ms | 4.4579ms | 224.3212 Ops/s | 224.9549 Ops/s | $\color{#d91a1a}-0.28\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5777ms | 0.3907ms | 2.5593 KOps/s | 2.4500 KOps/s | $\color{#35bf28}+4.46\\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 9.5799ms | 0.3003ms | 3.3300 KOps/s | 3.4433 KOps/s | $\color{#d91a1a}-3.29\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.8066ms | 4.6089ms | 216.9702 Ops/s | 217.2290 Ops/s | $\color{#d91a1a}-0.12\\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3186ms | 0.5533ms | 1.8074 KOps/s | 2.2691 KOps/s | $\textbf{\color{#d91a1a}-20.35\\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7507ms | 0.5356ms | 1.8671 KOps/s | 2.4303 KOps/s | $\textbf{\color{#d91a1a}-23.17\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1448s | 8.0295ms | 124.5402 Ops/s | 125.7742 Ops/s | $\color{#d91a1a}-0.98\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 20.7034ms | 16.0327ms | 62.3725 Ops/s | 62.4076 Ops/s | $\color{#d91a1a}-0.06\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.0354ms | 0.9246ms | 1.0815 KOps/s | 840.8454 Ops/s | $\textbf{\color{#35bf28}+28.62\\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1230s | 7.6263ms | 131.1246 Ops/s | 130.4628 Ops/s | $\color{#35bf28}+0.51\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1341s | 18.3623ms | 54.4593 Ops/s | 54.5766 Ops/s | $\color{#d91a1a}-0.22\\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 8.2470ms | 1.0252ms | 975.4576 Ops/s | 1.0462 KOps/s | $\textbf{\color{#d91a1a}-6.76\\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1249s | 7.8202ms | 127.8737 Ops/s | 127.7830 Ops/s | $\color{#35bf28}+0.07\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 20.9688ms | 16.0789ms | 62.1935 Ops/s | 62.3377 Ops/s | $\color{#d91a1a}-0.23\\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.6962ms | 1.2222ms | 818.1979 Ops/s | 892.2030 Ops/s | $\textbf{\color{#d91a1a}-8.29\\%}$ |