pytorch/tensordict
TensorDict is a tensor container dedicated to PyTorch.
[BugFix] Fix torch version assertion #917
Closed
vmoens closed 3 months ago
github-actions[bot] commented 3 months ago
⚠ Warning: Result of CPU Benchmark Tests
Total Benchmarks: 213. Improved: 19. Worsened: 8.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change | | ------------------------------------------------- | --------- | --------- | --------------- | ------------------ | ------------------------------------ | | test_plain_set_nested | 41.2170μs | 21.4924μs | 46.5280 KOps/s | 44.2512 KOps/s | $\textbf{\color{#35bf28}+5.15\\%}$ | | test_plain_set_stack_nested | 58.4700μs | 22.0067μs | 45.4406 KOps/s | 43.3144 KOps/s | $\color{#35bf28}+4.91\\%$ | | test_plain_set_nested_inplace | 83.6460μs | 23.5865μs | 42.3972 KOps/s | 40.2580 KOps/s | $\textbf{\color{#35bf28}+5.31\\%}$ | | test_plain_set_stack_nested_inplace | 0.1141ms | 23.7099μs | 42.1765 KOps/s | 40.0652 KOps/s | $\textbf{\color{#35bf28}+5.27\\%}$ | | test_items | 18.5640μs | 2.7519μs | 363.3841 KOps/s | 389.0732 KOps/s | $\textbf{\color{#d91a1a}-6.60\\%}$ | | test_items_nested | 0.5662ms | 0.3297ms | 3.0327 KOps/s | 2.9302 KOps/s | $\color{#35bf28}+3.50\\%$ | | test_items_nested_locked | 2.2606ms | 0.3304ms | 3.0266 KOps/s | 2.9691 KOps/s | $\color{#35bf28}+1.94\\%$ | | test_items_nested_leaf | 0.1340ms | 86.1337μs | 11.6099 KOps/s | 11.6558 KOps/s | $\color{#d91a1a}-0.39\\%$ | | test_items_stack_nested | 0.4164ms | 0.3321ms | 3.0108 KOps/s | 2.9364 KOps/s | $\color{#35bf28}+2.53\\%$ | | test_items_stack_nested_leaf | 0.1908ms | 87.5155μs | 11.4265 KOps/s | 11.5394 KOps/s | $\color{#d91a1a}-0.98\\%$ | | test_items_stack_nested_locked | 0.3890ms | 0.3307ms | 3.0237 KOps/s | 2.9250 KOps/s | $\color{#35bf28}+3.37\\%$ | | test_keys | 30.3870μs | 3.9314μs | 254.3651 KOps/s | 252.5564 KOps/s | $\color{#35bf28}+0.72\\%$ | | test_keys_nested | 0.2373ms | 0.1428ms | 7.0014 KOps/s | 6.8431 KOps/s | $\color{#35bf28}+2.31\\%$ | | test_keys_nested_locked | 0.7869ms | 0.1495ms | 6.6906 KOps/s | 6.6089 KOps/s | $\color{#35bf28}+1.24\\%$ | | test_keys_nested_leaf | 0.1736ms | 0.1239ms | 8.0679 KOps/s | 7.8972 KOps/s | $\color{#35bf28}+2.16\\%$ | | test_keys_stack_nested | 0.3247ms | 0.1451ms | 6.8934 KOps/s | 6.8878 KOps/s 
| $\color{#35bf28}+0.08\\%$ | | test_keys_stack_nested_leaf | 0.1728ms | 0.1232ms | 8.1144 KOps/s | 7.9458 KOps/s | $\color{#35bf28}+2.12\\%$ | | test_keys_stack_nested_locked | 0.2180ms | 0.1484ms | 6.7366 KOps/s | 6.5498 KOps/s | $\color{#35bf28}+2.85\\%$ | | test_values | 11.0230μs | 1.1716μs | 853.5013 KOps/s | 852.6122 KOps/s | $\color{#35bf28}+0.10\\%$ | | test_values_nested | 95.8980μs | 50.2474μs | 19.9015 KOps/s | 19.4143 KOps/s | $\color{#35bf28}+2.51\\%$ | | test_values_nested_locked | 89.5470μs | 50.4701μs | 19.8137 KOps/s | 19.3435 KOps/s | $\color{#35bf28}+2.43\\%$ | | test_values_nested_leaf | 0.1099ms | 45.1170μs | 22.1646 KOps/s | 21.3900 KOps/s | $\color{#35bf28}+3.62\\%$ | | test_values_stack_nested | 0.1038ms | 50.3877μs | 19.8461 KOps/s | 18.9862 KOps/s | $\color{#35bf28}+4.53\\%$ | | test_values_stack_nested_leaf | 93.3140μs | 45.2078μs | 22.1201 KOps/s | 21.1815 KOps/s | $\color{#35bf28}+4.43\\%$ | | test_values_stack_nested_locked | 0.1053ms | 50.6045μs | 19.7611 KOps/s | 18.7754 KOps/s | $\textbf{\color{#35bf28}+5.25\\%}$ | | test_membership | 4.5600μs | 0.7590μs | 1.3176 MOps/s | 1.0855 MOps/s | $\textbf{\color{#35bf28}+21.39\\%}$ | | test_membership_nested | 28.6030μs | 2.6358μs | 379.3888 KOps/s | 385.3165 KOps/s | $\color{#d91a1a}-1.54\\%$ | | test_membership_nested_leaf | 31.5190μs | 2.6432μs | 378.3307 KOps/s | 380.4242 KOps/s | $\color{#d91a1a}-0.55\\%$ | | test_membership_stacked_nested | 21.6010μs | 2.5884μs | 386.3370 KOps/s | 384.8389 KOps/s | $\color{#35bf28}+0.39\\%$ | | test_membership_stacked_nested_leaf | 25.9080μs | 2.6288μs | 380.3976 KOps/s | 383.2666 KOps/s | $\color{#d91a1a}-0.75\\%$ | | test_membership_nested_last | 36.7380μs | 3.9496μs | 253.1918 KOps/s | 252.1220 KOps/s | $\color{#35bf28}+0.42\\%$ | | test_membership_nested_leaf_last | 44.1530μs | 3.8999μs | 256.4190 KOps/s | 253.0128 KOps/s | $\color{#35bf28}+1.35\\%$ | | test_membership_stacked_nested_last | 31.8190μs | 3.9189μs | 255.1734 KOps/s | 256.7292 KOps/s 
| $\color{#d91a1a}-0.61\\%$ | | test_membership_stacked_nested_leaf_last | 49.2920μs | 3.9262μs | 254.6965 KOps/s | 254.3001 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_nested_getleaf | 61.1960μs | 10.6042μs | 94.3026 KOps/s | 96.2290 KOps/s | $\color{#d91a1a}-2.00\\%$ | | test_nested_get | 46.3340μs | 9.8185μs | 101.8486 KOps/s | 99.3009 KOps/s | $\color{#35bf28}+2.57\\%$ | | test_stacked_getleaf | 56.7460μs | 10.4838μs | 95.3850 KOps/s | 94.8079 KOps/s | $\color{#35bf28}+0.61\\%$ | | test_stacked_get | 55.2330μs | 10.0773μs | 99.2331 KOps/s | 99.9019 KOps/s | $\color{#d91a1a}-0.67\\%$ | | test_nested_getitemleaf | 40.5860μs | 11.1112μs | 89.9997 KOps/s | 89.0682 KOps/s | $\color{#35bf28}+1.05\\%$ | | test_nested_getitem | 51.5160μs | 10.2183μs | 97.8636 KOps/s | 97.1264 KOps/s | $\color{#35bf28}+0.76\\%$ | | test_stacked_getitemleaf | 41.0770μs | 11.1816μs | 89.4324 KOps/s | 90.4551 KOps/s | $\color{#d91a1a}-1.13\\%$ | | test_stacked_getitem | 41.3170μs | 10.3047μs | 97.0427 KOps/s | 97.1546 KOps/s | $\color{#d91a1a}-0.12\\%$ | | test_lock_nested | 6.9480ms | 0.5039ms | 1.9844 KOps/s | 1.9910 KOps/s | $\color{#d91a1a}-0.33\\%$ | | test_lock_stack_nested | 0.8795ms | 0.4705ms | 2.1254 KOps/s | 2.1393 KOps/s | $\color{#d91a1a}-0.65\\%$ | | test_unlock_nested | 85.3132ms | 0.5012ms | 1.9954 KOps/s | 2.4122 KOps/s | $\textbf{\color{#d91a1a}-17.28\\%}$ | | test_unlock_stack_nested | 0.8081ms | 0.3859ms | 2.5916 KOps/s | 2.6417 KOps/s | $\color{#d91a1a}-1.90\\%$ | | test_flatten_speed | 0.5983ms | 0.1075ms | 9.3004 KOps/s | 9.5763 KOps/s | $\color{#d91a1a}-2.88\\%$ | | test_unflatten_speed | 0.7538ms | 0.4363ms | 2.2922 KOps/s | 2.3106 KOps/s | $\color{#d91a1a}-0.80\\%$ | | test_common_ops | 4.6816ms | 1.0870ms | 919.9313 Ops/s | 899.4513 Ops/s | $\color{#35bf28}+2.28\\%$ | | test_creation | 18.4950μs | 2.0847μs | 479.6856 KOps/s | 452.8328 KOps/s | $\textbf{\color{#35bf28}+5.93\\%}$ | | test_creation_empty | 90.9220μs | 18.0888μs | 55.2828 KOps/s | 51.5838 KOps/s | 
$\textbf{\color{#35bf28}+7.17\\%}$ | | test_creation_nested_1 | 62.9080μs | 21.2880μs | 46.9747 KOps/s | 43.9787 KOps/s | $\textbf{\color{#35bf28}+6.81\\%}$ | | test_creation_nested_2 | 57.0560μs | 24.6609μs | 40.5501 KOps/s | 38.3141 KOps/s | $\textbf{\color{#35bf28}+5.84\\%}$ | | test_clone | 0.1168ms | 16.7515μs | 59.6963 KOps/s | 61.8243 KOps/s | $\color{#d91a1a}-3.44\\%$ | | test_getitem[int] | 1.2669ms | 16.5975μs | 60.2502 KOps/s | 61.2573 KOps/s | $\color{#d91a1a}-1.64\\%$ | | test_getitem[slice_int] | 0.1426ms | 32.0203μs | 31.2301 KOps/s | 31.4948 KOps/s | $\color{#d91a1a}-0.84\\%$ | | test_getitem[range] | 0.2159ms | 57.3566μs | 17.4348 KOps/s | 17.8454 KOps/s | $\color{#d91a1a}-2.30\\%$ | | test_getitem[tuple] | 0.1443ms | 25.6297μs | 39.0172 KOps/s | 40.1289 KOps/s | $\color{#d91a1a}-2.77\\%$ | | test_getitem[list] | 0.2570ms | 51.7408μs | 19.3271 KOps/s | 19.4574 KOps/s | $\color{#d91a1a}-0.67\\%$ | | test_setitem_dim[int] | 84.5870μs | 40.5507μs | 24.6605 KOps/s | 24.6699 KOps/s | $\color{#d91a1a}-0.04\\%$ | | test_setitem_dim[slice_int] | 0.1162ms | 72.3974μs | 13.8127 KOps/s | 13.9764 KOps/s | $\color{#d91a1a}-1.17\\%$ | | test_setitem_dim[range] | 0.2020ms | 93.3373μs | 10.7138 KOps/s | 10.8022 KOps/s | $\color{#d91a1a}-0.82\\%$ | | test_setitem_dim[tuple] | 0.1063ms | 59.1859μs | 16.8959 KOps/s | 17.2583 KOps/s | $\color{#d91a1a}-2.10\\%$ | | test_setitem | 0.1476ms | 29.1975μs | 34.2496 KOps/s | 34.2815 KOps/s | $\color{#d91a1a}-0.09\\%$ | | test_set | 0.1238ms | 28.3114μs | 35.3215 KOps/s | 34.6077 KOps/s | $\color{#35bf28}+2.06\\%$ | | test_set_shared | 4.4089ms | 0.2187ms | 4.5731 KOps/s | 4.6562 KOps/s | $\color{#d91a1a}-1.79\\%$ | | test_update | 0.1925ms | 35.2678μs | 28.3544 KOps/s | 27.0858 KOps/s | $\color{#35bf28}+4.68\\%$ | | test_update_nested | 0.1502ms | 45.1782μs | 22.1346 KOps/s | 21.5300 KOps/s | $\color{#35bf28}+2.81\\%$ | | test_update__nested | 0.1425ms | 33.3727μs | 29.9646 KOps/s | 30.3759 KOps/s | $\color{#d91a1a}-1.35\\%$ 
| | test_set_nested | 0.1054ms | 30.7397μs | 32.5313 KOps/s | 32.0279 KOps/s | $\color{#35bf28}+1.57\\%$ | | test_set_nested_new | 0.1563ms | 35.8121μs | 27.9236 KOps/s | 27.6035 KOps/s | $\color{#35bf28}+1.16\\%$ | | test_select | 1.0924ms | 52.4995μs | 19.0478 KOps/s | 19.2770 KOps/s | $\color{#d91a1a}-1.19\\%$ | | test_select_nested | 0.1400ms | 59.4332μs | 16.8256 KOps/s | 17.0086 KOps/s | $\color{#d91a1a}-1.08\\%$ | | test_exclude_nested | 0.1313ms | 76.9193μs | 13.0006 KOps/s | 12.9380 KOps/s | $\color{#35bf28}+0.48\\%$ | | test_empty[True] | 0.9096ms | 0.3242ms | 3.0849 KOps/s | 3.0973 KOps/s | $\color{#d91a1a}-0.40\\%$ | | test_empty[False] | 32.4500μs | 1.3325μs | 750.4414 KOps/s | 847.8068 KOps/s | $\textbf{\color{#d91a1a}-11.48\\%}$ | | test_unbind_speed | 0.3702ms | 0.3065ms | 3.2629 KOps/s | 3.2744 KOps/s | $\color{#d91a1a}-0.35\\%$ | | test_unbind_speed_stack0 | 0.4355ms | 0.3031ms | 3.2997 KOps/s | 3.3594 KOps/s | $\color{#d91a1a}-1.78\\%$ | | test_unbind_speed_stack1 | 89.3736ms | 0.7958ms | 1.2565 KOps/s | 1.3851 KOps/s | $\textbf{\color{#d91a1a}-9.28\\%}$ | | test_split | 88.2951ms | 2.1306ms | 469.3446 Ops/s | 472.8040 Ops/s | $\color{#d91a1a}-0.73\\%$ | | test_chunk | 86.7099ms | 2.1300ms | 469.4894 Ops/s | 471.1353 Ops/s | $\color{#d91a1a}-0.35\\%$ | | test_creation[device0] | 0.2487ms | 0.1191ms | 8.3969 KOps/s | 8.4993 KOps/s | $\color{#d91a1a}-1.20\\%$ | | test_creation_from_tensor | 4.9432ms | 0.1214ms | 8.2350 KOps/s | 8.3531 KOps/s | $\color{#d91a1a}-1.41\\%$ | | test_add_one[memmap_tensor0] | 0.2364ms | 8.3708μs | 119.4627 KOps/s | 126.9046 KOps/s | $\textbf{\color{#d91a1a}-5.86\\%}$ | | test_contiguous[memmap_tensor0] | 30.4270μs | 2.0465μs | 488.6482 KOps/s | 491.5271 KOps/s | $\color{#d91a1a}-0.59\\%$ | | test_stack[memmap_tensor0] | 42.3390μs | 5.9479μs | 168.1264 KOps/s | 174.8376 KOps/s | $\color{#d91a1a}-3.84\\%$ | | test_memmaptd_index | 1.0430ms | 0.4117ms | 2.4289 KOps/s | 2.4070 KOps/s | $\color{#35bf28}+0.91\\%$ | | 
test_memmaptd_index_astensor | 0.9076ms | 0.4935ms | 2.0264 KOps/s | 2.0259 KOps/s | $\color{#35bf28}+0.03\\%$ | | test_memmaptd_index_op | 1.8630ms | 1.0549ms | 947.9876 Ops/s | 943.2386 Ops/s | $\color{#35bf28}+0.50\\%$ | | test_serialize_model | 0.1360s | 0.1294s | 7.7306 Ops/s | 6.8864 Ops/s | $\textbf{\color{#35bf28}+12.26\\%}$ | | test_serialize_model_pickle | 0.4412s | 0.3920s | 2.5513 Ops/s | 2.5117 Ops/s | $\color{#35bf28}+1.58\\%$ | | test_serialize_weights | 0.1376s | 0.1244s | 8.0360 Ops/s | 7.8362 Ops/s | $\color{#35bf28}+2.55\\%$ | | test_serialize_weights_returnearly | 0.1833s | 0.1670s | 5.9895 Ops/s | 5.8761 Ops/s | $\color{#35bf28}+1.93\\%$ | | test_serialize_weights_pickle | 1.2974s | 0.7448s | 1.3427 Ops/s | 2.3441 Ops/s | $\textbf{\color{#d91a1a}-42.72\\%}$ | | test_serialize_weights_filesystem | 0.2378s | 0.1583s | 6.3186 Ops/s | 6.2731 Ops/s | $\color{#35bf28}+0.73\\%$ | | test_serialize_model_filesystem | 0.1589s | 0.1475s | 6.7787 Ops/s | 6.4752 Ops/s | $\color{#35bf28}+4.69\\%$ | | test_reshape_pytree | 98.6940μs | 40.1170μs | 24.9271 KOps/s | 23.9297 KOps/s | $\color{#35bf28}+4.17\\%$ | | test_reshape_td | 95.6790μs | 46.4048μs | 21.5495 KOps/s | 21.4015 KOps/s | $\color{#35bf28}+0.69\\%$ | | test_view_pytree | 0.1256ms | 40.8122μs | 24.5025 KOps/s | 24.5922 KOps/s | $\color{#d91a1a}-0.36\\%$ | | test_view_td | 0.1117ms | 53.5771μs | 18.6647 KOps/s | 19.0338 KOps/s | $\color{#d91a1a}-1.94\\%$ | | test_unbind_pytree | 0.1184ms | 37.3983μs | 26.7392 KOps/s | 26.8649 KOps/s | $\color{#d91a1a}-0.47\\%$ | | test_unbind_td | 0.3712ms | 45.9561μs | 21.7599 KOps/s | 21.7301 KOps/s | $\color{#35bf28}+0.14\\%$ | | test_split_pytree | 85.9700μs | 40.9186μs | 24.4388 KOps/s | 25.2693 KOps/s | $\color{#d91a1a}-3.29\\%$ | | test_split_td | 0.4702ms | 58.4422μs | 17.1109 KOps/s | 16.7449 KOps/s | $\color{#35bf28}+2.19\\%$ | | test_add_pytree | 97.2620μs | 49.4262μs | 20.2322 KOps/s | 20.9234 KOps/s | $\color{#d91a1a}-3.30\\%$ | | test_add_td | 0.1641ms 
| 84.4518μs | 11.8411 KOps/s | 11.4845 KOps/s | $\color{#35bf28}+3.10\\%$ | | test_compile_add_one_nested[tensordict-compile] | 0.1466ms | 54.3259μs | 18.4074 KOps/s | 19.2223 KOps/s | $\color{#d91a1a}-4.24\\%$ | | test_compile_add_one_nested[tensordict-eager] | 0.4639ms | 0.1900ms | 5.2634 KOps/s | 5.3822 KOps/s | $\color{#d91a1a}-2.21\\%$ | | test_compile_add_one_nested[pytree-compile] | 0.1440ms | 55.3241μs | 18.0753 KOps/s | 18.8869 KOps/s | $\color{#d91a1a}-4.30\\%$ | | test_compile_add_one_nested[pytree-eager] | 0.3369ms | 0.1512ms | 6.6134 KOps/s | 6.3910 KOps/s | $\color{#35bf28}+3.48\\%$ | | test_compile_copy_nested[tensordict-compile] | 63.4880μs | 20.2892μs | 49.2872 KOps/s | 50.2824 KOps/s | $\color{#d91a1a}-1.98\\%$ | | test_compile_copy_nested[tensordict-eager] | 0.1428ms | 65.9558μs | 15.1617 KOps/s | 15.5250 KOps/s | $\color{#d91a1a}-2.34\\%$ | | test_compile_copy_nested[pytree-compile] | 0.1569ms | 81.2239μs | 12.3117 KOps/s | 12.5360 KOps/s | $\color{#d91a1a}-1.79\\%$ | | test_compile_copy_nested[pytree-eager] | 0.1397ms | 73.5070μs | 13.6041 KOps/s | 13.6186 KOps/s | $\color{#d91a1a}-0.11\\%$ | | test_compile_add_one_flat[tensordict-compile] | 0.3635ms | 0.1752ms | 5.7090 KOps/s | 5.6787 KOps/s | $\color{#35bf28}+0.53\\%$ | | test_compile_add_one_flat[tensordict-eager] | 0.4288ms | 0.1928ms | 5.1864 KOps/s | 5.2263 KOps/s | $\color{#d91a1a}-0.76\\%$ | | test_compile_add_one_flat[tensorclass-compile] | 0.1053ms | 38.7734μs | 25.7909 KOps/s | 26.0493 KOps/s | $\color{#d91a1a}-0.99\\%$ | | test_compile_add_one_flat[tensorclass-eager] | 1.3090ms | 69.9935μs | 14.2870 KOps/s | 14.9603 KOps/s | $\color{#d91a1a}-4.50\\%$ | | test_compile_add_one_flat[pytree-compile] | 0.2458ms | 0.1766ms | 5.6619 KOps/s | 5.6646 KOps/s | $\color{#d91a1a}-0.05\\%$ | | test_compile_add_one_flat[pytree-eager] | 0.5731ms | 0.3043ms | 3.2865 KOps/s | 3.3877 KOps/s | $\color{#d91a1a}-2.99\\%$ | | test_compile_add_self_flat[tensordict-eager] | 0.3955ms | 0.2142ms | 4.6684 
KOps/s | 4.7922 KOps/s | $\color{#d91a1a}-2.58\\%$ | | test_compile_add_self_flat[tensordict-compile] | 0.3639ms | 0.1781ms | 5.6158 KOps/s | 5.6471 KOps/s | $\color{#d91a1a}-0.55\\%$ | | test_compile_add_self_flat[tensorclass-eager] | 0.7843ms | 63.0057μs | 15.8716 KOps/s | 16.0136 KOps/s | $\color{#d91a1a}-0.89\\%$ | | test_compile_add_self_flat[tensorclass-compile] | 0.1091ms | 40.5903μs | 24.6365 KOps/s | 26.1067 KOps/s | $\textbf{\color{#d91a1a}-5.63\\%}$ | | test_compile_add_self_flat[pytree-eager] | 0.4438ms | 0.2485ms | 4.0247 KOps/s | 4.0741 KOps/s | $\color{#d91a1a}-1.21\\%$ | | test_compile_add_self_flat[pytree-compile] | 0.2852ms | 0.1749ms | 5.7164 KOps/s | 5.6866 KOps/s | $\color{#35bf28}+0.52\\%$ | | test_compile_copy_flat[tensordict-compile] | 0.2379ms | 0.1090ms | 9.1772 KOps/s | 9.1977 KOps/s | $\color{#d91a1a}-0.22\\%$ | | test_compile_copy_flat[tensordict-eager] | 0.1290ms | 56.5284μs | 17.6902 KOps/s | 18.1030 KOps/s | $\color{#d91a1a}-2.28\\%$ | | test_compile_copy_flat[pytree-compile] | 0.1493ms | 79.4068μs | 12.5934 KOps/s | 12.4370 KOps/s | $\color{#35bf28}+1.26\\%$ | | test_compile_copy_flat[pytree-eager] | 0.1348ms | 70.3638μs | 14.2118 KOps/s | 13.7334 KOps/s | $\color{#35bf28}+3.48\\%$ | | test_compile_assign_and_add[tensordict-compile] | 0.2734ms | 0.1917ms | 5.2177 KOps/s | 5.1827 KOps/s | $\color{#35bf28}+0.67\\%$ | | test_compile_assign_and_add[tensordict-eager] | 1.9593ms | 1.6445ms | 608.0873 Ops/s | 603.1443 Ops/s | $\color{#35bf28}+0.82\\%$ | | test_compile_assign_and_add[pytree-compile] | 0.2922ms | 0.1880ms | 5.3205 KOps/s | 5.2607 KOps/s | $\color{#35bf28}+1.14\\%$ | | test_compile_assign_and_add[pytree-eager] | 1.4012ms | 1.1061ms | 904.0770 Ops/s | 901.1693 Ops/s | $\color{#35bf28}+0.32\\%$ | | test_compile_assign_and_add_stack[compile] | 0.5324ms | 0.4153ms | 2.4079 KOps/s | 2.3832 KOps/s | $\color{#35bf28}+1.04\\%$ | | test_compile_assign_and_add_stack[eager] | 5.1557ms | 3.8388ms | 260.4960 Ops/s | 255.2031 Ops/s | 
$\color{#35bf28}+2.07\\%$ | | test_compile_indexing[tensor-tensordict-compile] | 78.8570μs | 32.0882μs | 31.1641 KOps/s | 32.5648 KOps/s | $\color{#d91a1a}-4.30\\%$ | | test_compile_indexing[tensor-tensordict-eager] | 0.6057ms | 49.7573μs | 20.0975 KOps/s | 20.6018 KOps/s | $\color{#d91a1a}-2.45\\%$ | | test_compile_indexing[tensor-tensorclass-compile] | 73.1970μs | 28.3985μs | 35.2131 KOps/s | 35.9514 KOps/s | $\color{#d91a1a}-2.05\\%$ | | test_compile_indexing[tensor-tensorclass-eager] | 0.1034ms | 30.8843μs | 32.3789 KOps/s | 31.7968 KOps/s | $\color{#35bf28}+1.83\\%$ | | test_compile_indexing[tensor-pytree-compile] | 73.2360μs | 27.8043μs | 35.9657 KOps/s | 36.4497 KOps/s | $\color{#d91a1a}-1.33\\%$ | | test_compile_indexing[tensor-pytree-eager] | 0.1285ms | 30.4527μs | 32.8379 KOps/s | 32.7840 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_compile_indexing[slice-tensordict-compile] | 0.1479ms | 72.0738μs | 13.8747 KOps/s | 14.1599 KOps/s | $\color{#d91a1a}-2.01\\%$ | | test_compile_indexing[slice-tensordict-eager] | 0.5490ms | 28.5341μs | 35.0458 KOps/s | 35.3472 KOps/s | $\color{#d91a1a}-0.85\\%$ | | test_compile_indexing[slice-tensorclass-compile] | 0.1425ms | 67.7560μs | 14.7588 KOps/s | 15.0626 KOps/s | $\color{#d91a1a}-2.02\\%$ | | test_compile_indexing[slice-tensorclass-eager] | 68.1970μs | 24.7244μs | 40.4458 KOps/s | 41.6282 KOps/s | $\color{#d91a1a}-2.84\\%$ | | test_compile_indexing[slice-pytree-compile] | 0.1762ms | 67.8437μs | 14.7398 KOps/s | 14.9113 KOps/s | $\color{#d91a1a}-1.15\\%$ | | test_compile_indexing[slice-pytree-eager] | 3.6853ms | 24.7841μs | 40.3485 KOps/s | 41.0984 KOps/s | $\color{#d91a1a}-1.82\\%$ | | test_compile_indexing[int-tensordict-compile] | 0.1536ms | 71.9094μs | 13.9064 KOps/s | 14.2128 KOps/s | $\color{#d91a1a}-2.16\\%$ | | test_compile_indexing[int-tensordict-eager] | 0.6388ms | 28.2230μs | 35.4321 KOps/s | 35.1046 KOps/s | $\color{#35bf28}+0.93\\%$ | | test_compile_indexing[int-tensorclass-compile] | 0.1469ms | 67.4197μs | 
14.8325 KOps/s | 15.1063 KOps/s | $\color{#d91a1a}-1.81\\%$ | | test_compile_indexing[int-tensorclass-eager] | 93.7220μs | 24.2668μs | 41.2085 KOps/s | 41.8883 KOps/s | $\color{#d91a1a}-1.62\\%$ | | test_compile_indexing[int-pytree-compile] | 0.1585ms | 67.8037μs | 14.7485 KOps/s | 15.0501 KOps/s | $\color{#d91a1a}-2.00\\%$ | | test_compile_indexing[int-pytree-eager] | 65.6630μs | 24.5073μs | 40.8041 KOps/s | 40.4183 KOps/s | $\color{#35bf28}+0.95\\%$ | | test_mod_add[eager] | 75.1200μs | 25.2904μs | 39.5407 KOps/s | 40.5030 KOps/s | $\color{#d91a1a}-2.38\\%$ | | test_mod_add[compile] | 0.1006ms | 36.5376μs | 27.3691 KOps/s | 28.5024 KOps/s | $\color{#d91a1a}-3.98\\%$ | | test_mod_add[compile-overhead] | 91.2200μs | 36.7796μs | 27.1890 KOps/s | 27.7520 KOps/s | $\color{#d91a1a}-2.03\\%$ | | test_mod_wrap[eager] | 0.3141ms | 0.2012ms | 4.9693 KOps/s | 4.8462 KOps/s | $\color{#35bf28}+2.54\\%$ | | test_mod_wrap[compile] | 1.5340ms | 0.2250ms | 4.4443 KOps/s | 4.3268 KOps/s | $\color{#35bf28}+2.72\\%$ | | test_mod_wrap[compile-overhead] | 0.4467ms | 0.2235ms | 4.4750 KOps/s | 4.3884 KOps/s | $\color{#35bf28}+1.97\\%$ | | test_mod_wrap_and_backward[eager] | 12.1663ms | 10.8690ms | 92.0048 Ops/s | 87.5151 Ops/s | $\textbf{\color{#35bf28}+5.13\\%}$ | | test_mod_wrap_and_backward[compile] | 12.2193ms | 10.9854ms | 91.0300 Ops/s | 85.8012 Ops/s | $\textbf{\color{#35bf28}+6.09\\%}$ | | test_mod_wrap_and_backward[compile-overhead] | 12.1795ms | 10.9084ms | 91.6721 Ops/s | 85.4932 Ops/s | $\textbf{\color{#35bf28}+7.23\\%}$ | | test_seq_add[eager] | 0.1730ms | 84.9566μs | 11.7707 KOps/s | 11.3478 KOps/s | $\color{#35bf28}+3.73\\%$ | | test_seq_add[compile] | 0.1570ms | 59.7624μs | 16.7329 KOps/s | 16.8101 KOps/s | $\color{#d91a1a}-0.46\\%$ | | test_seq_add[compile-overhead] | 0.1562ms | 59.6049μs | 16.7772 KOps/s | 17.2337 KOps/s | $\color{#d91a1a}-2.65\\%$ | | test_seq_wrap[eager] | 0.5479ms | 0.3663ms | 2.7300 KOps/s | 2.7071 KOps/s | $\color{#35bf28}+0.85\\%$ | | 
test_seq_wrap[compile] | 0.4047ms | 0.2588ms | 3.8646 KOps/s | 3.8242 KOps/s | $\color{#35bf28}+1.06\\%$ | | test_seq_wrap[compile-overhead] | 0.5016ms | 0.2621ms | 3.8152 KOps/s | 3.8141 KOps/s | $\color{#35bf28}+0.03\\%$ | | test_func_call_runtime[False-eager] | 0.7028ms | 0.5096ms | 1.9622 KOps/s | 1.9666 KOps/s | $\color{#d91a1a}-0.22\\%$ | | test_func_call_runtime[False-compile] | 0.8582ms | 0.4929ms | 2.0288 KOps/s | 1.9723 KOps/s | $\color{#35bf28}+2.87\\%$ | | test_func_call_runtime[False-compile-overhead] | 0.8532ms | 0.4913ms | 2.0355 KOps/s | 2.0040 KOps/s | $\color{#35bf28}+1.57\\%$ | | test_func_call_runtime[True-eager] | 1.2966ms | 0.8166ms | 1.2246 KOps/s | 1.2240 KOps/s | $\color{#35bf28}+0.05\\%$ | | test_func_call_runtime[True-compile] | 0.9914ms | 0.5087ms | 1.9660 KOps/s | 1.9310 KOps/s | $\color{#35bf28}+1.81\\%$ | | test_func_call_runtime[True-compile-overhead] | 0.9357ms | 0.5071ms | 1.9721 KOps/s | 1.9055 KOps/s | $\color{#35bf28}+3.49\\%$ | | test_distributed | 0.2797ms | 0.1311ms | 7.6277 KOps/s | 7.4951 KOps/s | $\color{#35bf28}+1.77\\%$ | | test_tdmodule | 82.9650μs | 16.5096μs | 60.5709 KOps/s | 59.0880 KOps/s | $\color{#35bf28}+2.51\\%$ | | test_tdmodule_dispatch | 71.9440μs | 34.8383μs | 28.7041 KOps/s | 27.3812 KOps/s | $\color{#35bf28}+4.83\\%$ | | test_tdseq | 33.1010μs | 18.5916μs | 53.7877 KOps/s | 51.9667 KOps/s | $\color{#35bf28}+3.50\\%$ | | test_tdseq_dispatch | 70.2600μs | 38.3816μs | 26.0542 KOps/s | 24.2186 KOps/s | $\textbf{\color{#35bf28}+7.58\\%}$ | | test_instantiation_functorch | 1.8216ms | 1.6342ms | 611.9388 Ops/s | 603.4810 Ops/s | $\color{#35bf28}+1.40\\%$ | | test_instantiation_td | 1.8577ms | 1.1801ms | 847.3721 Ops/s | 844.1625 Ops/s | $\color{#35bf28}+0.38\\%$ | | test_exec_functorch | 0.3226ms | 0.1778ms | 5.6247 KOps/s | 5.7156 KOps/s | $\color{#d91a1a}-1.59\\%$ | | test_exec_functional_call | 0.3279ms | 0.1670ms | 5.9884 KOps/s | 6.0262 KOps/s | $\color{#d91a1a}-0.63\\%$ | | test_exec_td | 0.4174ms | 
0.1674ms | 5.9733 KOps/s | 5.9913 KOps/s | $\color{#d91a1a}-0.30\\%$ | | test_exec_td_decorator | 1.1150ms | 0.2547ms | 3.9268 KOps/s | 4.0282 KOps/s | $\color{#d91a1a}-2.52\\%$ | | test_vmap_mlp_speed[True-True] | 0.9030ms | 0.5900ms | 1.6949 KOps/s | 1.6848 KOps/s | $\color{#35bf28}+0.60\\%$ | | test_vmap_mlp_speed[True-False] | 0.8654ms | 0.5868ms | 1.7041 KOps/s | 1.7073 KOps/s | $\color{#d91a1a}-0.19\\%$ | | test_vmap_mlp_speed[False-True] | 0.7752ms | 0.4836ms | 2.0680 KOps/s | 2.0768 KOps/s | $\color{#d91a1a}-0.42\\%$ | | test_vmap_mlp_speed[False-False] | 1.0224ms | 0.4847ms | 2.0633 KOps/s | 2.0642 KOps/s | $\color{#d91a1a}-0.04\\%$ | | test_vmap_mlp_speed_decorator[True-True] | 1.4381ms | 0.6858ms | 1.4582 KOps/s | 1.4604 KOps/s | $\color{#d91a1a}-0.15\\%$ | | test_vmap_mlp_speed_decorator[True-False] | 0.9856ms | 0.6809ms | 1.4687 KOps/s | 1.4718 KOps/s | $\color{#d91a1a}-0.21\\%$ | | test_vmap_mlp_speed_decorator[False-True] | 0.8975ms | 0.5695ms | 1.7560 KOps/s | 1.7665 KOps/s | $\color{#d91a1a}-0.60\\%$ | | test_vmap_mlp_speed_decorator[False-False] | 0.8080ms | 0.5619ms | 1.7797 KOps/s | 1.7774 KOps/s | $\color{#35bf28}+0.13\\%$ | | test_to_module_speed[True] | 2.5569ms | 1.7886ms | 559.0824 Ops/s | 541.0632 Ops/s | $\color{#35bf28}+3.33\\%$ | | test_to_module_speed[False] | 2.3988ms | 1.7473ms | 572.3114 Ops/s | 555.4801 Ops/s | $\color{#35bf28}+3.03\\%$ | | test_tc_init | 0.1092ms | 43.6874μs | 22.8899 KOps/s | 21.1356 KOps/s | $\textbf{\color{#35bf28}+8.30\\%}$ | | test_tc_init_nested | 0.1538ms | 88.3932μs | 11.3131 KOps/s | 10.7235 KOps/s | $\textbf{\color{#35bf28}+5.50\\%}$ | | test_tc_first_layer_tensor | 25.6880μs | 1.4629μs | 683.5740 KOps/s | 681.1701 KOps/s | $\color{#35bf28}+0.35\\%$ | | test_tc_first_layer_nontensor | 34.4540μs | 4.2889μs | 233.1598 KOps/s | 232.3402 KOps/s | $\color{#35bf28}+0.35\\%$ | | test_tc_second_layer_tensor | 52.6180μs | 2.6805μs | 373.0644 KOps/s | 359.9128 KOps/s | $\color{#35bf28}+3.65\\%$ | | 
test_tc_second_layer_nontensor | 31.6790μs | 5.4989μs | 181.8545 KOps/s | 178.9318 KOps/s | $\color{#35bf28}+1.63\\%$ | | test_unbind | 0.4488s | 13.9811ms | 71.5252 Ops/s | 76.4438 Ops/s | $\textbf{\color{#d91a1a}-6.43\\%}$ | | test_full_like | 9.1341ms | 7.6510ms | 130.7026 Ops/s | 135.4509 Ops/s | $\color{#d91a1a}-3.51\\%$ | | test_zeros_like | 3.6028ms | 2.9880ms | 334.6757 Ops/s | 158.8780 Ops/s | $\textbf{\color{#35bf28}+110.65\\%}$ | | test_ones_like | 3.8305ms | 3.4338ms | 291.2208 Ops/s | 134.1630 Ops/s | $\textbf{\color{#35bf28}+117.06\\%}$ | | test_clone | 6.1601ms | 5.3875ms | 185.6146 Ops/s | 108.0878 Ops/s | $\textbf{\color{#35bf28}+71.73\\%}$ | | test_squeeze | 66.5440μs | 13.2488μs | 75.4785 KOps/s | 74.2152 KOps/s | $\color{#35bf28}+1.70\\%$ | | test_unsqueeze | 0.1947ms | 94.3008μs | 10.6044 KOps/s | 10.8408 KOps/s | $\color{#d91a1a}-2.18\\%$ | | test_split | 0.5164ms | 0.2028ms | 4.9307 KOps/s | 5.0158 KOps/s | $\color{#d91a1a}-1.70\\%$ | | test_permute | 0.3574ms | 0.2199ms | 4.5479 KOps/s | 4.5297 KOps/s | $\color{#35bf28}+0.40\\%$ | | test_stack | 28.6270ms | 25.5388ms | 39.1562 Ops/s | 38.8821 Ops/s | $\color{#35bf28}+0.70\\%$ | | test_cat | 28.9697ms | 25.2547ms | 39.5966 Ops/s | 39.8943 Ops/s | $\color{#d91a1a}-0.75\\%$ |
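
For readers checking the table by hand: the Change column appears consistent with the relative difference between the Ops column and the Ops on Repo `HEAD` column. The bold entries, and the Improved/Worsened totals, seem to correspond to changes of at least 5% in either direction. That threshold is inferred from the rows above and is not stated in the bot's comment, so treat it as an assumption. The function names below are hypothetical and are not part of the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Return the throughput change of the PR relative to HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on HEAD) pairs copied from the CPU table, all in KOps/s.
rows = {
    "test_plain_set_nested": (46.5280, 44.2512),
    "test_plain_set_stack_nested": (45.4406, 43.3144),
    "test_items": (363.3841, 389.0732),
    "test_unlock_nested": (1.9954, 2.4122),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both columns must be in the same unit before comparing; the table mixes Ops/s, KOps/s, and MOps/s across rows, although each row uses one unit for both values.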
github-actions[bot] commented 3 months ago
⚠ Warning: Result of GPU Benchmark Tests
Total Benchmarks: 219. Improved: 15. Worsened: 19.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change | | -------------------------------------------------- | --------- | --------- | --------------- | ------------------ | ----------------------------------- | | test_plain_set_nested | 0.1551ms | 16.7218μs | 59.8022 KOps/s | 62.1944 KOps/s | $\color{#d91a1a}-3.85\\%$ | | test_plain_set_stack_nested | 36.4300μs | 16.9582μs | 58.9685 KOps/s | 62.6136 KOps/s | $\textbf{\color{#d91a1a}-5.82\\%}$ | | test_plain_set_nested_inplace | 37.7910μs | 18.0042μs | 55.5427 KOps/s | 58.3109 KOps/s | $\color{#d91a1a}-4.75\\%$ | | test_plain_set_stack_nested_inplace | 40.4310μs | 17.8369μs | 56.0636 KOps/s | 59.2201 KOps/s | $\textbf{\color{#d91a1a}-5.33\\%}$ | | test_items | 26.1400μs | 4.5980μs | 217.4874 KOps/s | 215.7543 KOps/s | $\color{#35bf28}+0.80\\%$ | | test_items_nested | 0.4148ms | 0.3613ms | 2.7674 KOps/s | 2.7382 KOps/s | $\color{#35bf28}+1.07\\%$ | | test_items_nested_locked | 0.5310ms | 0.3690ms | 2.7098 KOps/s | 2.7212 KOps/s | $\color{#d91a1a}-0.42\\%$ | | test_items_nested_leaf | 0.1103ms | 84.2938μs | 11.8633 KOps/s | 11.9728 KOps/s | $\color{#d91a1a}-0.91\\%$ | | test_items_stack_nested | 0.4120ms | 0.3643ms | 2.7451 KOps/s | 2.6770 KOps/s | $\color{#35bf28}+2.55\\%$ | | test_items_stack_nested_leaf | 0.1074ms | 84.3118μs | 11.8607 KOps/s | 11.8106 KOps/s | $\color{#35bf28}+0.42\\%$ | | test_items_stack_nested_locked | 0.4383ms | 0.3671ms | 2.7242 KOps/s | 2.7097 KOps/s | $\color{#35bf28}+0.53\\%$ | | test_keys | 19.2800μs | 4.3485μs | 229.9628 KOps/s | 207.1721 KOps/s | $\textbf{\color{#35bf28}+11.00\\%}$ | | test_keys_nested | 96.1820μs | 67.2044μs | 14.8800 KOps/s | 15.2589 KOps/s | $\color{#d91a1a}-2.48\\%$ | | test_keys_nested_locked | 0.7563ms | 72.8657μs | 13.7239 KOps/s | 13.8879 KOps/s | $\color{#d91a1a}-1.18\\%$ | | test_keys_nested_leaf | 80.8710μs | 56.9937μs | 17.5458 KOps/s | 17.2749 KOps/s | $\color{#35bf28}+1.57\\%$ | | test_keys_stack_nested | 89.1620μs | 66.5677μs | 15.0223 KOps/s | 14.8487 
KOps/s | $\color{#35bf28}+1.17\\%$ | | test_keys_stack_nested_leaf | 73.4010μs | 57.6344μs | 17.3508 KOps/s | 17.7176 KOps/s | $\color{#d91a1a}-2.07\\%$ | | test_keys_stack_nested_locked | 98.3720μs | 71.6289μs | 13.9609 KOps/s | 13.7650 KOps/s | $\color{#35bf28}+1.42\\%$ | | test_values | 8.8603μs | 1.7688μs | 565.3609 KOps/s | 568.8456 KOps/s | $\color{#d91a1a}-0.61\\%$ | | test_values_nested | 50.0910μs | 33.8393μs | 29.5515 KOps/s | 29.5891 KOps/s | $\color{#d91a1a}-0.13\\%$ | | test_values_nested_locked | 64.3410μs | 35.5983μs | 28.0912 KOps/s | 27.8516 KOps/s | $\color{#35bf28}+0.86\\%$ | | test_values_nested_leaf | 45.3800μs | 29.9675μs | 33.3695 KOps/s | 32.9063 KOps/s | $\color{#35bf28}+1.41\\%$ | | test_values_stack_nested | 81.9020μs | 34.4286μs | 29.0456 KOps/s | 28.9414 KOps/s | $\color{#35bf28}+0.36\\%$ | | test_values_stack_nested_leaf | 54.6100μs | 30.7867μs | 32.4816 KOps/s | 32.2770 KOps/s | $\color{#35bf28}+0.63\\%$ | | test_values_stack_nested_locked | 56.7510μs | 36.2364μs | 27.5966 KOps/s | 27.5376 KOps/s | $\color{#35bf28}+0.21\\%$ | | test_membership | 1.5020μs | 0.5547μs | 1.8027 MOps/s | 1.8020 MOps/s | $\color{#35bf28}+0.04\\%$ | | test_membership_nested | 16.8410μs | 1.9538μs | 511.8346 KOps/s | 517.5518 KOps/s | $\color{#d91a1a}-1.10\\%$ | | test_membership_nested_leaf | 10.0450μs | 1.9115μs | 523.1438 KOps/s | 517.1357 KOps/s | $\color{#35bf28}+1.16\\%$ | | test_membership_stacked_nested | 23.7810μs | 1.9724μs | 507.0037 KOps/s | 502.9258 KOps/s | $\color{#35bf28}+0.81\\%$ | | test_membership_stacked_nested_leaf | 20.0500μs | 1.9433μs | 514.5994 KOps/s | 498.6715 KOps/s | $\color{#35bf28}+3.19\\%$ | | test_membership_nested_last | 17.9410μs | 2.8678μs | 348.6947 KOps/s | 349.5427 KOps/s | $\color{#d91a1a}-0.24\\%$ | | test_membership_nested_leaf_last | 21.2500μs | 2.8824μs | 346.9317 KOps/s | 350.7807 KOps/s | $\color{#d91a1a}-1.10\\%$ | | test_membership_stacked_nested_last | 32.6710μs | 9.0601μs | 110.3739 KOps/s | 347.7915 KOps/s | 
$\textbf{\color{#d91a1a}-68.26\\%}$ | | test_membership_stacked_nested_leaf_last | 31.0300μs | 9.0455μs | 110.5523 KOps/s | 353.7228 KOps/s | $\textbf{\color{#d91a1a}-68.75\\%}$ | | test_nested_getleaf | 24.9300μs | 7.8955μs | 126.6550 KOps/s | 126.0875 KOps/s | $\color{#35bf28}+0.45\\%$ | | test_nested_get | 30.0010μs | 7.4729μs | 133.8173 KOps/s | 134.9984 KOps/s | $\color{#d91a1a}-0.87\\%$ | | test_stacked_getleaf | 25.6300μs | 7.9664μs | 125.5270 KOps/s | 126.2220 KOps/s | $\color{#d91a1a}-0.55\\%$ | | test_stacked_get | 20.8910μs | 7.4534μs | 134.1663 KOps/s | 134.2228 KOps/s | $\color{#d91a1a}-0.04\\%$ | | test_nested_getitemleaf | 23.9310μs | 8.0625μs | 124.0305 KOps/s | 124.1710 KOps/s | $\color{#d91a1a}-0.11\\%$ | | test_nested_getitem | 22.5300μs | 7.6041μs | 131.5087 KOps/s | 132.4049 KOps/s | $\color{#d91a1a}-0.68\\%$ | | test_stacked_getitemleaf | 23.2510μs | 8.1120μs | 123.2736 KOps/s | 124.0051 KOps/s | $\color{#d91a1a}-0.59\\%$ | | test_stacked_getitem | 23.2100μs | 7.6251μs | 131.1459 KOps/s | 132.2297 KOps/s | $\color{#d91a1a}-0.82\\%$ | | test_lock_nested | 9.9804ms | 0.4741ms | 2.1093 KOps/s | 2.1078 KOps/s | $\color{#35bf28}+0.07\\%$ | | test_lock_stack_nested | 0.4753ms | 0.4199ms | 2.3816 KOps/s | 2.2728 KOps/s | $\color{#35bf28}+4.79\\%$ | | test_unlock_nested | 0.8844ms | 0.3842ms | 2.6029 KOps/s | 2.5183 KOps/s | $\color{#35bf28}+3.36\\%$ | | test_unlock_stack_nested | 0.4109ms | 0.3377ms | 2.9614 KOps/s | 2.7756 KOps/s | $\textbf{\color{#35bf28}+6.69\\%}$ | | test_flatten_speed | 0.4863ms | 0.1044ms | 9.5751 KOps/s | 9.6689 KOps/s | $\color{#d91a1a}-0.97\\%$ | | test_unflatten_speed | 0.3260ms | 0.2885ms | 3.4658 KOps/s | 3.4756 KOps/s | $\color{#d91a1a}-0.28\\%$ | | test_common_ops | 1.5293ms | 1.2879ms | 776.4681 Ops/s | 784.9529 Ops/s | $\color{#d91a1a}-1.08\\%$ | | test_creation | 16.6900μs | 1.6198μs | 617.3540 KOps/s | 607.1372 KOps/s | $\color{#35bf28}+1.68\\%$ | | test_creation_empty | 38.9600μs | 16.6635μs | 60.0114 KOps/s | 
65.9094 KOps/s | $\textbf{\color{#d91a1a}-8.95\\%}$ | | test_creation_nested_1 | 40.7100μs | 18.6111μs | 53.7313 KOps/s | 58.9350 KOps/s | $\textbf{\color{#d91a1a}-8.83\\%}$ | | test_creation_nested_2 | 54.0610μs | 21.1410μs | 47.3015 KOps/s | 50.6274 KOps/s | $\textbf{\color{#d91a1a}-6.57\\%}$ | | test_clone | 56.7510μs | 29.2682μs | 34.1667 KOps/s | 32.1618 KOps/s | $\textbf{\color{#35bf28}+6.23\\%}$ | | test_getitem[int] | 1.1221ms | 16.6602μs | 60.0233 KOps/s | 58.1413 KOps/s | $\color{#35bf28}+3.24\\%$ | | test_getitem[slice_int] | 0.1588ms | 28.2825μs | 35.3576 KOps/s | 33.5627 KOps/s | $\textbf{\color{#35bf28}+5.35\\%}$ | | test_getitem[range] | 0.2362ms | 0.1148ms | 8.7112 KOps/s | 8.5156 KOps/s | $\color{#35bf28}+2.30\\%$ | | test_getitem[tuple] | 91.0976ms | 31.0816μs | 32.1734 KOps/s | 39.5523 KOps/s | $\textbf{\color{#d91a1a}-18.66\\%}$ | | test_getitem[list] | 0.2152ms | 0.1056ms | 9.4740 KOps/s | 9.3939 KOps/s | $\color{#35bf28}+0.85\\%$ | | test_setitem_dim[int] | 83.5120μs | 52.8385μs | 18.9256 KOps/s | 18.9036 KOps/s | $\color{#35bf28}+0.12\\%$ | | test_setitem_dim[slice_int] | 0.1241ms | 77.5009μs | 12.9031 KOps/s | 13.0038 KOps/s | $\color{#d91a1a}-0.77\\%$ | | test_setitem_dim[range] | 0.1764ms | 0.1424ms | 7.0240 KOps/s | 6.9123 KOps/s | $\color{#35bf28}+1.62\\%$ | | test_setitem_dim[tuple] | 96.0320μs | 70.4988μs | 14.1846 KOps/s | 14.3539 KOps/s | $\color{#d91a1a}-1.18\\%$ | | test_setitem | 79.8410μs | 42.7447μs | 23.3947 KOps/s | 22.9241 KOps/s | $\color{#35bf28}+2.05\\%$ | | test_set | 86.4010μs | 41.9909μs | 23.8147 KOps/s | 23.3704 KOps/s | $\color{#35bf28}+1.90\\%$ | | test_set_shared | 0.3721ms | 53.0707μs | 18.8428 KOps/s | 18.3674 KOps/s | $\color{#35bf28}+2.59\\%$ | | test_update | 90.7020μs | 51.1065μs | 19.5670 KOps/s | 19.8687 KOps/s | $\color{#d91a1a}-1.52\\%$ | | test_update_nested | 96.3310μs | 58.1163μs | 17.2069 KOps/s | 17.3641 KOps/s | $\color{#d91a1a}-0.91\\%$ | | test_update__nested | 0.1166ms | 60.1426μs | 16.6272 
KOps/s | 15.8675 KOps/s | $\color{#35bf28}+4.79\\%$ | | test_set_nested | 74.4410μs | 44.1515μs | 22.6493 KOps/s | 22.5500 KOps/s | $\color{#35bf28}+0.44\\%$ | | test_set_nested_new | 0.5473ms | 47.7510μs | 20.9420 KOps/s | 20.8711 KOps/s | $\color{#35bf28}+0.34\\%$ | | test_select | 0.1044ms | 61.9463μs | 16.1430 KOps/s | 15.9293 KOps/s | $\color{#35bf28}+1.34\\%$ | | test_select_nested | 0.1267ms | 51.5837μs | 19.3860 KOps/s | 19.8604 KOps/s | $\color{#d91a1a}-2.39\\%$ | | test_exclude_nested | 93.6120μs | 68.2738μs | 14.6469 KOps/s | 14.5136 KOps/s | $\color{#35bf28}+0.92\\%$ | | test_empty[True] | 0.3797ms | 0.2795ms | 3.5783 KOps/s | 3.5234 KOps/s | $\color{#35bf28}+1.56\\%$ | | test_empty[False] | 1.6510μs | 0.8578μs | 1.1658 MOps/s | 1.1514 MOps/s | $\color{#35bf28}+1.25\\%$ | | test_to | 66.3310μs | 36.7096μs | 27.2408 KOps/s | 27.0848 KOps/s | $\color{#35bf28}+0.58\\%$ | | test_to_nonblocking | 44.6710μs | 23.0794μs | 43.3287 KOps/s | 44.1680 KOps/s | $\color{#d91a1a}-1.90\\%$ | | test_unbind_speed | 0.6918ms | 0.2944ms | 3.3971 KOps/s | 3.2934 KOps/s | $\color{#35bf28}+3.15\\%$ | | test_unbind_speed_stack0 | 0.3424ms | 0.2882ms | 3.4692 KOps/s | 3.3038 KOps/s | $\textbf{\color{#35bf28}+5.01\\%}$ | | test_unbind_speed_stack1 | 0.7357ms | 0.6836ms | 1.4629 KOps/s | 1.2902 KOps/s | $\textbf{\color{#35bf28}+13.38\\%}$ | | test_split | 92.4889ms | 2.3098ms | 432.9313 Ops/s | 427.4994 Ops/s | $\color{#35bf28}+1.27\\%$ | | test_chunk | 92.7059ms | 2.3280ms | 429.5612 Ops/s | 424.4583 Ops/s | $\color{#35bf28}+1.20\\%$ | | test_creation[device0] | 0.1569ms | 0.1046ms | 9.5630 KOps/s | 9.2673 KOps/s | $\color{#35bf28}+3.19\\%$ | | test_creation_from_tensor | 0.2514ms | 0.1015ms | 9.8510 KOps/s | 9.8593 KOps/s | $\color{#d91a1a}-0.08\\%$ | | test_add_one[memmap_tensor0] | 66.8710μs | 9.1199μs | 109.6500 KOps/s | 103.7806 KOps/s | $\textbf{\color{#35bf28}+5.66\\%}$ | | test_contiguous[memmap_tensor0] | 19.5410μs | 2.1482μs | 465.5011 KOps/s | 460.3787 KOps/s | 
$\color{#35bf28}+1.11\\%$ | | test_stack[memmap_tensor0] | 25.3710μs | 6.4154μs | 155.8745 KOps/s | 148.3442 KOps/s | $\textbf{\color{#35bf28}+5.08\\%}$ | | test_memmaptd_index | 1.1425ms | 0.4254ms | 2.3505 KOps/s | 2.2928 KOps/s | $\color{#35bf28}+2.52\\%$ | | test_memmaptd_index_astensor | 0.8717ms | 0.4876ms | 2.0507 KOps/s | 2.0005 KOps/s | $\color{#35bf28}+2.50\\%$ | | test_memmaptd_index_op | 1.4922ms | 1.0455ms | 956.5008 Ops/s | 951.4856 Ops/s | $\color{#35bf28}+0.53\\%$ | | test_serialize_model | 0.1932s | 0.1078s | 9.2775 Ops/s | 10.1826 Ops/s | $\textbf{\color{#d91a1a}-8.89\\%}$ | | test_serialize_model_pickle | 1.3495s | 1.2370s | 0.8084 Ops/s | 0.8078 Ops/s | $\color{#35bf28}+0.07\\%$ | | test_serialize_weights | 95.8894ms | 92.4664ms | 10.8147 Ops/s | 9.3590 Ops/s | $\textbf{\color{#35bf28}+15.55\\%}$ | | test_serialize_weights_returnearly | 0.2769s | 86.1648ms | 11.6057 Ops/s | 11.5450 Ops/s | $\color{#35bf28}+0.53\\%$ | | test_serialize_weights_pickle | 1.3502s | 1.2366s | 0.8087 Ops/s | 0.8083 Ops/s | $\color{#35bf28}+0.04\\%$ | | test_reshape_pytree | 65.4610μs | 37.4809μs | 26.6803 KOps/s | 25.9869 KOps/s | $\color{#35bf28}+2.67\\%$ | | test_reshape_td | 72.4510μs | 43.6641μs | 22.9021 KOps/s | 22.8953 KOps/s | $\color{#35bf28}+0.03\\%$ | | test_view_pytree | 70.4310μs | 36.7150μs | 27.2368 KOps/s | 26.5113 KOps/s | $\color{#35bf28}+2.74\\%$ | | test_view_td | 84.8110μs | 48.9427μs | 20.4321 KOps/s | 20.2002 KOps/s | $\color{#35bf28}+1.15\\%$ | | test_unbind_pytree | 71.2310μs | 36.4789μs | 27.4131 KOps/s | 26.7075 KOps/s | $\color{#35bf28}+2.64\\%$ | | test_unbind_td | 0.4415ms | 45.0157μs | 22.2145 KOps/s | 21.8709 KOps/s | $\color{#35bf28}+1.57\\%$ | | test_split_pytree | 80.3910μs | 49.6750μs | 20.1309 KOps/s | 19.3095 KOps/s | $\color{#35bf28}+4.25\\%$ | | test_split_td | 0.1961ms | 57.7314μs | 17.3216 KOps/s | 15.7819 KOps/s | $\textbf{\color{#35bf28}+9.76\\%}$ | | test_add_pytree | 0.1496ms | 59.2850μs | 16.8677 KOps/s | 16.3521 KOps/s | 
$\color{#35bf28}+3.15\\%$ | | test_add_td | 0.2923ms | 96.1776μs | 10.3974 KOps/s | 10.9434 KOps/s | $\color{#d91a1a}-4.99\\%$ | | test_compile_add_one_nested[tensordict-compile] | 0.4192ms | 0.2102ms | 4.7580 KOps/s | 4.6541 KOps/s | $\color{#35bf28}+2.23\\%$ | | test_compile_add_one_nested[tensordict-eager] | 0.2609ms | 0.1733ms | 5.7698 KOps/s | 5.8203 KOps/s | $\color{#d91a1a}-0.87\\%$ | | test_compile_add_one_nested[pytree-compile] | 0.1917ms | 0.1451ms | 6.8913 KOps/s | 6.8185 KOps/s | $\color{#35bf28}+1.07\\%$ | | test_compile_add_one_nested[pytree-eager] | 0.2515ms | 0.1945ms | 5.1420 KOps/s | 5.0520 KOps/s | $\color{#35bf28}+1.78\\%$ | | test_compile_copy_nested[tensordict-compile] | 45.4700μs | 22.3056μs | 44.8317 KOps/s | 44.5926 KOps/s | $\color{#35bf28}+0.54\\%$ | | test_compile_copy_nested[tensordict-eager] | 76.8210μs | 48.6001μs | 20.5761 KOps/s | 20.6307 KOps/s | $\color{#d91a1a}-0.26\\%$ | | test_compile_copy_nested[pytree-compile] | 0.1119ms | 72.9934μs | 13.6999 KOps/s | 13.7686 KOps/s | $\color{#d91a1a}-0.50\\%$ | | test_compile_copy_nested[pytree-eager] | 93.1310μs | 60.0474μs | 16.6535 KOps/s | 16.8233 KOps/s | $\color{#d91a1a}-1.01\\%$ | | test_compile_add_one_flat[tensordict-compile] | 0.3934ms | 0.3260ms | 3.0676 KOps/s | 3.0402 KOps/s | $\color{#35bf28}+0.90\\%$ | | test_compile_add_one_flat[tensordict-eager] | 0.3203ms | 0.2254ms | 4.4358 KOps/s | 4.4876 KOps/s | $\color{#d91a1a}-1.15\\%$ | | test_compile_add_one_flat[tensorclass-compile] | 0.1684ms | 0.1301ms | 7.6861 KOps/s | 7.6662 KOps/s | $\color{#35bf28}+0.26\\%$ | | test_compile_add_one_flat[tensorclass-eager] | 0.1324ms | 62.1774μs | 16.0830 KOps/s | 16.0876 KOps/s | $\color{#d91a1a}-0.03\\%$ | | test_compile_add_one_flat[pytree-compile] | 0.3863ms | 0.3241ms | 3.0853 KOps/s | 3.0427 KOps/s | $\color{#35bf28}+1.40\\%$ | | test_compile_add_one_flat[pytree-eager] | 0.7805ms | 0.6371ms | 1.5696 KOps/s | 1.5125 KOps/s | $\color{#35bf28}+3.78\\%$ | | 
test_compile_add_self_flat[tensordict-eager] | 0.3816ms | 0.2741ms | 3.6482 KOps/s | 3.6631 KOps/s | $\color{#d91a1a}-0.41\\%$ | | test_compile_add_self_flat[tensordict-compile] | 0.5808ms | 0.3282ms | 3.0467 KOps/s | 3.0208 KOps/s | $\color{#35bf28}+0.86\\%$ | | test_compile_add_self_flat[tensorclass-eager] | 0.2069ms | 73.6240μs | 13.5825 KOps/s | 13.0110 KOps/s | $\color{#35bf28}+4.39\\%$ | | test_compile_add_self_flat[tensorclass-compile] | 0.2021ms | 0.1299ms | 7.6986 KOps/s | 7.5713 KOps/s | $\color{#35bf28}+1.68\\%$ | | test_compile_add_self_flat[pytree-eager] | 0.5996ms | 0.5343ms | 1.8715 KOps/s | 1.7553 KOps/s | $\textbf{\color{#35bf28}+6.62\\%}$ | | test_compile_add_self_flat[pytree-compile] | 0.3792ms | 0.3243ms | 3.0835 KOps/s | 3.0502 KOps/s | $\color{#35bf28}+1.09\\%$ | | test_compile_copy_flat[tensordict-compile] | 41.8600μs | 18.5808μs | 53.8189 KOps/s | 52.8709 KOps/s | $\color{#35bf28}+1.79\\%$ | | test_compile_copy_flat[tensordict-eager] | 54.9400μs | 31.8196μs | 31.4272 KOps/s | 30.2901 KOps/s | $\color{#35bf28}+3.75\\%$ | | test_compile_copy_flat[pytree-compile] | 0.1113ms | 76.4919μs | 13.0733 KOps/s | 13.0455 KOps/s | $\color{#35bf28}+0.21\\%$ | | test_compile_copy_flat[pytree-eager] | 86.7720μs | 60.6406μs | 16.4906 KOps/s | 16.4789 KOps/s | $\color{#35bf28}+0.07\\%$ | | test_compile_assign_and_add[tensordict-compile] | 2.5100ms | 0.9230ms | 1.0835 KOps/s | 1.0682 KOps/s | $\color{#35bf28}+1.43\\%$ | | test_compile_assign_and_add[tensordict-eager] | 3.7062ms | 3.3975ms | 294.3333 Ops/s | 295.1507 Ops/s | $\color{#d91a1a}-0.28\\%$ | | test_compile_assign_and_add[pytree-compile] | 2.4914ms | 0.9116ms | 1.0969 KOps/s | 1.0903 KOps/s | $\color{#35bf28}+0.61\\%$ | | test_compile_assign_and_add[pytree-eager] | 3.4983ms | 3.2782ms | 305.0418 Ops/s | 290.5628 Ops/s | $\color{#35bf28}+4.98\\%$ | | test_compile_indexing[tensor-tensordict-compile] | 0.1624ms | 0.1097ms | 9.1120 KOps/s | 8.7826 KOps/s | $\color{#35bf28}+3.75\\%$ | | 
test_compile_indexing[tensor-tensordict-eager] | 0.2378ms | 65.7724μs | 15.2039 KOps/s | 15.6505 KOps/s | $\color{#d91a1a}-2.85\\%$ | | test_compile_indexing[tensor-tensorclass-compile] | 0.1426ms | 0.1030ms | 9.7110 KOps/s | 9.3607 KOps/s | $\color{#35bf28}+3.74\\%$ | | test_compile_indexing[tensor-tensorclass-eager] | 82.6020μs | 46.9416μs | 21.3031 KOps/s | 21.2689 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_compile_indexing[tensor-pytree-compile] | 0.1392ms | 0.1032ms | 9.6936 KOps/s | 9.3524 KOps/s | $\color{#35bf28}+3.65\\%$ | | test_compile_indexing[tensor-pytree-eager] | 84.9020μs | 48.2806μs | 20.7122 KOps/s | 21.4462 KOps/s | $\color{#d91a1a}-3.42\\%$ | | test_compile_indexing[slice-tensordict-compile] | 0.1785ms | 0.1374ms | 7.2755 KOps/s | 7.2041 KOps/s | $\color{#35bf28}+0.99\\%$ | | test_compile_indexing[slice-tensordict-eager] | 0.1923ms | 27.1292μs | 36.8606 KOps/s | 37.0918 KOps/s | $\color{#d91a1a}-0.62\\%$ | | test_compile_indexing[slice-tensorclass-compile] | 0.1649ms | 0.1296ms | 7.7184 KOps/s | 7.5789 KOps/s | $\color{#35bf28}+1.84\\%$ | | test_compile_indexing[slice-tensorclass-eager] | 52.2500μs | 23.9323μs | 41.7846 KOps/s | 43.3921 KOps/s | $\color{#d91a1a}-3.70\\%$ | | test_compile_indexing[slice-pytree-compile] | 0.2960ms | 0.1353ms | 7.3903 KOps/s | 7.5943 KOps/s | $\color{#d91a1a}-2.69\\%$ | | test_compile_indexing[slice-pytree-eager] | 46.4210μs | 24.1417μs | 41.4221 KOps/s | 44.3260 KOps/s | $\textbf{\color{#d91a1a}-6.55\\%}$ | | test_compile_indexing[int-tensordict-compile] | 0.1776ms | 0.1428ms | 7.0031 KOps/s | 7.1808 KOps/s | $\color{#d91a1a}-2.47\\%$ | | test_compile_indexing[int-tensordict-eager] | 0.4830ms | 27.3580μs | 36.5524 KOps/s | 38.1469 KOps/s | $\color{#d91a1a}-4.18\\%$ | | test_compile_indexing[int-tensorclass-compile] | 0.1701ms | 0.1348ms | 7.4177 KOps/s | 7.5952 KOps/s | $\color{#d91a1a}-2.34\\%$ | | test_compile_indexing[int-tensorclass-eager] | 51.7810μs | 24.2743μs | 41.1958 KOps/s | 44.2246 KOps/s | 
$\textbf{\color{#d91a1a}-6.85\\%}$ | | test_compile_indexing[int-pytree-compile] | 0.1602ms | 0.1293ms | 7.7359 KOps/s | 7.5793 KOps/s | $\color{#35bf28}+2.07\\%$ | | test_compile_indexing[int-pytree-eager] | 51.2510μs | 23.9169μs | 41.8115 KOps/s | 44.4262 KOps/s | $\textbf{\color{#d91a1a}-5.89\\%}$ | | test_mod_add[eager] | 75.0210μs | 37.7154μs | 26.5144 KOps/s | 27.6604 KOps/s | $\color{#d91a1a}-4.14\\%$ | | test_mod_add[compile] | 0.2294ms | 71.4118μs | 14.0033 KOps/s | 14.4219 KOps/s | $\color{#d91a1a}-2.90\\%$ | | test_mod_add[compile-overhead] | 0.2685ms | 0.1456ms | 6.8684 KOps/s | 6.7337 KOps/s | $\color{#35bf28}+2.00\\%$ | | test_mod_wrap[eager] | 0.3403ms | 0.2490ms | 4.0164 KOps/s | 3.7752 KOps/s | $\textbf{\color{#35bf28}+6.39\\%}$ | | test_mod_wrap[compile] | 0.3314ms | 0.2893ms | 3.4565 KOps/s | 3.4504 KOps/s | $\color{#35bf28}+0.18\\%$ | | test_mod_wrap[compile-overhead] | 8.3391ms | 4.3377ms | 230.5389 Ops/s | 228.8952 Ops/s | $\color{#35bf28}+0.72\\%$ | | test_mod_wrap_and_backward[eager] | 1.5868ms | 1.4669ms | 681.7084 Ops/s | 722.2958 Ops/s | $\textbf{\color{#d91a1a}-5.62\\%}$ | | test_mod_wrap_and_backward[compile] | 1.6184ms | 1.4360ms | 696.3841 Ops/s | 692.4943 Ops/s | $\color{#35bf28}+0.56\\%$ | | test_mod_wrap_and_backward[compile-overhead] | 1.4356ms | 0.9905ms | 1.0096 KOps/s | 990.9817 Ops/s | $\color{#35bf28}+1.88\\%$ | | test_seq_add[eager] | 0.1575ms | 0.1097ms | 9.1145 KOps/s | 9.2733 KOps/s | $\color{#d91a1a}-1.71\\%$ | | test_seq_add[compile] | 0.1255ms | 84.2123μs | 11.8747 KOps/s | 11.6630 KOps/s | $\color{#35bf28}+1.82\\%$ | | test_seq_add[compile-overhead] | 0.1580ms | 0.1204ms | 8.3077 KOps/s | 8.1224 KOps/s | $\color{#35bf28}+2.28\\%$ | | test_seq_wrap[eager] | 0.4867ms | 0.4162ms | 2.4025 KOps/s | 2.4097 KOps/s | $\color{#d91a1a}-0.30\\%$ | | test_seq_wrap[compile] | 0.3769ms | 0.3185ms | 3.1396 KOps/s | 3.0105 KOps/s | $\color{#35bf28}+4.29\\%$ | | test_seq_wrap[compile-overhead] | 0.3109s | 0.1490s | 6.7102 Ops/s | 
6.6787 Ops/s | $\color{#35bf28}+0.47\\%$ | | test_func_call_runtime[False-eager] | 0.7976ms | 0.7409ms | 1.3497 KOps/s | 1.3259 KOps/s | $\color{#35bf28}+1.80\\%$ | | test_func_call_runtime[False-compile] | 0.8676ms | 0.7950ms | 1.2579 KOps/s | 1.2517 KOps/s | $\color{#35bf28}+0.49\\%$ | | test_func_call_runtime[False-compile-overhead] | 0.4174ms | 0.3618ms | 2.7636 KOps/s | 2.7330 KOps/s | $\color{#35bf28}+1.12\\%$ | | test_func_call_runtime[True-eager] | 1.0780ms | 0.9932ms | 1.0068 KOps/s | 1.0067 KOps/s | $\color{#35bf28}+0.02\\%$ | | test_func_call_runtime[True-compile] | 0.8924ms | 0.8327ms | 1.2009 KOps/s | 1.1815 KOps/s | $\color{#35bf28}+1.64\\%$ | | test_func_call_runtime[True-compile-overhead] | 0.4828ms | 0.4028ms | 2.4825 KOps/s | 2.4466 KOps/s | $\color{#35bf28}+1.47\\%$ | | test_distributed | 1.8402ms | 73.5981μs | 13.5873 KOps/s | 14.2437 KOps/s | $\color{#d91a1a}-4.61\\%$ | | test_tdmodule | 31.4310μs | 15.6992μs | 63.6976 KOps/s | 68.0991 KOps/s | $\textbf{\color{#d91a1a}-6.46\\%}$ | | test_tdmodule_dispatch | 52.4410μs | 32.4538μs | 30.8130 KOps/s | 33.0974 KOps/s | $\textbf{\color{#d91a1a}-6.90\\%}$ | | test_tdseq | 33.1900μs | 16.3664μs | 61.1007 KOps/s | 63.3233 KOps/s | $\color{#d91a1a}-3.51\\%$ | | test_tdseq_dispatch | 52.6310μs | 34.0878μs | 29.3360 KOps/s | 29.7757 KOps/s | $\color{#d91a1a}-1.48\\%$ | | test_instantiation_functorch | 2.1287ms | 1.9932ms | 501.7050 Ops/s | 496.2559 Ops/s | $\color{#35bf28}+1.10\\%$ | | test_instantiation_td | 1.9714ms | 1.2945ms | 772.4886 Ops/s | 768.3447 Ops/s | $\color{#35bf28}+0.54\\%$ | | test_exec_functorch | 0.2844ms | 0.2231ms | 4.4826 KOps/s | 4.4206 KOps/s | $\color{#35bf28}+1.40\\%$ | | test_exec_functional_call | 0.2580ms | 0.2183ms | 4.5811 KOps/s | 4.2540 KOps/s | $\textbf{\color{#35bf28}+7.69\\%}$ | | test_exec_td | 0.2424ms | 0.2159ms | 4.6316 KOps/s | 4.2714 KOps/s | $\textbf{\color{#35bf28}+8.43\\%}$ | | test_exec_td_decorator | 0.3975ms | 0.2890ms | 3.4606 KOps/s | 3.3527 KOps/s | 
$\color{#35bf28}+3.22\\%$ | | test_vmap_mlp_speed[True-True] | 1.0856ms | 0.7038ms | 1.4209 KOps/s | 1.4300 KOps/s | $\color{#d91a1a}-0.63\\%$ | | test_vmap_mlp_speed[True-False] | 0.7674ms | 0.7019ms | 1.4248 KOps/s | 1.4271 KOps/s | $\color{#d91a1a}-0.16\\%$ | | test_vmap_mlp_speed[False-True] | 0.7635ms | 0.6144ms | 1.6276 KOps/s | 1.6211 KOps/s | $\color{#35bf28}+0.40\\%$ | | test_vmap_mlp_speed[False-False] | 0.6865ms | 0.6165ms | 1.6220 KOps/s | 1.6206 KOps/s | $\color{#35bf28}+0.09\\%$ | | test_vmap_mlp_speed_decorator[True-True] | 1.4138ms | 0.7775ms | 1.2862 KOps/s | 1.2821 KOps/s | $\color{#35bf28}+0.33\\%$ | | test_vmap_mlp_speed_decorator[True-False] | 0.8979ms | 0.7416ms | 1.3484 KOps/s | 1.2798 KOps/s | $\textbf{\color{#35bf28}+5.36\\%}$ | | test_vmap_mlp_speed_decorator[False-True] | 0.9094ms | 0.6755ms | 1.4804 KOps/s | 1.4765 KOps/s | $\color{#35bf28}+0.26\\%$ | | test_vmap_mlp_speed_decorator[False-False] | 0.8516ms | 0.6803ms | 1.4700 KOps/s | 1.4676 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_vmap_transformer_speed[True-True] | 9.2441ms | 8.8339ms | 113.2009 Ops/s | 112.1974 Ops/s | $\color{#35bf28}+0.89\\%$ | | test_vmap_transformer_speed[True-False] | 9.0187ms | 8.7687ms | 114.0416 Ops/s | 112.9746 Ops/s | $\color{#35bf28}+0.94\\%$ | | test_vmap_transformer_speed[False-True] | 8.8489ms | 8.6999ms | 114.9436 Ops/s | 113.6264 Ops/s | $\color{#35bf28}+1.16\\%$ | | test_vmap_transformer_speed[False-False] | 10.1743ms | 8.6740ms | 115.2876 Ops/s | 113.4010 Ops/s | $\color{#35bf28}+1.66\\%$ | | test_vmap_transformer_speed_decorator[True-True] | 21.1852ms | 21.0409ms | 47.5264 Ops/s | 47.3535 Ops/s | $\color{#35bf28}+0.37\\%$ | | test_vmap_transformer_speed_decorator[True-False] | 21.6579ms | 21.0226ms | 47.5678 Ops/s | 46.5782 Ops/s | $\color{#35bf28}+2.12\\%$ | | test_vmap_transformer_speed_decorator[False-True] | 20.9626ms | 20.8386ms | 47.9879 Ops/s | 47.7679 Ops/s | $\color{#35bf28}+0.46\\%$ | | test_vmap_transformer_speed_decorator[False-False] 
| 21.9250ms | 20.9099ms | 47.8243 Ops/s | 47.9058 Ops/s | $\color{#d91a1a}-0.17\\%$ | | test_to_module_speed[True] | 1.5951ms | 1.4815ms | 674.9881 Ops/s | 672.3139 Ops/s | $\color{#35bf28}+0.40\\%$ | | test_to_module_speed[False] | 1.5475ms | 1.4397ms | 694.5812 Ops/s | 690.3331 Ops/s | $\color{#35bf28}+0.62\\%$ | | test_tc_init | 59.3610μs | 39.7699μs | 25.1446 KOps/s | 27.0964 KOps/s | $\textbf{\color{#d91a1a}-7.20\\%}$ | | test_tc_init_nested | 0.1084ms | 80.2205μs | 12.4656 KOps/s | 13.7066 KOps/s | $\textbf{\color{#d91a1a}-9.05\\%}$ | | test_tc_first_layer_tensor | 3.1967μs | 0.7993μs | 1.2511 MOps/s | 1.2772 MOps/s | $\color{#d91a1a}-2.04\\%$ | | test_tc_first_layer_nontensor | 26.0310μs | 2.5476μs | 392.5280 KOps/s | 389.2933 KOps/s | $\color{#35bf28}+0.83\\%$ | | test_tc_second_layer_tensor | 7.9933μs | 1.6412μs | 609.3011 KOps/s | 618.1882 KOps/s | $\color{#d91a1a}-1.44\\%$ | | test_tc_second_layer_nontensor | 20.0600μs | 3.4285μs | 291.6712 KOps/s | 295.7395 KOps/s | $\color{#d91a1a}-1.38\\%$ | | test_unbind | 0.3220s | 13.1132ms | 76.2589 Ops/s | 80.0573 Ops/s | $\color{#d91a1a}-4.74\\%$ | | test_full_like | 0.6590ms | 0.5782ms | 1.7294 KOps/s | 1.7307 KOps/s | $\color{#d91a1a}-0.08\\%$ | | test_zeros_like | 0.2722ms | 0.1977ms | 5.0579 KOps/s | 5.0608 KOps/s | $\color{#d91a1a}-0.06\\%$ | | test_ones_like | 0.2178ms | 0.1975ms | 5.0623 KOps/s | 5.0658 KOps/s | $\color{#d91a1a}-0.07\\%$ | | test_clone | 0.4540ms | 0.4148ms | 2.4106 KOps/s | 2.4128 KOps/s | $\color{#d91a1a}-0.09\\%$ | | test_squeeze | 29.6610μs | 12.0277μs | 83.1414 KOps/s | 91.4513 KOps/s | $\textbf{\color{#d91a1a}-9.09\\%}$ | | test_unsqueeze | 0.2638ms | 86.1507μs | 11.6076 KOps/s | 12.6771 KOps/s | $\textbf{\color{#d91a1a}-8.44\\%}$ | | test_split | 0.4494ms | 0.1787ms | 5.5949 KOps/s | 5.6267 KOps/s | $\color{#d91a1a}-0.56\\%$ | | test_permute | 0.2359ms | 0.1965ms | 5.0893 KOps/s | 5.0824 KOps/s | $\color{#35bf28}+0.14\\%$ | | test_stack | 1.2481ms | 0.8986ms | 1.1128 KOps/s | 
1.1167 KOps/s | $\color{#d91a1a}-0.34\\%$ | | test_cat | 1.2510ms | 1.2315ms | 812.0125 Ops/s | 811.9779 Ops/s | $+0.00\\%$ |
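The Change column in these tables appears to be the relative difference between the `Ops` value and the `Ops on Repo HEAD` value, after converting both to the same unit. That reading is inferred from the rows themselves, not taken from the benchmark bot's source, and the helper names below (`parse_ops`, `relative_change`, `format_change`) are hypothetical. A minimal sketch under that assumption:

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '46.5280 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of `ops` relative to `ops_head` (assumed formula)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def format_change(pct: float) -> str:
    """Render a percentage with an explicit sign and two decimals."""
    return f"{pct:+.2f}%"


# Rows copied from the tables above.
print(format_change(relative_change("46.5280 KOps/s", "44.2512 KOps/s")))   # test_plain_set_nested
print(format_change(relative_change("363.3841 KOps/s", "389.0732 KOps/s")))  # test_items
print(format_change(relative_change("1.0096 KOps/s", "990.9817 Ops/s")))     # mixed units
```

Under this assumption the three calls reproduce the reported `+5.15%`, `-6.60%` and `+1.88%`. The last row shows why the units must be normalised before dividing: the two columns can use different prefixes.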
$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests
Total Benchmarks: 213. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}8$.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change | | ------------------------------------------------- | --------- | --------- | --------------- | ------------------ | ------------------------------------ | | test_plain_set_nested | 41.2170μs | 21.4924μs | 46.5280 KOps/s | 44.2512 KOps/s | $\textbf{\color{#35bf28}+5.15\\%}$ | | test_plain_set_stack_nested | 58.4700μs | 22.0067μs | 45.4406 KOps/s | 43.3144 KOps/s | $\color{#35bf28}+4.91\\%$ | | test_plain_set_nested_inplace | 83.6460μs | 23.5865μs | 42.3972 KOps/s | 40.2580 KOps/s | $\textbf{\color{#35bf28}+5.31\\%}$ | | test_plain_set_stack_nested_inplace | 0.1141ms | 23.7099μs | 42.1765 KOps/s | 40.0652 KOps/s | $\textbf{\color{#35bf28}+5.27\\%}$ | | test_items | 18.5640μs | 2.7519μs | 363.3841 KOps/s | 389.0732 KOps/s | $\textbf{\color{#d91a1a}-6.60\\%}$ | | test_items_nested | 0.5662ms | 0.3297ms | 3.0327 KOps/s | 2.9302 KOps/s | $\color{#35bf28}+3.50\\%$ | | test_items_nested_locked | 2.2606ms | 0.3304ms | 3.0266 KOps/s | 2.9691 KOps/s | $\color{#35bf28}+1.94\\%$ | | test_items_nested_leaf | 0.1340ms | 86.1337μs | 11.6099 KOps/s | 11.6558 KOps/s | $\color{#d91a1a}-0.39\\%$ | | test_items_stack_nested | 0.4164ms | 0.3321ms | 3.0108 KOps/s | 2.9364 KOps/s | $\color{#35bf28}+2.53\\%$ | | test_items_stack_nested_leaf | 0.1908ms | 87.5155μs | 11.4265 KOps/s | 11.5394 KOps/s | $\color{#d91a1a}-0.98\\%$ | | test_items_stack_nested_locked | 0.3890ms | 0.3307ms | 3.0237 KOps/s | 2.9250 KOps/s | $\color{#35bf28}+3.37\\%$ | | test_keys | 30.3870μs | 3.9314μs | 254.3651 KOps/s | 252.5564 KOps/s | $\color{#35bf28}+0.72\\%$ | | test_keys_nested | 0.2373ms | 0.1428ms | 7.0014 KOps/s | 6.8431 KOps/s | $\color{#35bf28}+2.31\\%$ | | test_keys_nested_locked | 0.7869ms | 0.1495ms | 6.6906 KOps/s | 6.6089 KOps/s | $\color{#35bf28}+1.24\\%$ | | test_keys_nested_leaf | 0.1736ms | 0.1239ms | 8.0679 KOps/s | 7.8972 KOps/s | $\color{#35bf28}+2.16\\%$ | | test_keys_stack_nested | 0.3247ms | 0.1451ms | 6.8934 KOps/s | 6.8878 KOps/s 
| $\color{#35bf28}+0.08\\%$ | | test_keys_stack_nested_leaf | 0.1728ms | 0.1232ms | 8.1144 KOps/s | 7.9458 KOps/s | $\color{#35bf28}+2.12\\%$ | | test_keys_stack_nested_locked | 0.2180ms | 0.1484ms | 6.7366 KOps/s | 6.5498 KOps/s | $\color{#35bf28}+2.85\\%$ | | test_values | 11.0230μs | 1.1716μs | 853.5013 KOps/s | 852.6122 KOps/s | $\color{#35bf28}+0.10\\%$ | | test_values_nested | 95.8980μs | 50.2474μs | 19.9015 KOps/s | 19.4143 KOps/s | $\color{#35bf28}+2.51\\%$ | | test_values_nested_locked | 89.5470μs | 50.4701μs | 19.8137 KOps/s | 19.3435 KOps/s | $\color{#35bf28}+2.43\\%$ | | test_values_nested_leaf | 0.1099ms | 45.1170μs | 22.1646 KOps/s | 21.3900 KOps/s | $\color{#35bf28}+3.62\\%$ | | test_values_stack_nested | 0.1038ms | 50.3877μs | 19.8461 KOps/s | 18.9862 KOps/s | $\color{#35bf28}+4.53\\%$ | | test_values_stack_nested_leaf | 93.3140μs | 45.2078μs | 22.1201 KOps/s | 21.1815 KOps/s | $\color{#35bf28}+4.43\\%$ | | test_values_stack_nested_locked | 0.1053ms | 50.6045μs | 19.7611 KOps/s | 18.7754 KOps/s | $\textbf{\color{#35bf28}+5.25\\%}$ | | test_membership | 4.5600μs | 0.7590μs | 1.3176 MOps/s | 1.0855 MOps/s | $\textbf{\color{#35bf28}+21.39\\%}$ | | test_membership_nested | 28.6030μs | 2.6358μs | 379.3888 KOps/s | 385.3165 KOps/s | $\color{#d91a1a}-1.54\\%$ | | test_membership_nested_leaf | 31.5190μs | 2.6432μs | 378.3307 KOps/s | 380.4242 KOps/s | $\color{#d91a1a}-0.55\\%$ | | test_membership_stacked_nested | 21.6010μs | 2.5884μs | 386.3370 KOps/s | 384.8389 KOps/s | $\color{#35bf28}+0.39\\%$ | | test_membership_stacked_nested_leaf | 25.9080μs | 2.6288μs | 380.3976 KOps/s | 383.2666 KOps/s | $\color{#d91a1a}-0.75\\%$ | | test_membership_nested_last | 36.7380μs | 3.9496μs | 253.1918 KOps/s | 252.1220 KOps/s | $\color{#35bf28}+0.42\\%$ | | test_membership_nested_leaf_last | 44.1530μs | 3.8999μs | 256.4190 KOps/s | 253.0128 KOps/s | $\color{#35bf28}+1.35\\%$ | | test_membership_stacked_nested_last | 31.8190μs | 3.9189μs | 255.1734 KOps/s | 256.7292 KOps/s 
| $\color{#d91a1a}-0.61\\%$ | | test_membership_stacked_nested_leaf_last | 49.2920μs | 3.9262μs | 254.6965 KOps/s | 254.3001 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_nested_getleaf | 61.1960μs | 10.6042μs | 94.3026 KOps/s | 96.2290 KOps/s | $\color{#d91a1a}-2.00\\%$ | | test_nested_get | 46.3340μs | 9.8185μs | 101.8486 KOps/s | 99.3009 KOps/s | $\color{#35bf28}+2.57\\%$ | | test_stacked_getleaf | 56.7460μs | 10.4838μs | 95.3850 KOps/s | 94.8079 KOps/s | $\color{#35bf28}+0.61\\%$ | | test_stacked_get | 55.2330μs | 10.0773μs | 99.2331 KOps/s | 99.9019 KOps/s | $\color{#d91a1a}-0.67\\%$ | | test_nested_getitemleaf | 40.5860μs | 11.1112μs | 89.9997 KOps/s | 89.0682 KOps/s | $\color{#35bf28}+1.05\\%$ | | test_nested_getitem | 51.5160μs | 10.2183μs | 97.8636 KOps/s | 97.1264 KOps/s | $\color{#35bf28}+0.76\\%$ | | test_stacked_getitemleaf | 41.0770μs | 11.1816μs | 89.4324 KOps/s | 90.4551 KOps/s | $\color{#d91a1a}-1.13\\%$ | | test_stacked_getitem | 41.3170μs | 10.3047μs | 97.0427 KOps/s | 97.1546 KOps/s | $\color{#d91a1a}-0.12\\%$ | | test_lock_nested | 6.9480ms | 0.5039ms | 1.9844 KOps/s | 1.9910 KOps/s | $\color{#d91a1a}-0.33\\%$ | | test_lock_stack_nested | 0.8795ms | 0.4705ms | 2.1254 KOps/s | 2.1393 KOps/s | $\color{#d91a1a}-0.65\\%$ | | test_unlock_nested | 85.3132ms | 0.5012ms | 1.9954 KOps/s | 2.4122 KOps/s | $\textbf{\color{#d91a1a}-17.28\\%}$ | | test_unlock_stack_nested | 0.8081ms | 0.3859ms | 2.5916 KOps/s | 2.6417 KOps/s | $\color{#d91a1a}-1.90\\%$ | | test_flatten_speed | 0.5983ms | 0.1075ms | 9.3004 KOps/s | 9.5763 KOps/s | $\color{#d91a1a}-2.88\\%$ | | test_unflatten_speed | 0.7538ms | 0.4363ms | 2.2922 KOps/s | 2.3106 KOps/s | $\color{#d91a1a}-0.80\\%$ | | test_common_ops | 4.6816ms | 1.0870ms | 919.9313 Ops/s | 899.4513 Ops/s | $\color{#35bf28}+2.28\\%$ | | test_creation | 18.4950μs | 2.0847μs | 479.6856 KOps/s | 452.8328 KOps/s | $\textbf{\color{#35bf28}+5.93\\%}$ | | test_creation_empty | 90.9220μs | 18.0888μs | 55.2828 KOps/s | 51.5838 KOps/s | 
$\textbf{\color{#35bf28}+7.17\\%}$ | | test_creation_nested_1 | 62.9080μs | 21.2880μs | 46.9747 KOps/s | 43.9787 KOps/s | $\textbf{\color{#35bf28}+6.81\\%}$ | | test_creation_nested_2 | 57.0560μs | 24.6609μs | 40.5501 KOps/s | 38.3141 KOps/s | $\textbf{\color{#35bf28}+5.84\\%}$ | | test_clone | 0.1168ms | 16.7515μs | 59.6963 KOps/s | 61.8243 KOps/s | $\color{#d91a1a}-3.44\\%$ | | test_getitem[int] | 1.2669ms | 16.5975μs | 60.2502 KOps/s | 61.2573 KOps/s | $\color{#d91a1a}-1.64\\%$ | | test_getitem[slice_int] | 0.1426ms | 32.0203μs | 31.2301 KOps/s | 31.4948 KOps/s | $\color{#d91a1a}-0.84\\%$ | | test_getitem[range] | 0.2159ms | 57.3566μs | 17.4348 KOps/s | 17.8454 KOps/s | $\color{#d91a1a}-2.30\\%$ | | test_getitem[tuple] | 0.1443ms | 25.6297μs | 39.0172 KOps/s | 40.1289 KOps/s | $\color{#d91a1a}-2.77\\%$ | | test_getitem[list] | 0.2570ms | 51.7408μs | 19.3271 KOps/s | 19.4574 KOps/s | $\color{#d91a1a}-0.67\\%$ | | test_setitem_dim[int] | 84.5870μs | 40.5507μs | 24.6605 KOps/s | 24.6699 KOps/s | $\color{#d91a1a}-0.04\\%$ | | test_setitem_dim[slice_int] | 0.1162ms | 72.3974μs | 13.8127 KOps/s | 13.9764 KOps/s | $\color{#d91a1a}-1.17\\%$ | | test_setitem_dim[range] | 0.2020ms | 93.3373μs | 10.7138 KOps/s | 10.8022 KOps/s | $\color{#d91a1a}-0.82\\%$ | | test_setitem_dim[tuple] | 0.1063ms | 59.1859μs | 16.8959 KOps/s | 17.2583 KOps/s | $\color{#d91a1a}-2.10\\%$ | | test_setitem | 0.1476ms | 29.1975μs | 34.2496 KOps/s | 34.2815 KOps/s | $\color{#d91a1a}-0.09\\%$ | | test_set | 0.1238ms | 28.3114μs | 35.3215 KOps/s | 34.6077 KOps/s | $\color{#35bf28}+2.06\\%$ | | test_set_shared | 4.4089ms | 0.2187ms | 4.5731 KOps/s | 4.6562 KOps/s | $\color{#d91a1a}-1.79\\%$ | | test_update | 0.1925ms | 35.2678μs | 28.3544 KOps/s | 27.0858 KOps/s | $\color{#35bf28}+4.68\\%$ | | test_update_nested | 0.1502ms | 45.1782μs | 22.1346 KOps/s | 21.5300 KOps/s | $\color{#35bf28}+2.81\\%$ | | test_update__nested | 0.1425ms | 33.3727μs | 29.9646 KOps/s | 30.3759 KOps/s | $\color{#d91a1a}-1.35\\%$ 
| | test_set_nested | 0.1054ms | 30.7397μs | 32.5313 KOps/s | 32.0279 KOps/s | $\color{#35bf28}+1.57\\%$ | | test_set_nested_new | 0.1563ms | 35.8121μs | 27.9236 KOps/s | 27.6035 KOps/s | $\color{#35bf28}+1.16\\%$ | | test_select | 1.0924ms | 52.4995μs | 19.0478 KOps/s | 19.2770 KOps/s | $\color{#d91a1a}-1.19\\%$ | | test_select_nested | 0.1400ms | 59.4332μs | 16.8256 KOps/s | 17.0086 KOps/s | $\color{#d91a1a}-1.08\\%$ | | test_exclude_nested | 0.1313ms | 76.9193μs | 13.0006 KOps/s | 12.9380 KOps/s | $\color{#35bf28}+0.48\\%$ | | test_empty[True] | 0.9096ms | 0.3242ms | 3.0849 KOps/s | 3.0973 KOps/s | $\color{#d91a1a}-0.40\\%$ | | test_empty[False] | 32.4500μs | 1.3325μs | 750.4414 KOps/s | 847.8068 KOps/s | $\textbf{\color{#d91a1a}-11.48\\%}$ | | test_unbind_speed | 0.3702ms | 0.3065ms | 3.2629 KOps/s | 3.2744 KOps/s | $\color{#d91a1a}-0.35\\%$ | | test_unbind_speed_stack0 | 0.4355ms | 0.3031ms | 3.2997 KOps/s | 3.3594 KOps/s | $\color{#d91a1a}-1.78\\%$ | | test_unbind_speed_stack1 | 89.3736ms | 0.7958ms | 1.2565 KOps/s | 1.3851 KOps/s | $\textbf{\color{#d91a1a}-9.28\\%}$ | | test_split | 88.2951ms | 2.1306ms | 469.3446 Ops/s | 472.8040 Ops/s | $\color{#d91a1a}-0.73\\%$ | | test_chunk | 86.7099ms | 2.1300ms | 469.4894 Ops/s | 471.1353 Ops/s | $\color{#d91a1a}-0.35\\%$ | | test_creation[device0] | 0.2487ms | 0.1191ms | 8.3969 KOps/s | 8.4993 KOps/s | $\color{#d91a1a}-1.20\\%$ | | test_creation_from_tensor | 4.9432ms | 0.1214ms | 8.2350 KOps/s | 8.3531 KOps/s | $\color{#d91a1a}-1.41\\%$ | | test_add_one[memmap_tensor0] | 0.2364ms | 8.3708μs | 119.4627 KOps/s | 126.9046 KOps/s | $\textbf{\color{#d91a1a}-5.86\\%}$ | | test_contiguous[memmap_tensor0] | 30.4270μs | 2.0465μs | 488.6482 KOps/s | 491.5271 KOps/s | $\color{#d91a1a}-0.59\\%$ | | test_stack[memmap_tensor0] | 42.3390μs | 5.9479μs | 168.1264 KOps/s | 174.8376 KOps/s | $\color{#d91a1a}-3.84\\%$ | | test_memmaptd_index | 1.0430ms | 0.4117ms | 2.4289 KOps/s | 2.4070 KOps/s | $\color{#35bf28}+0.91\\%$ | | 
test_memmaptd_index_astensor | 0.9076ms | 0.4935ms | 2.0264 KOps/s | 2.0259 KOps/s | $\color{#35bf28}+0.03\\%$ |
| test_memmaptd_index_op | 1.8630ms | 1.0549ms | 947.9876 Ops/s | 943.2386 Ops/s | $\color{#35bf28}+0.50\\%$ |
| test_serialize_model | 0.1360s | 0.1294s | 7.7306 Ops/s | 6.8864 Ops/s | $\textbf{\color{#35bf28}+12.26\\%}$ |
| test_serialize_model_pickle | 0.4412s | 0.3920s | 2.5513 Ops/s | 2.5117 Ops/s | $\color{#35bf28}+1.58\\%$ |
| test_serialize_weights | 0.1376s | 0.1244s | 8.0360 Ops/s | 7.8362 Ops/s | $\color{#35bf28}+2.55\\%$ |
| test_serialize_weights_returnearly | 0.1833s | 0.1670s | 5.9895 Ops/s | 5.8761 Ops/s | $\color{#35bf28}+1.93\\%$ |
| test_serialize_weights_pickle | 1.2974s | 0.7448s | 1.3427 Ops/s | 2.3441 Ops/s | $\textbf{\color{#d91a1a}-42.72\\%}$ |
| test_serialize_weights_filesystem | 0.2378s | 0.1583s | 6.3186 Ops/s | 6.2731 Ops/s | $\color{#35bf28}+0.73\\%$ |
| test_serialize_model_filesystem | 0.1589s | 0.1475s | 6.7787 Ops/s | 6.4752 Ops/s | $\color{#35bf28}+4.69\\%$ |
| test_reshape_pytree | 98.6940μs | 40.1170μs | 24.9271 KOps/s | 23.9297 KOps/s | $\color{#35bf28}+4.17\\%$ |
| test_reshape_td | 95.6790μs | 46.4048μs | 21.5495 KOps/s | 21.4015 KOps/s | $\color{#35bf28}+0.69\\%$ |
| test_view_pytree | 0.1256ms | 40.8122μs | 24.5025 KOps/s | 24.5922 KOps/s | $\color{#d91a1a}-0.36\\%$ |
| test_view_td | 0.1117ms | 53.5771μs | 18.6647 KOps/s | 19.0338 KOps/s | $\color{#d91a1a}-1.94\\%$ |
| test_unbind_pytree | 0.1184ms | 37.3983μs | 26.7392 KOps/s | 26.8649 KOps/s | $\color{#d91a1a}-0.47\\%$ |
| test_unbind_td | 0.3712ms | 45.9561μs | 21.7599 KOps/s | 21.7301 KOps/s | $\color{#35bf28}+0.14\\%$ |
| test_split_pytree | 85.9700μs | 40.9186μs | 24.4388 KOps/s | 25.2693 KOps/s | $\color{#d91a1a}-3.29\\%$ |
| test_split_td | 0.4702ms | 58.4422μs | 17.1109 KOps/s | 16.7449 KOps/s | $\color{#35bf28}+2.19\\%$ |
| test_add_pytree | 97.2620μs | 49.4262μs | 20.2322 KOps/s | 20.9234 KOps/s | $\color{#d91a1a}-3.30\\%$ |
| test_add_td | 0.1641ms | 84.4518μs | 11.8411 KOps/s | 11.4845 KOps/s | $\color{#35bf28}+3.10\\%$ |
| test_compile_add_one_nested[tensordict-compile] | 0.1466ms | 54.3259μs | 18.4074 KOps/s | 19.2223 KOps/s | $\color{#d91a1a}-4.24\\%$ |
| test_compile_add_one_nested[tensordict-eager] | 0.4639ms | 0.1900ms | 5.2634 KOps/s | 5.3822 KOps/s | $\color{#d91a1a}-2.21\\%$ |
| test_compile_add_one_nested[pytree-compile] | 0.1440ms | 55.3241μs | 18.0753 KOps/s | 18.8869 KOps/s | $\color{#d91a1a}-4.30\\%$ |
| test_compile_add_one_nested[pytree-eager] | 0.3369ms | 0.1512ms | 6.6134 KOps/s | 6.3910 KOps/s | $\color{#35bf28}+3.48\\%$ |
| test_compile_copy_nested[tensordict-compile] | 63.4880μs | 20.2892μs | 49.2872 KOps/s | 50.2824 KOps/s | $\color{#d91a1a}-1.98\\%$ |
| test_compile_copy_nested[tensordict-eager] | 0.1428ms | 65.9558μs | 15.1617 KOps/s | 15.5250 KOps/s | $\color{#d91a1a}-2.34\\%$ |
| test_compile_copy_nested[pytree-compile] | 0.1569ms | 81.2239μs | 12.3117 KOps/s | 12.5360 KOps/s | $\color{#d91a1a}-1.79\\%$ |
| test_compile_copy_nested[pytree-eager] | 0.1397ms | 73.5070μs | 13.6041 KOps/s | 13.6186 KOps/s | $\color{#d91a1a}-0.11\\%$ |
| test_compile_add_one_flat[tensordict-compile] | 0.3635ms | 0.1752ms | 5.7090 KOps/s | 5.6787 KOps/s | $\color{#35bf28}+0.53\\%$ |
| test_compile_add_one_flat[tensordict-eager] | 0.4288ms | 0.1928ms | 5.1864 KOps/s | 5.2263 KOps/s | $\color{#d91a1a}-0.76\\%$ |
| test_compile_add_one_flat[tensorclass-compile] | 0.1053ms | 38.7734μs | 25.7909 KOps/s | 26.0493 KOps/s | $\color{#d91a1a}-0.99\\%$ |
| test_compile_add_one_flat[tensorclass-eager] | 1.3090ms | 69.9935μs | 14.2870 KOps/s | 14.9603 KOps/s | $\color{#d91a1a}-4.50\\%$ |
| test_compile_add_one_flat[pytree-compile] | 0.2458ms | 0.1766ms | 5.6619 KOps/s | 5.6646 KOps/s | $\color{#d91a1a}-0.05\\%$ |
| test_compile_add_one_flat[pytree-eager] | 0.5731ms | 0.3043ms | 3.2865 KOps/s | 3.3877 KOps/s | $\color{#d91a1a}-2.99\\%$ |
| test_compile_add_self_flat[tensordict-eager] | 0.3955ms | 0.2142ms | 4.6684 KOps/s | 4.7922 KOps/s | $\color{#d91a1a}-2.58\\%$ |
| test_compile_add_self_flat[tensordict-compile] | 0.3639ms | 0.1781ms | 5.6158 KOps/s | 5.6471 KOps/s | $\color{#d91a1a}-0.55\\%$ |
| test_compile_add_self_flat[tensorclass-eager] | 0.7843ms | 63.0057μs | 15.8716 KOps/s | 16.0136 KOps/s | $\color{#d91a1a}-0.89\\%$ |
| test_compile_add_self_flat[tensorclass-compile] | 0.1091ms | 40.5903μs | 24.6365 KOps/s | 26.1067 KOps/s | $\textbf{\color{#d91a1a}-5.63\\%}$ |
| test_compile_add_self_flat[pytree-eager] | 0.4438ms | 0.2485ms | 4.0247 KOps/s | 4.0741 KOps/s | $\color{#d91a1a}-1.21\\%$ |
| test_compile_add_self_flat[pytree-compile] | 0.2852ms | 0.1749ms | 5.7164 KOps/s | 5.6866 KOps/s | $\color{#35bf28}+0.52\\%$ |
| test_compile_copy_flat[tensordict-compile] | 0.2379ms | 0.1090ms | 9.1772 KOps/s | 9.1977 KOps/s | $\color{#d91a1a}-0.22\\%$ |
| test_compile_copy_flat[tensordict-eager] | 0.1290ms | 56.5284μs | 17.6902 KOps/s | 18.1030 KOps/s | $\color{#d91a1a}-2.28\\%$ |
| test_compile_copy_flat[pytree-compile] | 0.1493ms | 79.4068μs | 12.5934 KOps/s | 12.4370 KOps/s | $\color{#35bf28}+1.26\\%$ |
| test_compile_copy_flat[pytree-eager] | 0.1348ms | 70.3638μs | 14.2118 KOps/s | 13.7334 KOps/s | $\color{#35bf28}+3.48\\%$ |
| test_compile_assign_and_add[tensordict-compile] | 0.2734ms | 0.1917ms | 5.2177 KOps/s | 5.1827 KOps/s | $\color{#35bf28}+0.67\\%$ |
| test_compile_assign_and_add[tensordict-eager] | 1.9593ms | 1.6445ms | 608.0873 Ops/s | 603.1443 Ops/s | $\color{#35bf28}+0.82\\%$ |
| test_compile_assign_and_add[pytree-compile] | 0.2922ms | 0.1880ms | 5.3205 KOps/s | 5.2607 KOps/s | $\color{#35bf28}+1.14\\%$ |
| test_compile_assign_and_add[pytree-eager] | 1.4012ms | 1.1061ms | 904.0770 Ops/s | 901.1693 Ops/s | $\color{#35bf28}+0.32\\%$ |
| test_compile_assign_and_add_stack[compile] | 0.5324ms | 0.4153ms | 2.4079 KOps/s | 2.3832 KOps/s | $\color{#35bf28}+1.04\\%$ |
| test_compile_assign_and_add_stack[eager] | 5.1557ms | 3.8388ms | 260.4960 Ops/s | 255.2031 Ops/s | $\color{#35bf28}+2.07\\%$ |
| test_compile_indexing[tensor-tensordict-compile] | 78.8570μs | 32.0882μs | 31.1641 KOps/s | 32.5648 KOps/s | $\color{#d91a1a}-4.30\\%$ |
| test_compile_indexing[tensor-tensordict-eager] | 0.6057ms | 49.7573μs | 20.0975 KOps/s | 20.6018 KOps/s | $\color{#d91a1a}-2.45\\%$ |
| test_compile_indexing[tensor-tensorclass-compile] | 73.1970μs | 28.3985μs | 35.2131 KOps/s | 35.9514 KOps/s | $\color{#d91a1a}-2.05\\%$ |
| test_compile_indexing[tensor-tensorclass-eager] | 0.1034ms | 30.8843μs | 32.3789 KOps/s | 31.7968 KOps/s | $\color{#35bf28}+1.83\\%$ |
| test_compile_indexing[tensor-pytree-compile] | 73.2360μs | 27.8043μs | 35.9657 KOps/s | 36.4497 KOps/s | $\color{#d91a1a}-1.33\\%$ |
| test_compile_indexing[tensor-pytree-eager] | 0.1285ms | 30.4527μs | 32.8379 KOps/s | 32.7840 KOps/s | $\color{#35bf28}+0.16\\%$ |
| test_compile_indexing[slice-tensordict-compile] | 0.1479ms | 72.0738μs | 13.8747 KOps/s | 14.1599 KOps/s | $\color{#d91a1a}-2.01\\%$ |
| test_compile_indexing[slice-tensordict-eager] | 0.5490ms | 28.5341μs | 35.0458 KOps/s | 35.3472 KOps/s | $\color{#d91a1a}-0.85\\%$ |
| test_compile_indexing[slice-tensorclass-compile] | 0.1425ms | 67.7560μs | 14.7588 KOps/s | 15.0626 KOps/s | $\color{#d91a1a}-2.02\\%$ |
| test_compile_indexing[slice-tensorclass-eager] | 68.1970μs | 24.7244μs | 40.4458 KOps/s | 41.6282 KOps/s | $\color{#d91a1a}-2.84\\%$ |
| test_compile_indexing[slice-pytree-compile] | 0.1762ms | 67.8437μs | 14.7398 KOps/s | 14.9113 KOps/s | $\color{#d91a1a}-1.15\\%$ |
| test_compile_indexing[slice-pytree-eager] | 3.6853ms | 24.7841μs | 40.3485 KOps/s | 41.0984 KOps/s | $\color{#d91a1a}-1.82\\%$ |
| test_compile_indexing[int-tensordict-compile] | 0.1536ms | 71.9094μs | 13.9064 KOps/s | 14.2128 KOps/s | $\color{#d91a1a}-2.16\\%$ |
| test_compile_indexing[int-tensordict-eager] | 0.6388ms | 28.2230μs | 35.4321 KOps/s | 35.1046 KOps/s | $\color{#35bf28}+0.93\\%$ |
| test_compile_indexing[int-tensorclass-compile] | 0.1469ms | 67.4197μs | 14.8325 KOps/s | 15.1063 KOps/s | $\color{#d91a1a}-1.81\\%$ |
| test_compile_indexing[int-tensorclass-eager] | 93.7220μs | 24.2668μs | 41.2085 KOps/s | 41.8883 KOps/s | $\color{#d91a1a}-1.62\\%$ |
| test_compile_indexing[int-pytree-compile] | 0.1585ms | 67.8037μs | 14.7485 KOps/s | 15.0501 KOps/s | $\color{#d91a1a}-2.00\\%$ |
| test_compile_indexing[int-pytree-eager] | 65.6630μs | 24.5073μs | 40.8041 KOps/s | 40.4183 KOps/s | $\color{#35bf28}+0.95\\%$ |
| test_mod_add[eager] | 75.1200μs | 25.2904μs | 39.5407 KOps/s | 40.5030 KOps/s | $\color{#d91a1a}-2.38\\%$ |
| test_mod_add[compile] | 0.1006ms | 36.5376μs | 27.3691 KOps/s | 28.5024 KOps/s | $\color{#d91a1a}-3.98\\%$ |
| test_mod_add[compile-overhead] | 91.2200μs | 36.7796μs | 27.1890 KOps/s | 27.7520 KOps/s | $\color{#d91a1a}-2.03\\%$ |
| test_mod_wrap[eager] | 0.3141ms | 0.2012ms | 4.9693 KOps/s | 4.8462 KOps/s | $\color{#35bf28}+2.54\\%$ |
| test_mod_wrap[compile] | 1.5340ms | 0.2250ms | 4.4443 KOps/s | 4.3268 KOps/s | $\color{#35bf28}+2.72\\%$ |
| test_mod_wrap[compile-overhead] | 0.4467ms | 0.2235ms | 4.4750 KOps/s | 4.3884 KOps/s | $\color{#35bf28}+1.97\\%$ |
| test_mod_wrap_and_backward[eager] | 12.1663ms | 10.8690ms | 92.0048 Ops/s | 87.5151 Ops/s | $\textbf{\color{#35bf28}+5.13\\%}$ |
| test_mod_wrap_and_backward[compile] | 12.2193ms | 10.9854ms | 91.0300 Ops/s | 85.8012 Ops/s | $\textbf{\color{#35bf28}+6.09\\%}$ |
| test_mod_wrap_and_backward[compile-overhead] | 12.1795ms | 10.9084ms | 91.6721 Ops/s | 85.4932 Ops/s | $\textbf{\color{#35bf28}+7.23\\%}$ |
| test_seq_add[eager] | 0.1730ms | 84.9566μs | 11.7707 KOps/s | 11.3478 KOps/s | $\color{#35bf28}+3.73\\%$ |
| test_seq_add[compile] | 0.1570ms | 59.7624μs | 16.7329 KOps/s | 16.8101 KOps/s | $\color{#d91a1a}-0.46\\%$ |
| test_seq_add[compile-overhead] | 0.1562ms | 59.6049μs | 16.7772 KOps/s | 17.2337 KOps/s | $\color{#d91a1a}-2.65\\%$ |
| test_seq_wrap[eager] | 0.5479ms | 0.3663ms | 2.7300 KOps/s | 2.7071 KOps/s | $\color{#35bf28}+0.85\\%$ |
| test_seq_wrap[compile] | 0.4047ms | 0.2588ms | 3.8646 KOps/s | 3.8242 KOps/s | $\color{#35bf28}+1.06\\%$ |
| test_seq_wrap[compile-overhead] | 0.5016ms | 0.2621ms | 3.8152 KOps/s | 3.8141 KOps/s | $\color{#35bf28}+0.03\\%$ |
| test_func_call_runtime[False-eager] | 0.7028ms | 0.5096ms | 1.9622 KOps/s | 1.9666 KOps/s | $\color{#d91a1a}-0.22\\%$ |
| test_func_call_runtime[False-compile] | 0.8582ms | 0.4929ms | 2.0288 KOps/s | 1.9723 KOps/s | $\color{#35bf28}+2.87\\%$ |
| test_func_call_runtime[False-compile-overhead] | 0.8532ms | 0.4913ms | 2.0355 KOps/s | 2.0040 KOps/s | $\color{#35bf28}+1.57\\%$ |
| test_func_call_runtime[True-eager] | 1.2966ms | 0.8166ms | 1.2246 KOps/s | 1.2240 KOps/s | $\color{#35bf28}+0.05\\%$ |
| test_func_call_runtime[True-compile] | 0.9914ms | 0.5087ms | 1.9660 KOps/s | 1.9310 KOps/s | $\color{#35bf28}+1.81\\%$ |
| test_func_call_runtime[True-compile-overhead] | 0.9357ms | 0.5071ms | 1.9721 KOps/s | 1.9055 KOps/s | $\color{#35bf28}+3.49\\%$ |
| test_distributed | 0.2797ms | 0.1311ms | 7.6277 KOps/s | 7.4951 KOps/s | $\color{#35bf28}+1.77\\%$ |
| test_tdmodule | 82.9650μs | 16.5096μs | 60.5709 KOps/s | 59.0880 KOps/s | $\color{#35bf28}+2.51\\%$ |
| test_tdmodule_dispatch | 71.9440μs | 34.8383μs | 28.7041 KOps/s | 27.3812 KOps/s | $\color{#35bf28}+4.83\\%$ |
| test_tdseq | 33.1010μs | 18.5916μs | 53.7877 KOps/s | 51.9667 KOps/s | $\color{#35bf28}+3.50\\%$ |
| test_tdseq_dispatch | 70.2600μs | 38.3816μs | 26.0542 KOps/s | 24.2186 KOps/s | $\textbf{\color{#35bf28}+7.58\\%}$ |
| test_instantiation_functorch | 1.8216ms | 1.6342ms | 611.9388 Ops/s | 603.4810 Ops/s | $\color{#35bf28}+1.40\\%$ |
| test_instantiation_td | 1.8577ms | 1.1801ms | 847.3721 Ops/s | 844.1625 Ops/s | $\color{#35bf28}+0.38\\%$ |
| test_exec_functorch | 0.3226ms | 0.1778ms | 5.6247 KOps/s | 5.7156 KOps/s | $\color{#d91a1a}-1.59\\%$ |
| test_exec_functional_call | 0.3279ms | 0.1670ms | 5.9884 KOps/s | 6.0262 KOps/s | $\color{#d91a1a}-0.63\\%$ |
| test_exec_td | 0.4174ms | 0.1674ms | 5.9733 KOps/s | 5.9913 KOps/s | $\color{#d91a1a}-0.30\\%$ |
| test_exec_td_decorator | 1.1150ms | 0.2547ms | 3.9268 KOps/s | 4.0282 KOps/s | $\color{#d91a1a}-2.52\\%$ |
| test_vmap_mlp_speed[True-True] | 0.9030ms | 0.5900ms | 1.6949 KOps/s | 1.6848 KOps/s | $\color{#35bf28}+0.60\\%$ |
| test_vmap_mlp_speed[True-False] | 0.8654ms | 0.5868ms | 1.7041 KOps/s | 1.7073 KOps/s | $\color{#d91a1a}-0.19\\%$ |
| test_vmap_mlp_speed[False-True] | 0.7752ms | 0.4836ms | 2.0680 KOps/s | 2.0768 KOps/s | $\color{#d91a1a}-0.42\\%$ |
| test_vmap_mlp_speed[False-False] | 1.0224ms | 0.4847ms | 2.0633 KOps/s | 2.0642 KOps/s | $\color{#d91a1a}-0.04\\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.4381ms | 0.6858ms | 1.4582 KOps/s | 1.4604 KOps/s | $\color{#d91a1a}-0.15\\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.9856ms | 0.6809ms | 1.4687 KOps/s | 1.4718 KOps/s | $\color{#d91a1a}-0.21\\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.8975ms | 0.5695ms | 1.7560 KOps/s | 1.7665 KOps/s | $\color{#d91a1a}-0.60\\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.8080ms | 0.5619ms | 1.7797 KOps/s | 1.7774 KOps/s | $\color{#35bf28}+0.13\\%$ |
| test_to_module_speed[True] | 2.5569ms | 1.7886ms | 559.0824 Ops/s | 541.0632 Ops/s | $\color{#35bf28}+3.33\\%$ |
| test_to_module_speed[False] | 2.3988ms | 1.7473ms | 572.3114 Ops/s | 555.4801 Ops/s | $\color{#35bf28}+3.03\\%$ |
| test_tc_init | 0.1092ms | 43.6874μs | 22.8899 KOps/s | 21.1356 KOps/s | $\textbf{\color{#35bf28}+8.30\\%}$ |
| test_tc_init_nested | 0.1538ms | 88.3932μs | 11.3131 KOps/s | 10.7235 KOps/s | $\textbf{\color{#35bf28}+5.50\\%}$ |
| test_tc_first_layer_tensor | 25.6880μs | 1.4629μs | 683.5740 KOps/s | 681.1701 KOps/s | $\color{#35bf28}+0.35\\%$ |
| test_tc_first_layer_nontensor | 34.4540μs | 4.2889μs | 233.1598 KOps/s | 232.3402 KOps/s | $\color{#35bf28}+0.35\\%$ |
| test_tc_second_layer_tensor | 52.6180μs | 2.6805μs | 373.0644 KOps/s | 359.9128 KOps/s | $\color{#35bf28}+3.65\\%$ |
| test_tc_second_layer_nontensor | 31.6790μs | 5.4989μs | 181.8545 KOps/s | 178.9318 KOps/s | $\color{#35bf28}+1.63\\%$ |
| test_unbind | 0.4488s | 13.9811ms | 71.5252 Ops/s | 76.4438 Ops/s | $\textbf{\color{#d91a1a}-6.43\\%}$ |
| test_full_like | 9.1341ms | 7.6510ms | 130.7026 Ops/s | 135.4509 Ops/s | $\color{#d91a1a}-3.51\\%$ |
| test_zeros_like | 3.6028ms | 2.9880ms | 334.6757 Ops/s | 158.8780 Ops/s | $\textbf{\color{#35bf28}+110.65\\%}$ |
| test_ones_like | 3.8305ms | 3.4338ms | 291.2208 Ops/s | 134.1630 Ops/s | $\textbf{\color{#35bf28}+117.06\\%}$ |
| test_clone | 6.1601ms | 5.3875ms | 185.6146 Ops/s | 108.0878 Ops/s | $\textbf{\color{#35bf28}+71.73\\%}$ |
| test_squeeze | 66.5440μs | 13.2488μs | 75.4785 KOps/s | 74.2152 KOps/s | $\color{#35bf28}+1.70\\%$ |
| test_unsqueeze | 0.1947ms | 94.3008μs | 10.6044 KOps/s | 10.8408 KOps/s | $\color{#d91a1a}-2.18\\%$ |
| test_split | 0.5164ms | 0.2028ms | 4.9307 KOps/s | 5.0158 KOps/s | $\color{#d91a1a}-1.70\\%$ |
| test_permute | 0.3574ms | 0.2199ms | 4.5479 KOps/s | 4.5297 KOps/s | $\color{#35bf28}+0.40\\%$ |
| test_stack | 28.6270ms | 25.5388ms | 39.1562 Ops/s | 38.8821 Ops/s | $\color{#35bf28}+0.70\\%$ |
| test_cat | 28.9697ms | 25.2547ms | 39.5966 Ops/s | 39.8943 Ops/s | $\color{#d91a1a}-0.75\\%$ |