pytorch/tensordict
TensorDict is a pytorch dedicated tensor container.
[BugFix] fix tensorclass set #854
Closed (vmoens closed 4 months ago)
github-actions[bot] commented 4 months ago
⚠ Warning: Result of CPU Benchmark Tests
Total Benchmarks: 144. Improved: 13. Worsened: 8.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 38.9720μs | 17.0447μs | 58.6692 KOps/s | 58.1698 KOps/s | +0.86% |
| test_plain_set_stack_nested | 41.1170μs | 17.4423μs | 57.3320 KOps/s | 57.3406 KOps/s | -0.01% |
| test_plain_set_nested_inplace | 54.5520μs | 19.4819μs | 51.3296 KOps/s | 47.4723 KOps/s | **+8.13%** |
| test_plain_set_stack_nested_inplace | 51.2560μs | 19.6777μs | 50.8191 KOps/s | 50.9982 KOps/s | -0.35% |
| test_items | 25.4180μs | 2.5610μs | 390.4676 KOps/s | 383.4313 KOps/s | +1.84% |
| test_items_nested | 0.7706ms | 0.2909ms | 3.4380 KOps/s | 3.6096 KOps/s | -4.75% |
| test_items_nested_locked | 0.7973ms | 0.2898ms | 3.4505 KOps/s | 3.6189 KOps/s | -4.65% |
| test_items_nested_leaf | 0.1401ms | 79.9657μs | 12.5054 KOps/s | 12.3612 KOps/s | +1.17% |
| test_items_stack_nested | 0.4340ms | 0.2906ms | 3.4414 KOps/s | 3.5347 KOps/s | -2.64% |
| test_items_stack_nested_leaf | 0.1278ms | 79.7727μs | 12.5356 KOps/s | 12.4673 KOps/s | +0.55% |
| test_items_stack_nested_locked | 0.9620ms | 0.2862ms | 3.4938 KOps/s | 3.6324 KOps/s | -3.82% |
| test_keys | 39.0630μs | 3.8193μs | 261.8296 KOps/s | 263.5583 KOps/s | -0.66% |
| test_keys_nested | 0.2419ms | 0.1397ms | 7.1557 KOps/s | 7.1759 KOps/s | -0.28% |
| test_keys_nested_locked | 0.7762ms | 0.1442ms | 6.9363 KOps/s | 6.9062 KOps/s | +0.44% |
| test_keys_nested_leaf | 0.2004ms | 0.1186ms | 8.4286 KOps/s | 8.4676 KOps/s | -0.46% |
| test_keys_stack_nested | 0.6427ms | 0.1437ms | 6.9601 KOps/s | 7.1835 KOps/s | -3.11% |
| test_keys_stack_nested_leaf | 0.2069ms | 0.1160ms | 8.6224 KOps/s | 8.4660 KOps/s | +1.85% |
| test_keys_stack_nested_locked | 0.2582ms | 0.1415ms | 7.0677 KOps/s | 7.0006 KOps/s | +0.96% |
| test_values | 6.9570μs | 1.1635μs | 859.4545 KOps/s | 860.7141 KOps/s | -0.15% |
| test_values_nested | 0.1013ms | 53.2745μs | 18.7707 KOps/s | 19.7069 KOps/s | -4.75% |
| test_values_nested_locked | 0.3009ms | 52.1465μs | 19.1767 KOps/s | 18.9451 KOps/s | +1.22% |
| test_values_nested_leaf | 87.3430μs | 47.7826μs | 20.9281 KOps/s | 21.8136 KOps/s | -4.06% |
| test_values_stack_nested | 0.1966ms | 53.1679μs | 18.8083 KOps/s | 19.4933 KOps/s | -3.51% |
| test_values_stack_nested_leaf | 91.8910μs | 47.6639μs | 20.9802 KOps/s | 21.8074 KOps/s | -3.79% |
| test_values_stack_nested_locked | 0.1368ms | 53.6921μs | 18.6247 KOps/s | 19.2872 KOps/s | -3.43% |
| test_membership | 14.2270μs | 1.3736μs | 727.9892 KOps/s | 730.1718 KOps/s | -0.30% |
| test_membership_nested | 38.1010μs | 3.4091μs | 293.3325 KOps/s | 282.0785 KOps/s | +3.99% |
| test_membership_nested_leaf | 52.9680μs | 3.4455μs | 290.2361 KOps/s | 284.0741 KOps/s | +2.17% |
| test_membership_stacked_nested | 19.2760μs | 3.4068μs | 293.5293 KOps/s | 268.8203 KOps/s | **+9.19%** |
| test_membership_stacked_nested_leaf | 20.3380μs | 3.4332μs | 291.2693 KOps/s | 285.6686 KOps/s | +1.96% |
| test_membership_nested_last | 23.6640μs | 4.1864μs | 238.8690 KOps/s | 238.9223 KOps/s | -0.02% |
| test_membership_nested_leaf_last | 33.1920μs | 4.2033μs | 237.9065 KOps/s | 238.5266 KOps/s | -0.26% |
| test_membership_stacked_nested_last | 26.8500μs | 5.2906μs | 189.0153 KOps/s | 237.5619 KOps/s | **-20.44%** |
| test_membership_stacked_nested_leaf_last | 30.0970μs | 5.3702μs | 186.2130 KOps/s | 233.3752 KOps/s | **-20.21%** |
| test_nested_getleaf | 34.5450μs | 10.9552μs | 91.2813 KOps/s | 94.1394 KOps/s | -3.04% |
| test_nested_get | 36.6590μs | 10.4468μs | 95.7234 KOps/s | 98.5885 KOps/s | -2.91% |
| test_stacked_getleaf | 29.9650μs | 10.8043μs | 92.5556 KOps/s | 93.3781 KOps/s | -0.88% |
| test_stacked_get | 41.5580μs | 10.2327μs | 97.7255 KOps/s | 98.6858 KOps/s | -0.97% |
| test_nested_getitemleaf | 33.3530μs | 11.5929μs | 86.2594 KOps/s | 88.5812 KOps/s | -2.62% |
| test_nested_getitem | 44.2530μs | 10.9157μs | 91.6112 KOps/s | 95.7400 KOps/s | -4.31% |
| test_stacked_getitemleaf | 34.5650μs | 11.5044μs | 86.9235 KOps/s | 88.7281 KOps/s | -2.03% |
| test_stacked_getitem | 31.5390μs | 10.5677μs | 94.6281 KOps/s | 95.8396 KOps/s | -1.26% |
| test_lock_nested | 52.0958ms | 0.3868ms | 2.5853 KOps/s | 3.0003 KOps/s | **-13.83%** |
| test_lock_stack_nested | 0.5746ms | 0.2991ms | 3.3436 KOps/s | 3.3191 KOps/s | +0.74% |
| test_unlock_nested | 0.7785ms | 0.3336ms | 2.9972 KOps/s | 2.9421 KOps/s | +1.87% |
| test_unlock_stack_nested | 0.4795ms | 0.3054ms | 3.2741 KOps/s | 3.2022 KOps/s | +2.25% |
| test_flatten_speed | 0.2211ms | 0.1004ms | 9.9588 KOps/s | 9.9914 KOps/s | -0.33% |
| test_unflatten_speed | 0.7384ms | 0.4194ms | 2.3842 KOps/s | 2.4100 KOps/s | -1.07% |
| test_common_ops | 1.3975ms | 0.7216ms | 1.3859 KOps/s | 1.3313 KOps/s | +4.10% |
| test_creation | 19.4770μs | 1.9099μs | 523.5746 KOps/s | 514.9879 KOps/s | +1.67% |
| test_creation_empty | 35.1660μs | 10.8906μs | 91.8225 KOps/s | 83.3974 KOps/s | **+10.10%** |
| test_creation_nested_1 | 40.5760μs | 13.5898μs | 73.5845 KOps/s | 67.1775 KOps/s | **+9.54%** |
| test_creation_nested_2 | 71.0330μs | 17.2119μs | 58.0992 KOps/s | 54.8957 KOps/s | **+5.84%** |
| test_clone | 71.4330μs | 13.2412μs | 75.5217 KOps/s | 76.0746 KOps/s | -0.73% |
| test_getitem[int] | 35.1260μs | 11.0271μs | 90.6854 KOps/s | 89.9607 KOps/s | +0.81% |
| test_getitem[slice_int] | 70.1810μs | 22.0579μs | 45.3353 KOps/s | 44.7504 KOps/s | +1.31% |
| test_getitem[range] | 74.3690μs | 58.5854μs | 17.0691 KOps/s | 16.7243 KOps/s | +2.06% |
| test_getitem[tuple] | 54.8620μs | 18.2615μs | 54.7599 KOps/s | 54.9844 KOps/s | -0.41% |
| test_getitem[list] | 0.1024ms | 39.3588μs | 25.4073 KOps/s | 25.6704 KOps/s | -1.02% |
| test_setitem_dim[int] | 62.3560μs | 32.8905μs | 30.4039 KOps/s | 29.4046 KOps/s | +3.40% |
| test_setitem_dim[slice_int] | 99.7960μs | 59.0009μs | 16.9489 KOps/s | 16.8951 KOps/s | +0.32% |
| test_setitem_dim[range] | 0.1707ms | 82.1899μs | 12.1669 KOps/s | 11.8298 KOps/s | +2.85% |
| test_setitem_dim[tuple] | 0.1000ms | 48.5660μs | 20.5905 KOps/s | 20.1249 KOps/s | +2.31% |
| test_setitem | 57.5170μs | 19.8832μs | 50.2937 KOps/s | 48.1999 KOps/s | +4.34% |
| test_set | 71.8240μs | 19.7216μs | 50.7059 KOps/s | 49.5931 KOps/s | +2.24% |
| test_set_shared | 3.4628ms | 0.1429ms | 6.9966 KOps/s | 6.7196 KOps/s | +4.12% |
| test_update | 0.1400ms | 22.4098μs | 44.6233 KOps/s | 42.4553 KOps/s | **+5.11%** |
| test_update_nested | 0.1338ms | 30.7450μs | 32.5256 KOps/s | 31.2497 KOps/s | +4.08% |
| test_update__nested | 0.2606ms | 26.4711μs | 37.7770 KOps/s | 40.1777 KOps/s | **-5.98%** |
| test_set_nested | 69.2490μs | 21.2201μs | 47.1251 KOps/s | 44.7199 KOps/s | **+5.38%** |
| test_set_nested_new | 62.1560μs | 25.5187μs | 39.1869 KOps/s | 37.3989 KOps/s | +4.78% |
| test_select | 0.1062ms | 40.3864μs | 24.7608 KOps/s | 23.6177 KOps/s | +4.84% |
| test_select_nested | 0.1129ms | 57.7906μs | 17.3038 KOps/s | 16.8453 KOps/s | +2.72% |
| test_exclude_nested | 0.2177ms | 0.1196ms | 8.3594 KOps/s | 8.3873 KOps/s | -0.33% |
| test_empty[True] | 0.6056ms | 0.3960ms | 2.5255 KOps/s | 2.4829 KOps/s | +1.72% |
| test_empty[False] | 26.8220μs | 1.0414μs | 960.2482 KOps/s | 896.2196 KOps/s | **+7.14%** |
| test_unbind_speed | 0.4550ms | 0.2437ms | 4.1029 KOps/s | 4.0438 KOps/s | +1.46% |
| test_unbind_speed_stack0 | 0.4989ms | 0.2416ms | 4.1387 KOps/s | 4.0098 KOps/s | +3.22% |
| test_unbind_speed_stack1 | 68.9810ms | 0.6990ms | 1.4306 KOps/s | 1.4069 KOps/s | +1.69% |
| test_split | 69.7036ms | 1.5842ms | 631.2516 Ops/s | 613.3085 Ops/s | +2.93% |
| test_chunk | 74.8310ms | 1.5958ms | 626.6275 Ops/s | 633.7496 Ops/s | -1.12% |
| test_creation[device0] | 0.2106ms | 84.3371μs | 11.8572 KOps/s | 11.6566 KOps/s | +1.72% |
| test_creation_from_tensor | 4.3601ms | 84.8855μs | 11.7806 KOps/s | 11.5103 KOps/s | +2.35% |
| test_add_one[memmap_tensor0] | 90.1590μs | 5.3330μs | 187.5119 KOps/s | 184.1848 KOps/s | +1.81% |
| test_contiguous[memmap_tensor0] | 10.5500μs | 0.6257μs | 1.5982 MOps/s | 1.5314 MOps/s | +4.36% |
| test_stack[memmap_tensor0] | 23.4440μs | 3.5978μs | 277.9483 KOps/s | 276.7384 KOps/s | +0.44% |
| test_memmaptd_index | 1.0193ms | 0.2595ms | 3.8530 KOps/s | 3.9011 KOps/s | -1.23% |
| test_memmaptd_index_astensor | 0.7272ms | 0.3323ms | 3.0095 KOps/s | 3.0511 KOps/s | -1.36% |
| test_memmaptd_index_op | 1.9891ms | 0.6541ms | 1.5289 KOps/s | 1.5705 KOps/s | -2.65% |
| test_serialize_model | 0.1638s | 0.1055s | 9.4784 Ops/s | 9.2376 Ops/s | +2.61% |
| test_serialize_model_pickle | 0.4631s | 0.3799s | 2.6322 Ops/s | 2.6253 Ops/s | +0.26% |
| test_serialize_weights | 0.1689s | 0.1030s | 9.7117 Ops/s | 9.5478 Ops/s | +1.72% |
| test_serialize_weights_returnearly | 0.1286s | 0.1194s | 8.3770 Ops/s | 8.0524 Ops/s | +4.03% |
| test_serialize_weights_pickle | 0.9912s | 0.5807s | 1.7221 Ops/s | 2.4465 Ops/s | **-29.61%** |
| test_serialize_weights_filesystem | 0.1601s | 96.4777ms | 10.3651 Ops/s | 9.7656 Ops/s | **+6.14%** |
| test_serialize_model_filesystem | 0.1022s | 93.2461ms | 10.7243 Ops/s | 10.2944 Ops/s | +4.18% |
| test_reshape_pytree | 66.3040μs | 25.5080μs | 39.2033 KOps/s | 39.1330 KOps/s | +0.18% |
| test_reshape_td | 91.1710μs | 34.6162μs | 28.8882 KOps/s | 28.2958 KOps/s | +2.09% |
| test_view_pytree | 83.1350μs | 25.8728μs | 38.6506 KOps/s | 39.5206 KOps/s | -2.20% |
| test_view_td | 76.1120μs | 39.5236μs | 25.3014 KOps/s | 24.9301 KOps/s | +1.49% |
| test_unbind_pytree | 61.3950μs | 29.5998μs | 33.7840 KOps/s | 34.1545 KOps/s | -1.08% |
| test_unbind_td | 0.3614ms | 37.1584μs | 26.9118 KOps/s | 26.9653 KOps/s | -0.20% |
| test_split_pytree | 63.4590μs | 30.0277μs | 33.3026 KOps/s | 34.2618 KOps/s | -2.80% |
| test_split_td | 0.1215ms | 39.8362μs | 25.1028 KOps/s | 24.7639 KOps/s | +1.37% |
| test_add_pytree | 98.6140μs | 35.1076μs | 28.4839 KOps/s | 28.4820 KOps/s | +0.01% |
| test_add_td | 0.1843ms | 53.8412μs | 18.5732 KOps/s | 17.1532 KOps/s | **+8.28%** |
| test_distributed | 0.2537ms | 0.1002ms | 9.9756 KOps/s | 9.5142 KOps/s | +4.85% |
| test_tdmodule | 71.4430μs | 18.3485μs | 54.5004 KOps/s | 53.5865 KOps/s | +1.71% |
| test_tdmodule_dispatch | 58.1780μs | 36.1451μs | 27.6663 KOps/s | 26.6742 KOps/s | +3.72% |
| test_tdseq | 44.2530μs | 21.2688μs | 47.0171 KOps/s | 45.1011 KOps/s | +4.25% |
| test_tdseq_dispatch | 77.7850μs | 41.1176μs | 24.3205 KOps/s | 23.6373 KOps/s | +2.89% |
| test_instantiation_functorch | 2.3377ms | 1.3526ms | 739.2942 Ops/s | 738.5745 Ops/s | +0.10% |
| test_instantiation_td | 66.9990ms | 1.1000ms | 909.0920 Ops/s | 960.2242 Ops/s | **-5.33%** |
| test_exec_functorch | 0.2344ms | 0.1633ms | 6.1255 KOps/s | 6.0552 KOps/s | +1.16% |
| test_exec_functional_call | 0.2207ms | 0.1496ms | 6.6833 KOps/s | 6.6826 KOps/s | +0.01% |
| test_exec_td | 0.2249ms | 0.1482ms | 6.7487 KOps/s | 6.9591 KOps/s | -3.02% |
| test_exec_td_decorator | 0.9084ms | 0.2227ms | 4.4896 KOps/s | 4.4988 KOps/s | -0.20% |
| test_vmap_mlp_speed[True-True] | 0.6921ms | 0.4901ms | 2.0405 KOps/s | 2.0459 KOps/s | -0.26% |
| test_vmap_mlp_speed[True-False] | 0.8897ms | 0.4891ms | 2.0447 KOps/s | 2.0507 KOps/s | -0.29% |
| test_vmap_mlp_speed[False-True] | 0.6170ms | 0.3972ms | 2.5175 KOps/s | 2.5332 KOps/s | -0.62% |
| test_vmap_mlp_speed[False-False] | 0.6951ms | 0.3980ms | 2.5125 KOps/s | 2.5271 KOps/s | -0.58% |
| test_vmap_mlp_speed_decorator[True-True] | 1.0930ms | 0.5624ms | 1.7781 KOps/s | 1.7736 KOps/s | +0.26% |
| test_vmap_mlp_speed_decorator[True-False] | 0.7645ms | 0.5576ms | 1.7935 KOps/s | 1.7806 KOps/s | +0.72% |
| test_vmap_mlp_speed_decorator[False-True] | 0.7533ms | 0.4619ms | 2.1651 KOps/s | 2.1744 KOps/s | -0.43% |
| test_vmap_mlp_speed_decorator[False-False] | 0.8670ms | 0.4626ms | 2.1617 KOps/s | 2.1655 KOps/s | -0.18% |
| test_to_module_speed[True] | 2.5822ms | 1.7208ms | 581.1112 Ops/s | 587.1179 Ops/s | -1.02% |
| test_to_module_speed[False] | 2.6279ms | 1.7060ms | 586.1726 Ops/s | 600.3701 Ops/s | -2.36% |
| test_tc_init | 0.1287ms | 60.0342μs | 16.6572 KOps/s | 16.6397 KOps/s | +0.11% |
| test_tc_init_nested | 0.2336ms | 0.1199ms | 8.3380 KOps/s | 8.8342 KOps/s | **-5.62%** |
| test_tc_first_layer_tensor | 24.7260μs | 8.4054μs | 118.9715 KOps/s | 120.5181 KOps/s | -1.28% |
| test_tc_first_layer_nontensor | 31.0980μs | 8.4087μs | 118.9249 KOps/s | 121.1239 KOps/s | -1.82% |
| test_tc_second_layer_tensor | 23.0930μs | 2.5496μs | 392.2120 KOps/s | 395.3542 KOps/s | -0.79% |
| test_tc_second_layer_nontensor | 30.0060μs | 9.3936μs | 106.4549 KOps/s | 108.0046 KOps/s | -1.43% |
| test_unbind | 81.3385ms | 13.7881ms | 72.5266 Ops/s | 67.6035 Ops/s | **+7.28%** |
| test_full_like | 8.3844ms | 7.1992ms | 138.9050 Ops/s | 93.1946 Ops/s | **+49.05%** |
| test_zeros_like | 13.8219ms | 5.9243ms | 168.7954 Ops/s | 164.1255 Ops/s | +2.85% |
| test_ones_like | 16.0694ms | 6.4281ms | 155.5674 Ops/s | 157.9737 Ops/s | -1.52% |
| test_clone | 14.2716ms | 8.0666ms | 123.9686 Ops/s | 122.5354 Ops/s | +1.17% |
| test_squeeze | 0.2761ms | 13.8450μs | 72.2280 KOps/s | 77.9391 KOps/s | **-7.33%** |
| test_unsqueeze | 0.2504ms | 95.5094μs | 10.4702 KOps/s | 10.1880 KOps/s | +2.77% |
| test_split | 0.5166ms | 0.2734ms | 3.6571 KOps/s | 3.6217 KOps/s | +0.98% |
| test_permute | 0.3702ms | 0.2242ms | 4.4603 KOps/s | 4.4240 KOps/s | +0.82% |
| test_stack | 26.3968ms | 22.5324ms | 44.3805 Ops/s | 41.5287 Ops/s | **+6.87%** |
| test_cat | 29.2164ms | 22.3303ms | 44.7821 Ops/s | 42.9115 Ops/s | +4.36% |
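The Change column above appears to be the relative difference between the PR's throughput (Ops) and the throughput on `HEAD`, and the bolded rows match the Improved/Worsened counts in the summary. The sketch below reproduces that arithmetic for three rows taken from the table. The function names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bolded, not something the bot states.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops on the PR, Ops on HEAD), both in the same unit, copied from the CPU table.
rows = {
    "test_plain_set_nested": (58.6692, 58.1698),
    "test_plain_set_nested_inplace": (51.3296, 47.4723),
    "test_serialize_weights_pickle": (1.7221, 2.4465),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both Ops values must be in the same unit before dividing; a row that mixes KOps/s and Ops/s would need converting first.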
github-actions[bot] commented 4 months ago
⚠ Warning: Result of GPU Benchmark Tests
Total Benchmarks: 152. Improved: 22. Worsened: 7.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 63.9910μs | 12.5910μs | 79.4221 KOps/s | 82.4656 KOps/s | -3.69% |
| test_plain_set_stack_nested | 26.1810μs | 12.6750μs | 78.8952 KOps/s | 81.6596 KOps/s | -3.39% |
| test_plain_set_nested_inplace | 37.7300μs | 14.0143μs | 71.3556 KOps/s | 74.1271 KOps/s | -3.74% |
| test_plain_set_stack_nested_inplace | 47.1910μs | 13.9185μs | 71.8468 KOps/s | 74.1853 KOps/s | -3.15% |
| test_items | 19.5400μs | 4.6530μs | 214.9154 KOps/s | 215.9325 KOps/s | -0.47% |
| test_items_nested | 0.3940ms | 0.3451ms | 2.8978 KOps/s | 2.9450 KOps/s | -1.60% |
| test_items_nested_locked | 0.4092ms | 0.3525ms | 2.8368 KOps/s | 2.9449 KOps/s | -3.67% |
| test_items_nested_leaf | 0.1025ms | 82.7080μs | 12.0907 KOps/s | 12.1252 KOps/s | -0.28% |
| test_items_stack_nested | 0.4099ms | 0.3449ms | 2.8995 KOps/s | 2.9241 KOps/s | -0.84% |
| test_items_stack_nested_leaf | 0.1048ms | 83.8444μs | 11.9269 KOps/s | 12.0234 KOps/s | -0.80% |
| test_items_stack_nested_locked | 0.3917ms | 0.3489ms | 2.8664 KOps/s | 2.8895 KOps/s | -0.80% |
| test_keys | 30.2010μs | 4.3455μs | 230.1227 KOps/s | 230.7210 KOps/s | -0.26% |
| test_keys_nested | 0.1015ms | 68.7072μs | 14.5545 KOps/s | 14.9311 KOps/s | -2.52% |
| test_keys_nested_locked | 2.3367ms | 75.0424μs | 13.3258 KOps/s | 13.2566 KOps/s | +0.52% |
| test_keys_nested_leaf | 81.8620μs | 57.5921μs | 17.3635 KOps/s | 17.2624 KOps/s | +0.59% |
| test_keys_stack_nested | 93.5820μs | 66.6426μs | 15.0054 KOps/s | 14.5094 KOps/s | +3.42% |
| test_keys_stack_nested_leaf | 86.1520μs | 59.1943μs | 16.8935 KOps/s | 17.3240 KOps/s | -2.49% |
| test_keys_stack_nested_locked | 0.1047ms | 73.9630μs | 13.5203 KOps/s | 13.3741 KOps/s | +1.09% |
| test_values | 7.9567μs | 1.7974μs | 556.3583 KOps/s | 550.8175 KOps/s | +1.01% |
| test_values_nested | 59.6910μs | 35.1817μs | 28.4239 KOps/s | 28.5698 KOps/s | -0.51% |
| test_values_nested_locked | 65.6520μs | 37.2050μs | 26.8781 KOps/s | 27.3173 KOps/s | -1.61% |
| test_values_nested_leaf | 50.6810μs | 31.0936μs | 32.1610 KOps/s | 32.1442 KOps/s | +0.05% |
| test_values_stack_nested | 52.8810μs | 35.2901μs | 28.3366 KOps/s | 27.9001 KOps/s | +1.56% |
| test_values_stack_nested_leaf | 58.8520μs | 31.2213μs | 32.0295 KOps/s | 31.0942 KOps/s | +3.01% |
| test_values_stack_nested_locked | 59.2210μs | 36.8471μs | 27.1392 KOps/s | 26.9305 KOps/s | +0.78% |
| test_membership | 1.6830μs | 0.7077μs | 1.4130 MOps/s | 1.4404 MOps/s | -1.90% |
| test_membership_nested | 17.7700μs | 2.5401μs | 393.6844 KOps/s | 396.0719 KOps/s | -0.60% |
| test_membership_nested_leaf | 28.6500μs | 2.5313μs | 395.0545 KOps/s | 391.8683 KOps/s | +0.81% |
| test_membership_stacked_nested | 25.8500μs | 2.5447μs | 392.9786 KOps/s | 395.6302 KOps/s | -0.67% |
| test_membership_stacked_nested_leaf | 33.7110μs | 2.5339μs | 394.6463 KOps/s | 396.4680 KOps/s | -0.46% |
| test_membership_nested_last | 18.3110μs | 3.0396μs | 328.9891 KOps/s | 329.8977 KOps/s | -0.28% |
| test_membership_nested_leaf_last | 32.0110μs | 3.0422μs | 328.7124 KOps/s | 327.8787 KOps/s | +0.25% |
| test_membership_stacked_nested_last | 27.2510μs | 3.0817μs | 324.5015 KOps/s | 264.3870 KOps/s | **+22.74%** |
| test_membership_stacked_nested_leaf_last | 21.9710μs | 3.0295μs | 330.0843 KOps/s | 264.1538 KOps/s | **+24.96%** |
| test_nested_getleaf | 37.7610μs | 8.3291μs | 120.0607 KOps/s | 119.9511 KOps/s | +0.09% |
| test_nested_get | 31.6110μs | 7.7984μs | 128.2314 KOps/s | 128.2762 KOps/s | -0.03% |
| test_stacked_getleaf | 25.8410μs | 8.3154μs | 120.2581 KOps/s | 119.2676 KOps/s | +0.83% |
| test_stacked_get | 40.4110μs | 7.8007μs | 128.1936 KOps/s | 128.2982 KOps/s | -0.08% |
| test_nested_getitemleaf | 24.8910μs | 8.4888μs | 117.8020 KOps/s | 117.8398 KOps/s | -0.03% |
| test_nested_getitem | 35.0010μs | 7.9831μs | 125.2640 KOps/s | 122.8762 KOps/s | +1.94% |
| test_stacked_getitemleaf | 35.1610μs | 8.4482μs | 118.3681 KOps/s | 116.9447 KOps/s | +1.22% |
| test_stacked_getitem | 88.2120μs | 8.0451μs | 124.2999 KOps/s | 125.1604 KOps/s | -0.69% |
| test_lock_nested | 59.1127ms | 0.3970ms | 2.5187 KOps/s | 2.4350 KOps/s | +3.44% |
| test_lock_stack_nested | 0.3392ms | 0.2918ms | 3.4268 KOps/s | 3.2645 KOps/s | +4.97% |
| test_unlock_nested | 61.2117ms | 0.3983ms | 2.5108 KOps/s | 2.4473 KOps/s | +2.59% |
| test_unlock_stack_nested | 0.3541ms | 0.3007ms | 3.3257 KOps/s | 3.1983 KOps/s | +3.98% |
| test_flatten_speed | 0.4185ms | 0.1011ms | 9.8896 KOps/s | 9.8063 KOps/s | +0.85% |
| test_unflatten_speed | 0.3286ms | 0.2884ms | 3.4680 KOps/s | 3.4614 KOps/s | +0.19% |
| test_common_ops | 1.0769ms | 0.5819ms | 1.7184 KOps/s | 1.7053 KOps/s | +0.77% |
| test_creation | 37.8400μs | 1.6033μs | 623.7050 KOps/s | 622.8989 KOps/s | +0.13% |
| test_creation_empty | 26.6510μs | 8.0037μs | 124.9428 KOps/s | 135.4387 KOps/s | **-7.75%** |
| test_creation_nested_1 | 24.3510μs | 9.6641μs | 103.4755 KOps/s | 110.0819 KOps/s | **-6.00%** |
| test_creation_nested_2 | 36.7310μs | 11.9565μs | 83.6364 KOps/s | 87.2567 KOps/s | -4.15% |
| test_clone | 85.8820μs | 11.7454μs | 85.1396 KOps/s | 82.9679 KOps/s | +2.62% |
| test_getitem[int] | 24.9500μs | 10.4967μs | 95.2680 KOps/s | 90.7962 KOps/s | +4.93% |
| test_getitem[slice_int] | 40.0910μs | 20.2660μs | 49.3436 KOps/s | 46.0215 KOps/s | **+7.22%** |
| test_getitem[range] | 65.9110μs | 51.5730μs | 19.3900 KOps/s | 19.8820 KOps/s | -2.47% |
| test_getitem[tuple] | 42.1110μs | 18.4509μs | 54.1979 KOps/s | 51.3237 KOps/s | **+5.60%** |
| test_getitem[list] | 0.1408ms | 33.6312μs | 29.7343 KOps/s | 27.9918 KOps/s | **+6.23%** |
| test_setitem_dim[int] | 42.9310μs | 25.0589μs | 39.9059 KOps/s | 36.7055 KOps/s | **+8.72%** |
| test_setitem_dim[slice_int] | 83.0010μs | 49.4597μs | 20.2185 KOps/s | 20.2906 KOps/s | -0.36% |
| test_setitem_dim[range] | 0.1085ms | 67.2715μs | 14.8651 KOps/s | 14.8827 KOps/s | -0.12% |
| test_setitem_dim[tuple] | 68.8320μs | 42.9492μs | 23.2833 KOps/s | 23.9208 KOps/s | -2.66% |
| test_setitem | 41.7010μs | 16.1810μs | 61.8009 KOps/s | 60.9318 KOps/s | +1.43% |
| test_set | 54.6310μs | 15.3862μs | 64.9933 KOps/s | 63.9561 KOps/s | +1.62% |
| test_set_shared | 1.6114ms | 98.3614μs | 10.1666 KOps/s | 9.9638 KOps/s | +2.04% |
| test_update | 88.0020μs | 18.4172μs | 54.2970 KOps/s | 55.9382 KOps/s | -2.93% |
| test_update_nested | 73.7620μs | 23.9920μs | 41.6806 KOps/s | 43.8305 KOps/s | -4.91% |
| test_update__nested | 48.7510μs | 22.5376μs | 44.3703 KOps/s | 44.1766 KOps/s | +0.44% |
| test_set_nested | 58.7420μs | 16.6052μs | 60.2221 KOps/s | 59.7460 KOps/s | +0.80% |
| test_set_nested_new | 55.0610μs | 19.2031μs | 52.0750 KOps/s | 51.8730 KOps/s | +0.39% |
| test_select | 75.7120μs | 32.7677μs | 30.5179 KOps/s | 32.2559 KOps/s | **-5.39%** |
| test_select_nested | 94.3520μs | 51.3518μs | 19.4735 KOps/s | 19.1715 KOps/s | +1.58% |
| test_exclude_nested | 0.1400ms | 0.1063ms | 9.4047 KOps/s | 9.4639 KOps/s | -0.63% |
| test_empty[True] | 0.4032ms | 0.3414ms | 2.9292 KOps/s | 2.9336 KOps/s | -0.15% |
| test_empty[False] | 2.8071μs | 0.8034μs | 1.2447 MOps/s | 1.2281 MOps/s | +1.36% |
| test_to | 87.0220μs | 59.5436μs | 16.7944 KOps/s | 15.6418 KOps/s | **+7.37%** |
| test_to_nonblocking | 54.5310μs | 35.9932μs | 27.7830 KOps/s | 26.8919 KOps/s | +3.31% |
| test_unbind_speed | 1.5628ms | 0.2577ms | 3.8803 KOps/s | 3.7134 KOps/s | +4.50% |
| test_unbind_speed_stack0 | 0.3006ms | 0.2572ms | 3.8883 KOps/s | 3.7401 KOps/s | +3.96% |
| test_unbind_speed_stack1 | 75.9604ms | 0.7757ms | 1.2892 KOps/s | 1.2673 KOps/s | +1.73% |
| test_split | 76.2275ms | 1.6752ms | 596.9602 Ops/s | 557.5711 Ops/s | **+7.06%** |
| test_chunk | 76.4499ms | 1.6754ms | 596.8772 Ops/s | 560.8092 Ops/s | **+6.43%** |
| test_creation[device0] | 0.1512ms | 57.6869μs | 17.3350 KOps/s | 17.1260 KOps/s | +1.22% |
| test_creation_from_tensor | 0.1402ms | 54.3924μs | 18.3849 KOps/s | 18.4538 KOps/s | -0.37% |
| test_add_one[memmap_tensor0] | 88.0620μs | 7.1799μs | 139.2784 KOps/s | 131.8245 KOps/s | **+5.65%** |
| test_contiguous[memmap_tensor0] | 9.8000μs | 0.6698μs | 1.4929 MOps/s | 1.4958 MOps/s | -0.19% |
| test_stack[memmap_tensor0] | 37.9210μs | 4.8591μs | 205.8006 KOps/s | 186.7391 KOps/s | **+10.21%** |
| test_memmaptd_index | 1.0702ms | 0.2742ms | 3.6470 KOps/s | 3.4516 KOps/s | **+5.66%** |
| test_memmaptd_index_astensor | 0.5895ms | 0.3346ms | 2.9885 KOps/s | 2.8729 KOps/s | +4.02% |
| test_memmaptd_index_op | 0.9415ms | 0.6384ms | 1.5665 KOps/s | 1.5346 KOps/s | +2.08% |
| test_serialize_model | 91.7647ms | 89.6288ms | 11.1571 Ops/s | 10.3041 Ops/s | **+8.28%** |
| test_serialize_model_pickle | 1.3482s | 1.2352s | 0.8096 Ops/s | 0.8088 Ops/s | +0.10% |
| test_serialize_weights | 92.8752ms | 88.7525ms | 11.2673 Ops/s | 9.6683 Ops/s | **+16.54%** |
| test_serialize_weights_returnearly | 0.2577s | 77.5536ms | 12.8943 Ops/s | 13.5190 Ops/s | -4.62% |
| test_serialize_weights_pickle | 1.3492s | 1.2362s | 0.8089 Ops/s | 0.8032 Ops/s | +0.71% |
| test_reshape_pytree | 87.2320μs | 25.8396μs | 38.7003 KOps/s | 38.1500 KOps/s | +1.44% |
| test_reshape_td | 70.0420μs | 31.6892μs | 31.5564 KOps/s | 31.8002 KOps/s | -0.77% |
| test_view_pytree | 48.5620μs | 25.9540μs | 38.5297 KOps/s | 38.0707 KOps/s | +1.21% |
| test_view_td | 0.1019ms | 35.9725μs | 27.7990 KOps/s | 27.8874 KOps/s | -0.32% |
| test_unbind_pytree | 53.7210μs | 32.0993μs | 31.1533 KOps/s | 29.1284 KOps/s | **+6.95%** |
| test_unbind_td | 0.4150ms | 39.0687μs | 25.5960 KOps/s | 23.5477 KOps/s | **+8.70%** |
| test_split_pytree | 69.4120μs | 35.7561μs | 27.9672 KOps/s | 27.5663 KOps/s | +1.45% |
| test_split_td | 0.1072ms | 39.2368μs | 25.4862 KOps/s | 24.2427 KOps/s | **+5.13%** |
| test_add_pytree | 63.5820μs | 37.8563μs | 26.4157 KOps/s | 25.1617 KOps/s | +4.98% |
| test_add_td | 0.2103ms | 50.0275μs | 19.9890 KOps/s | 19.8271 KOps/s | +0.82% |
| test_distributed | 0.2363ms | 72.4840μs | 13.7962 KOps/s | 14.4697 KOps/s | -4.65% |
| test_tdmodule | 0.1140ms | 15.7156μs | 63.6310 KOps/s | 69.2747 KOps/s | **-8.15%** |
| test_tdmodule_dispatch | 0.1694ms | 30.4384μs | 32.8532 KOps/s | 36.1721 KOps/s | **-9.18%** |
| test_tdseq | 37.5710μs | 16.6290μs | 60.1359 KOps/s | 63.2547 KOps/s | -4.93% |
| test_tdseq_dispatch | 47.6010μs | 31.5890μs | 31.6566 KOps/s | 32.4594 KOps/s | -2.47% |
| test_instantiation_functorch | 1.4993ms | 1.4059ms | 711.2989 Ops/s | 692.9957 Ops/s | +2.64% |
| test_instantiation_td | 1.4740ms | 0.9779ms | 1.0226 KOps/s | 913.7464 Ops/s | **+11.91%** |
| test_exec_functorch | 0.2112ms | 0.1467ms | 6.8184 KOps/s | 6.5408 KOps/s | +4.24% |
| test_exec_functional_call | 0.1913ms | 0.1370ms | 7.2983 KOps/s | 6.8573 KOps/s | **+6.43%** |
| test_exec_td | 0.1703ms | 0.1363ms | 7.3384 KOps/s | 6.8525 KOps/s | **+7.09%** |
| test_exec_td_decorator | 0.7009ms | 0.2081ms | 4.8055 KOps/s | 4.7070 KOps/s | +2.09% |
| test_vmap_mlp_speed[True-True] | 0.6471ms | 0.5795ms | 1.7258 KOps/s | 1.7174 KOps/s | +0.49% |
| test_vmap_mlp_speed[True-False] | 0.6618ms | 0.5788ms | 1.7276 KOps/s | 1.6606 KOps/s | +4.03% |
| test_vmap_mlp_speed[False-True] | 0.5809ms | 0.5092ms | 1.9637 KOps/s | 1.9305 KOps/s | +1.72% |
| test_vmap_mlp_speed[False-False] | 0.5717ms | 0.5108ms | 1.9579 KOps/s | 1.9157 KOps/s | +2.20% |
| test_vmap_mlp_speed_decorator[True-True] | 0.7964ms | 0.6386ms | 1.5659 KOps/s | 1.5503 KOps/s | +1.00% |
| test_vmap_mlp_speed_decorator[True-False] | 0.8239ms | 0.6398ms | 1.5631 KOps/s | 1.5470 KOps/s | +1.04% |
| test_vmap_mlp_speed_decorator[False-True] | 0.8093ms | 0.5902ms | 1.6944 KOps/s | 1.7490 KOps/s | -3.12% |
| test_vmap_mlp_speed_decorator[False-False] | 0.8403ms | 0.5810ms | 1.7213 KOps/s | 1.7562 KOps/s | -1.99% |
| test_vmap_transformer_speed[True-True] | 8.1964ms | 7.8179ms | 127.9113 Ops/s | 127.6381 Ops/s | +0.21% |
| test_vmap_transformer_speed[True-False] | 8.7152ms | 7.7935ms | 128.3117 Ops/s | 126.8612 Ops/s | +1.14% |
| test_vmap_transformer_speed[False-True] | 8.1202ms | 7.7260ms | 129.4327 Ops/s | 128.7429 Ops/s | +0.54% |
| test_vmap_transformer_speed[False-False] | 8.9928ms | 7.7324ms | 129.3268 Ops/s | 128.9040 Ops/s | +0.33% |
| test_vmap_transformer_speed_decorator[True-True] | 19.3120ms | 18.9401ms | 52.7980 Ops/s | 53.1432 Ops/s | -0.65% |
| test_vmap_transformer_speed_decorator[True-False] | 19.4524ms | 18.9806ms | 52.6854 Ops/s | 52.8034 Ops/s | -0.22% |
| test_vmap_transformer_speed_decorator[False-True] | 19.3481ms | 18.8775ms | 52.9732 Ops/s | 53.2843 Ops/s | -0.58% |
| test_vmap_transformer_speed_decorator[False-False] | 19.2638ms | 18.8418ms | 53.0734 Ops/s | 53.5599 Ops/s | -0.91% |
| test_to_module_speed[True] | 2.7702ms | 1.5176ms | 658.9489 Ops/s | 673.4655 Ops/s | -2.16% |
| test_to_module_speed[False] | 2.0207ms | 1.4998ms | 666.7486 Ops/s | 676.9939 Ops/s | -1.51% |
| test_tc_init | 0.1838ms | 54.0653μs | 18.4962 KOps/s | 21.2928 KOps/s | **-13.13%** |
| test_tc_init_nested | 0.2637ms | 0.1064ms | 9.4004 KOps/s | 10.2279 KOps/s | **-8.09%** |
| test_tc_first_layer_tensor | 0.1159ms | 3.7535μs | 266.4171 KOps/s | 266.4222 KOps/s | -0.00% |
| test_tc_first_layer_nontensor | 0.1165ms | 3.7644μs | 265.6486 KOps/s | 267.2334 KOps/s | -0.59% |
| test_tc_second_layer_tensor | 0.1172ms | 1.2693μs | 787.8125 KOps/s | 791.8908 KOps/s | -0.52% |
| test_tc_second_layer_nontensor | 54.1510μs | 4.2415μs | 235.7665 KOps/s | 234.6518 KOps/s | +0.48% |
| test_unbind | 0.1141s | 13.2507ms | 75.4679 Ops/s | 65.5827 Ops/s | **+15.07%** |
| test_full_like | 9.7365ms | 9.3034ms | 107.4881 Ops/s | 73.6714 Ops/s | **+45.90%** |
| test_zeros_like | 8.5437ms | 7.9824ms | 125.2763 Ops/s | 125.6859 Ops/s | -0.33% |
| test_ones_like | 8.4956ms | 8.0468ms | 124.2724 Ops/s | 123.4873 Ops/s | +0.64% |
| test_clone | 9.7952ms | 9.4761ms | 105.5287 Ops/s | 105.6828 Ops/s | -0.15% |
| test_squeeze | 80.7910μs | 10.7706μs | 92.8456 KOps/s | 94.9560 KOps/s | -2.22% |
| test_unsqueeze | 0.2220ms | 87.9864μs | 11.3654 KOps/s | 11.2724 KOps/s | +0.83% |
| test_split | 3.4376ms | 3.1278ms | 319.7161 Ops/s | 320.3583 Ops/s | -0.20% |
| test_permute | 0.2960ms | 0.2034ms | 4.9155 KOps/s | 4.8896 KOps/s | +0.53% |
| test_stack | 27.8858ms | 26.9838ms | 37.0593 Ops/s | 36.6527 Ops/s | +1.11% |
| test_cat | 26.7883ms | 26.6322ms | 37.5485 Ops/s | 37.1157 Ops/s | +1.17% |
$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests
Total Benchmarks: 144. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}8$.
| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| ------------------------------------------ | --------- | --------- | --------------- | ------------------ | ----------------------------------- |
| test_plain_set_nested | 38.9720μs | 17.0447μs | 58.6692 KOps/s | 58.1698 KOps/s | $\color{#35bf28}+0.86\\%$ |
| test_plain_set_stack_nested | 41.1170μs | 17.4423μs | 57.3320 KOps/s | 57.3406 KOps/s | $\color{#d91a1a}-0.01\\%$ |
| test_plain_set_nested_inplace | 54.5520μs | 19.4819μs | 51.3296 KOps/s | 47.4723 KOps/s | $\textbf{\color{#35bf28}+8.13\\%}$ |
| test_plain_set_stack_nested_inplace | 51.2560μs | 19.6777μs | 50.8191 KOps/s | 50.9982 KOps/s | $\color{#d91a1a}-0.35\\%$ |
| test_items | 25.4180μs | 2.5610μs | 390.4676 KOps/s | 383.4313 KOps/s | $\color{#35bf28}+1.84\\%$ |
| test_items_nested | 0.7706ms | 0.2909ms | 3.4380 KOps/s | 3.6096 KOps/s | $\color{#d91a1a}-4.75\\%$ |
| test_items_nested_locked | 0.7973ms | 0.2898ms | 3.4505 KOps/s | 3.6189 KOps/s | $\color{#d91a1a}-4.65\\%$ |
| test_items_nested_leaf | 0.1401ms | 79.9657μs | 12.5054 KOps/s | 12.3612 KOps/s | $\color{#35bf28}+1.17\\%$ |
| test_items_stack_nested | 0.4340ms | 0.2906ms | 3.4414 KOps/s | 3.5347 KOps/s | $\color{#d91a1a}-2.64\\%$ |
| test_items_stack_nested_leaf | 0.1278ms | 79.7727μs | 12.5356 KOps/s | 12.4673 KOps/s | $\color{#35bf28}+0.55\\%$ |
| test_items_stack_nested_locked | 0.9620ms | 0.2862ms | 3.4938 KOps/s | 3.6324 KOps/s | $\color{#d91a1a}-3.82\\%$ |
| test_keys | 39.0630μs | 3.8193μs | 261.8296 KOps/s | 263.5583 KOps/s | $\color{#d91a1a}-0.66\\%$ |
| test_keys_nested | 0.2419ms | 0.1397ms | 7.1557 KOps/s | 7.1759 KOps/s | $\color{#d91a1a}-0.28\\%$ |
| test_keys_nested_locked | 0.7762ms | 0.1442ms | 6.9363 KOps/s | 6.9062 KOps/s | $\color{#35bf28}+0.44\\%$ |
| test_keys_nested_leaf | 0.2004ms | 0.1186ms | 8.4286 KOps/s | 8.4676 KOps/s | $\color{#d91a1a}-0.46\\%$ |
| test_keys_stack_nested | 0.6427ms | 0.1437ms | 6.9601 KOps/s | 7.1835 KOps/s | $\color{#d91a1a}-3.11\\%$ |
| test_keys_stack_nested_leaf | 0.2069ms | 0.1160ms | 8.6224 KOps/s | 8.4660 KOps/s | $\color{#35bf28}+1.85\\%$ |
| test_keys_stack_nested_locked | 0.2582ms | 0.1415ms | 7.0677 KOps/s | 7.0006 KOps/s | $\color{#35bf28}+0.96\\%$ |
| test_values | 6.9570μs | 1.1635μs | 859.4545 KOps/s | 860.7141 KOps/s | $\color{#d91a1a}-0.15\\%$ |
| test_values_nested | 0.1013ms | 53.2745μs | 18.7707 KOps/s | 19.7069 KOps/s | $\color{#d91a1a}-4.75\\%$ |
| test_values_nested_locked | 0.3009ms | 52.1465μs | 19.1767 KOps/s | 18.9451 KOps/s | $\color{#35bf28}+1.22\\%$ |
| test_values_nested_leaf | 87.3430μs | 47.7826μs | 20.9281 KOps/s | 21.8136 KOps/s | $\color{#d91a1a}-4.06\\%$ |
| test_values_stack_nested | 0.1966ms | 53.1679μs | 18.8083 KOps/s | 19.4933 KOps/s | $\color{#d91a1a}-3.51\\%$ |
| test_values_stack_nested_leaf | 91.8910μs | 47.6639μs | 20.9802 KOps/s | 21.8074 KOps/s | $\color{#d91a1a}-3.79\\%$ |
| test_values_stack_nested_locked | 0.1368ms | 53.6921μs | 18.6247 KOps/s | 19.2872 KOps/s | $\color{#d91a1a}-3.43\\%$ |
| test_membership | 14.2270μs | 1.3736μs | 727.9892 KOps/s | 730.1718 KOps/s | $\color{#d91a1a}-0.30\\%$ |
| test_membership_nested | 38.1010μs | 3.4091μs | 293.3325 KOps/s | 282.0785 KOps/s | $\color{#35bf28}+3.99\\%$ |
| test_membership_nested_leaf | 52.9680μs | 3.4455μs | 290.2361 KOps/s | 284.0741 KOps/s | $\color{#35bf28}+2.17\\%$ |
| test_membership_stacked_nested | 19.2760μs | 3.4068μs | 293.5293 KOps/s | 268.8203 KOps/s | $\textbf{\color{#35bf28}+9.19\\%}$ |
| test_membership_stacked_nested_leaf | 20.3380μs | 3.4332μs | 291.2693 KOps/s | 285.6686 KOps/s | $\color{#35bf28}+1.96\\%$ |
| test_membership_nested_last | 23.6640μs | 4.1864μs | 238.8690 KOps/s | 238.9223 KOps/s | $\color{#d91a1a}-0.02\\%$ |
| test_membership_nested_leaf_last | 33.1920μs | 4.2033μs | 237.9065 KOps/s | 238.5266 KOps/s | $\color{#d91a1a}-0.26\\%$ |
| test_membership_stacked_nested_last | 26.8500μs | 5.2906μs | 189.0153 KOps/s | 237.5619 KOps/s | $\textbf{\color{#d91a1a}-20.44\\%}$ |
| test_membership_stacked_nested_leaf_last | 30.0970μs | 5.3702μs | 186.2130 KOps/s | 233.3752 KOps/s | $\textbf{\color{#d91a1a}-20.21\\%}$ |
| test_nested_getleaf | 34.5450μs | 10.9552μs | 91.2813 KOps/s | 94.1394 KOps/s | $\color{#d91a1a}-3.04\\%$ |
| test_nested_get | 36.6590μs | 10.4468μs | 95.7234 KOps/s | 98.5885 KOps/s | $\color{#d91a1a}-2.91\\%$ |
| test_stacked_getleaf | 29.9650μs | 10.8043μs | 92.5556 KOps/s | 93.3781 KOps/s | $\color{#d91a1a}-0.88\\%$ |
| test_stacked_get | 41.5580μs | 10.2327μs | 97.7255 KOps/s | 98.6858 KOps/s | $\color{#d91a1a}-0.97\\%$ |
| test_nested_getitemleaf | 33.3530μs | 11.5929μs | 86.2594 KOps/s | 88.5812 KOps/s | $\color{#d91a1a}-2.62\\%$ |
| test_nested_getitem | 44.2530μs | 10.9157μs | 91.6112 KOps/s | 95.7400 KOps/s | $\color{#d91a1a}-4.31\\%$ |
| test_stacked_getitemleaf | 34.5650μs | 11.5044μs | 86.9235 KOps/s | 88.7281 KOps/s | $\color{#d91a1a}-2.03\\%$ |
| test_stacked_getitem | 31.5390μs | 10.5677μs | 94.6281 KOps/s | 95.8396 KOps/s | $\color{#d91a1a}-1.26\\%$ |
| test_lock_nested | 52.0958ms | 0.3868ms | 2.5853 KOps/s | 3.0003 KOps/s | $\textbf{\color{#d91a1a}-13.83\\%}$ |
| test_lock_stack_nested | 0.5746ms | 0.2991ms | 3.3436 KOps/s | 3.3191 KOps/s | $\color{#35bf28}+0.74\\%$ |
| test_unlock_nested | 0.7785ms | 0.3336ms | 2.9972 KOps/s | 2.9421 KOps/s | $\color{#35bf28}+1.87\\%$ |
| test_unlock_stack_nested | 0.4795ms | 0.3054ms | 3.2741 KOps/s | 3.2022 KOps/s | $\color{#35bf28}+2.25\\%$ |
| test_flatten_speed | 0.2211ms | 0.1004ms | 9.9588 KOps/s | 9.9914 KOps/s | $\color{#d91a1a}-0.33\\%$ |
| test_unflatten_speed | 0.7384ms | 0.4194ms | 2.3842 KOps/s | 2.4100 KOps/s | $\color{#d91a1a}-1.07\\%$ |
| test_common_ops | 1.3975ms | 0.7216ms | 1.3859 KOps/s | 1.3313 KOps/s | $\color{#35bf28}+4.10\\%$ |
| test_creation | 19.4770μs | 1.9099μs | 523.5746 KOps/s | 514.9879 KOps/s | $\color{#35bf28}+1.67\\%$ |
| test_creation_empty | 35.1660μs | 10.8906μs | 91.8225 KOps/s | 83.3974 KOps/s | $\textbf{\color{#35bf28}+10.10\\%}$ |
| test_creation_nested_1 | 40.5760μs | 13.5898μs | 73.5845 KOps/s | 67.1775 KOps/s | $\textbf{\color{#35bf28}+9.54\\%}$ |
| test_creation_nested_2 | 71.0330μs | 17.2119μs | 58.0992 KOps/s | 54.8957 KOps/s | $\textbf{\color{#35bf28}+5.84\\%}$ |
| test_clone | 71.4330μs | 13.2412μs | 75.5217 KOps/s | 76.0746 KOps/s | $\color{#d91a1a}-0.73\\%$ |
| test_getitem[int] | 35.1260μs | 11.0271μs | 90.6854 KOps/s | 89.9607 KOps/s | $\color{#35bf28}+0.81\\%$ |
| test_getitem[slice_int] | 70.1810μs | 22.0579μs | 45.3353 KOps/s | 44.7504 KOps/s | $\color{#35bf28}+1.31\\%$ |
| test_getitem[range] | 74.3690μs | 58.5854μs | 17.0691 KOps/s | 16.7243 KOps/s | $\color{#35bf28}+2.06\\%$ |
| test_getitem[tuple] | 54.8620μs | 18.2615μs | 54.7599 KOps/s | 54.9844 KOps/s | $\color{#d91a1a}-0.41\\%$ |
| test_getitem[list] | 0.1024ms | 39.3588μs | 25.4073 KOps/s | 25.6704 KOps/s | $\color{#d91a1a}-1.02\\%$ |
| test_setitem_dim[int] | 62.3560μs | 32.8905μs | 30.4039 KOps/s | 29.4046 KOps/s | $\color{#35bf28}+3.40\\%$ |
| test_setitem_dim[slice_int] | 99.7960μs | 59.0009μs | 16.9489 KOps/s | 16.8951 KOps/s | $\color{#35bf28}+0.32\\%$ |
| test_setitem_dim[range] | 0.1707ms | 82.1899μs | 12.1669 KOps/s | 11.8298 KOps/s | $\color{#35bf28}+2.85\\%$ |
| test_setitem_dim[tuple] | 0.1000ms | 48.5660μs | 20.5905 KOps/s | 20.1249 KOps/s | $\color{#35bf28}+2.31\\%$ |
| test_setitem | 57.5170μs | 19.8832μs | 50.2937 KOps/s | 48.1999 KOps/s | $\color{#35bf28}+4.34\\%$ |
| test_set | 71.8240μs | 19.7216μs | 50.7059 KOps/s | 49.5931 KOps/s | $\color{#35bf28}+2.24\\%$ |
| test_set_shared | 3.4628ms | 0.1429ms | 6.9966 KOps/s | 6.7196 KOps/s | $\color{#35bf28}+4.12\\%$ |
| test_update | 0.1400ms | 22.4098μs | 44.6233 KOps/s | 42.4553 KOps/s | $\textbf{\color{#35bf28}+5.11\\%}$ |
| test_update_nested | 0.1338ms | 30.7450μs | 32.5256 KOps/s | 31.2497 KOps/s | $\color{#35bf28}+4.08\\%$ |
| test_update__nested | 0.2606ms | 26.4711μs | 37.7770 KOps/s | 40.1777 KOps/s | $\textbf{\color{#d91a1a}-5.98\\%}$ |
| test_set_nested | 69.2490μs | 21.2201μs | 47.1251 KOps/s | 44.7199 KOps/s | $\textbf{\color{#35bf28}+5.38\\%}$ |
| test_set_nested_new | 62.1560μs | 25.5187μs | 39.1869 KOps/s | 37.3989 KOps/s | $\color{#35bf28}+4.78\\%$ |
| test_select | 0.1062ms | 40.3864μs | 24.7608 KOps/s | 23.6177 KOps/s | $\color{#35bf28}+4.84\\%$ |
| test_select_nested | 0.1129ms | 57.7906μs | 17.3038 KOps/s | 16.8453 KOps/s | $\color{#35bf28}+2.72\\%$ |
| test_exclude_nested | 0.2177ms | 0.1196ms | 8.3594 KOps/s | 8.3873 KOps/s | $\color{#d91a1a}-0.33\\%$ |
| test_empty[True] | 0.6056ms | 0.3960ms | 2.5255 KOps/s | 2.4829 KOps/s | $\color{#35bf28}+1.72\\%$ |
| test_empty[False] | 26.8220μs | 1.0414μs | 960.2482 KOps/s | 896.2196 KOps/s | $\textbf{\color{#35bf28}+7.14\\%}$ |
| test_unbind_speed | 0.4550ms | 0.2437ms | 4.1029 KOps/s | 4.0438 KOps/s | $\color{#35bf28}+1.46\\%$ |
| test_unbind_speed_stack0 | 0.4989ms | 0.2416ms | 4.1387 KOps/s | 4.0098 KOps/s | $\color{#35bf28}+3.22\\%$ |
| test_unbind_speed_stack1 | 68.9810ms | 0.6990ms | 1.4306 KOps/s | 1.4069 KOps/s | $\color{#35bf28}+1.69\\%$ |
| test_split | 69.7036ms | 1.5842ms | 631.2516 Ops/s | 613.3085 Ops/s | $\color{#35bf28}+2.93\\%$ |
| test_chunk | 74.8310ms | 1.5958ms | 626.6275 Ops/s | 633.7496 Ops/s | $\color{#d91a1a}-1.12\\%$ |
| test_creation[device0] | 0.2106ms | 84.3371μs | 11.8572 KOps/s | 11.6566 KOps/s | $\color{#35bf28}+1.72\\%$ |
| test_creation_from_tensor | 4.3601ms | 84.8855μs | 11.7806 KOps/s | 11.5103 KOps/s | $\color{#35bf28}+2.35\\%$ |
| test_add_one[memmap_tensor0] | 90.1590μs | 5.3330μs | 187.5119 KOps/s | 184.1848 KOps/s | $\color{#35bf28}+1.81\\%$ |
| test_contiguous[memmap_tensor0] | 10.5500μs | 0.6257μs | 1.5982 MOps/s | 1.5314 MOps/s | $\color{#35bf28}+4.36\\%$ |
| test_stack[memmap_tensor0] | 23.4440μs | 3.5978μs | 277.9483 KOps/s | 276.7384 KOps/s | $\color{#35bf28}+0.44\\%$ |
| test_memmaptd_index | 1.0193ms | 0.2595ms | 3.8530 KOps/s | 3.9011 KOps/s | $\color{#d91a1a}-1.23\\%$ |
| test_memmaptd_index_astensor | 0.7272ms | 0.3323ms | 3.0095 KOps/s | 3.0511 KOps/s | $\color{#d91a1a}-1.36\\%$ |
| test_memmaptd_index_op | 1.9891ms | 0.6541ms | 1.5289 KOps/s | 1.5705 KOps/s | $\color{#d91a1a}-2.65\\%$ |
| test_serialize_model | 0.1638s | 0.1055s | 9.4784 Ops/s | 9.2376 Ops/s | $\color{#35bf28}+2.61\\%$ |
| test_serialize_model_pickle | 0.4631s | 0.3799s | 2.6322 Ops/s | 2.6253 Ops/s | $\color{#35bf28}+0.26\\%$ |
| test_serialize_weights | 0.1689s | 0.1030s | 9.7117 Ops/s | 9.5478 Ops/s | $\color{#35bf28}+1.72\\%$ |
| test_serialize_weights_returnearly | 0.1286s | 0.1194s | 8.3770 Ops/s | 8.0524 Ops/s | $\color{#35bf28}+4.03\\%$ |
| test_serialize_weights_pickle | 0.9912s | 0.5807s | 1.7221 Ops/s | 2.4465 Ops/s | $\textbf{\color{#d91a1a}-29.61\\%}$ |
| test_serialize_weights_filesystem | 0.1601s | 96.4777ms | 10.3651 Ops/s | 9.7656 Ops/s | $\textbf{\color{#35bf28}+6.14\\%}$ |
| test_serialize_model_filesystem | 0.1022s | 93.2461ms | 10.7243 Ops/s | 10.2944 Ops/s | $\color{#35bf28}+4.18\\%$ |
| test_reshape_pytree | 66.3040μs | 25.5080μs | 39.2033 KOps/s | 39.1330 KOps/s | $\color{#35bf28}+0.18\\%$ |
| test_reshape_td | 91.1710μs | 34.6162μs | 28.8882 KOps/s | 28.2958 KOps/s | $\color{#35bf28}+2.09\\%$ |
| test_view_pytree | 83.1350μs | 25.8728μs | 38.6506 KOps/s | 39.5206 KOps/s | $\color{#d91a1a}-2.20\\%$ |
| test_view_td | 76.1120μs | 39.5236μs | 25.3014 KOps/s | 24.9301 KOps/s | $\color{#35bf28}+1.49\\%$ |
| test_unbind_pytree | 61.3950μs | 29.5998μs | 33.7840 KOps/s | 34.1545 KOps/s | $\color{#d91a1a}-1.08\\%$ |
| test_unbind_td | 0.3614ms | 37.1584μs | 26.9118 KOps/s | 26.9653 KOps/s | $\color{#d91a1a}-0.20\\%$ |
| test_split_pytree | 63.4590μs | 30.0277μs | 33.3026 KOps/s | 34.2618 KOps/s | $\color{#d91a1a}-2.80\\%$ |
| test_split_td | 0.1215ms | 39.8362μs | 25.1028 KOps/s | 24.7639 KOps/s | $\color{#35bf28}+1.37\\%$ |
| test_add_pytree | 98.6140μs | 35.1076μs | 28.4839 KOps/s | 28.4820 KOps/s | $+0.01\\%$ |
| test_add_td | 0.1843ms | 53.8412μs | 18.5732 KOps/s | 17.1532 KOps/s | $\textbf{\color{#35bf28}+8.28\\%}$ |
| test_distributed | 0.2537ms | 0.1002ms | 9.9756 KOps/s | 9.5142 KOps/s | $\color{#35bf28}+4.85\\%$ |
| test_tdmodule | 71.4430μs | 18.3485μs | 54.5004 KOps/s | 53.5865 KOps/s | $\color{#35bf28}+1.71\\%$ |
| test_tdmodule_dispatch | 58.1780μs | 36.1451μs | 27.6663 KOps/s | 26.6742 KOps/s | $\color{#35bf28}+3.72\\%$ |
| test_tdseq | 44.2530μs | 21.2688μs | 47.0171 KOps/s | 45.1011 KOps/s | $\color{#35bf28}+4.25\\%$ |
| test_tdseq_dispatch | 77.7850μs | 41.1176μs | 24.3205 KOps/s | 23.6373 KOps/s | $\color{#35bf28}+2.89\\%$ |
| test_instantiation_functorch | 2.3377ms | 1.3526ms | 739.2942 Ops/s | 738.5745 Ops/s | $\color{#35bf28}+0.10\\%$ |
| test_instantiation_td | 66.9990ms | 1.1000ms | 909.0920 Ops/s | 960.2242 Ops/s | $\textbf{\color{#d91a1a}-5.33\\%}$ |
| test_exec_functorch | 0.2344ms | 0.1633ms | 6.1255 KOps/s | 6.0552 KOps/s | $\color{#35bf28}+1.16\\%$ |
| test_exec_functional_call | 0.2207ms | 0.1496ms | 6.6833 KOps/s | 6.6826 KOps/s | $\color{#35bf28}+0.01\\%$ |
| test_exec_td | 0.2249ms | 0.1482ms | 6.7487 KOps/s | 6.9591 KOps/s | $\color{#d91a1a}-3.02\\%$ |
| test_exec_td_decorator | 0.9084ms | 0.2227ms | 4.4896 KOps/s | 4.4988 KOps/s | $\color{#d91a1a}-0.20\\%$ |
| test_vmap_mlp_speed[True-True] | 0.6921ms | 0.4901ms | 2.0405 KOps/s | 2.0459 KOps/s | $\color{#d91a1a}-0.26\\%$ |
| test_vmap_mlp_speed[True-False] | 0.8897ms | 0.4891ms | 2.0447 KOps/s | 2.0507 KOps/s | $\color{#d91a1a}-0.29\\%$ |
| test_vmap_mlp_speed[False-True] | 0.6170ms | 0.3972ms | 2.5175 KOps/s | 2.5332 KOps/s | $\color{#d91a1a}-0.62\\%$ |
| test_vmap_mlp_speed[False-False] | 0.6951ms | 0.3980ms | 2.5125 KOps/s | 2.5271 KOps/s | $\color{#d91a1a}-0.58\\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.0930ms | 0.5624ms | 1.7781 KOps/s | 1.7736 KOps/s | $\color{#35bf28}+0.26\\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.7645ms | 0.5576ms | 1.7935 KOps/s | 1.7806 KOps/s | $\color{#35bf28}+0.72\\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.7533ms | 0.4619ms | 2.1651 KOps/s | 2.1744 KOps/s | $\color{#d91a1a}-0.43\\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.8670ms | 0.4626ms | 2.1617 KOps/s | 2.1655 KOps/s | $\color{#d91a1a}-0.18\\%$ |
| test_to_module_speed[True] | 2.5822ms | 1.7208ms | 581.1112 Ops/s | 587.1179 Ops/s | $\color{#d91a1a}-1.02\\%$ |
| test_to_module_speed[False] | 2.6279ms | 1.7060ms | 586.1726 Ops/s | 600.3701 Ops/s | $\color{#d91a1a}-2.36\\%$ |
| test_tc_init | 0.1287ms | 60.0342μs | 16.6572 KOps/s | 16.6397 KOps/s | $\color{#35bf28}+0.11\\%$ |
| test_tc_init_nested | 0.2336ms | 0.1199ms | 8.3380 KOps/s | 8.8342 KOps/s | $\textbf{\color{#d91a1a}-5.62\\%}$ |
| test_tc_first_layer_tensor | 24.7260μs | 8.4054μs | 118.9715 KOps/s | 120.5181 KOps/s | $\color{#d91a1a}-1.28\\%$ |
| test_tc_first_layer_nontensor | 31.0980μs | 8.4087μs | 118.9249 KOps/s | 121.1239 KOps/s | $\color{#d91a1a}-1.82\\%$ |
| test_tc_second_layer_tensor | 23.0930μs | 2.5496μs | 392.2120 KOps/s | 395.3542 KOps/s | $\color{#d91a1a}-0.79\\%$ |
| test_tc_second_layer_nontensor | 30.0060μs | 9.3936μs | 106.4549 KOps/s | 108.0046 KOps/s | $\color{#d91a1a}-1.43\\%$ |
| test_unbind | 81.3385ms | 13.7881ms | 72.5266 Ops/s | 67.6035 Ops/s | $\textbf{\color{#35bf28}+7.28\\%}$ |
| test_full_like | 8.3844ms | 7.1992ms | 138.9050 Ops/s | 93.1946 Ops/s | $\textbf{\color{#35bf28}+49.05\\%}$ |
| test_zeros_like | 13.8219ms | 5.9243ms | 168.7954 Ops/s | 164.1255 Ops/s | $\color{#35bf28}+2.85\\%$ |
| test_ones_like | 16.0694ms | 6.4281ms | 155.5674 Ops/s | 157.9737 Ops/s | $\color{#d91a1a}-1.52\\%$ |
| test_clone | 14.2716ms | 8.0666ms | 123.9686 Ops/s | 122.5354 Ops/s | $\color{#35bf28}+1.17\\%$ |
| test_squeeze | 0.2761ms | 13.8450μs | 72.2280 KOps/s | 77.9391 KOps/s | $\textbf{\color{#d91a1a}-7.33\\%}$ |
| test_unsqueeze | 0.2504ms | 95.5094μs | 10.4702 KOps/s | 10.1880 KOps/s | $\color{#35bf28}+2.77\\%$ |
| test_split | 0.5166ms | 0.2734ms | 3.6571 KOps/s | 3.6217 KOps/s | $\color{#35bf28}+0.98\\%$ |
| test_permute | 0.3702ms | 0.2242ms | 4.4603 KOps/s | 4.4240 KOps/s | $\color{#35bf28}+0.82\\%$ |
| test_stack | 26.3968ms | 22.5324ms | 44.3805 Ops/s | 41.5287 Ops/s | $\textbf{\color{#35bf28}+6.87\\%}$ |
| test_cat | 29.2164ms | 22.3303ms | 44.7821 Ops/s | 42.9115 Ops/s | $\color{#35bf28}+4.36\\%$ |