pytorch / tensordict

TensorDict is a pytorch dedicated tensor container.
MIT License
832 stars, 74 forks

[Feature] map_iter #847

Closed. vmoens closed this 4 months ago

vmoens commented 4 months ago

Introduces map_iter, which can be used to iterate over a large dataset in a dataloader-like fashion.
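As a rough illustration of what "dataloader-like" iteration means here, the sketch below lazily yields the result of a function applied to successive chunks of a dataset, rather than collecting everything into one output. It is a plain-Python stand-in and not the tensordict implementation: the name `map_iter_sketch`, its `chunksize` and `num_workers` parameters, and the use of a thread pool are all assumptions made for this example.

```python
from multiprocessing.pool import ThreadPool
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_iter_sketch(
    data: Sequence[T],
    fn: Callable[[Sequence[T]], R],
    chunksize: int = 4,
    num_workers: int = 2,
) -> Iterator[R]:
    """Hypothetical helper: lazily yield fn(chunk) for consecutive chunks of data."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    # Slice the dataset into consecutive chunks; the last one may be shorter.
    chunks = (data[i : i + chunksize] for i in range(0, len(data), chunksize))
    with ThreadPool(num_workers) as pool:
        # imap preserves input order and yields each result as it becomes
        # available, instead of building the full list like map() does.
        yield from pool.imap(fn, chunks)


# Consume the iterator chunk by chunk, as one would with a dataloader.
results = []
for chunk_sum in map_iter_sketch(list(range(10)), sum, chunksize=4):
    results.append(chunk_sum)
print(results)
```

The caller only ever holds one processed chunk at a time, which is the property that makes this pattern usable on datasets too large to process in a single call.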

TODO:

cc @shagunsodhani

github-actions[bot] commented 4 months ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}30$. Worsened: $\large\color{#d91a1a}2$.

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 63.7520μs | 12.5146μs | 79.9064 KOps/s | 76.0669 KOps/s | $\textbf{\color{#35bf28}+5.05\\%}$ |
| test_plain_set_stack_nested | 25.3610μs | 12.6179μs | 79.2527 KOps/s | 75.1401 KOps/s | $\textbf{\color{#35bf28}+5.47\\%}$ |
| test_plain_set_nested_inplace | 36.1410μs | 14.0451μs | 71.1993 KOps/s | 68.9441 KOps/s | $\color{#35bf28}+3.27\\%$ |
| test_plain_set_stack_nested_inplace | 37.2010μs | 13.8213μs | 72.3522 KOps/s | 69.0457 KOps/s | $\color{#35bf28}+4.79\\%$ |
| test_items | 19.1300μs | 4.6989μs | 212.8159 KOps/s | 210.6254 KOps/s | $\color{#35bf28}+1.04\\%$ |
| test_items_nested | 0.4062ms | 0.3410ms | 2.9328 KOps/s | 2.9058 KOps/s | $\color{#35bf28}+0.93\\%$ |
| test_items_nested_locked | 0.4149ms | 0.3494ms | 2.8617 KOps/s | 2.9241 KOps/s | $\color{#d91a1a}-2.13\\%$ |
| test_items_nested_leaf | 0.1046ms | 82.9549μs | 12.0547 KOps/s | 12.0077 KOps/s | $\color{#35bf28}+0.39\\%$ |
| test_items_stack_nested | 0.4094ms | 0.3491ms | 2.8643 KOps/s | 2.9052 KOps/s | $\color{#d91a1a}-1.41\\%$ |
| test_items_stack_nested_leaf | 0.1158ms | 85.6735μs | 11.6722 KOps/s | 11.8382 KOps/s | $\color{#d91a1a}-1.40\\%$ |
| test_items_stack_nested_locked | 0.4097ms | 0.3424ms | 2.9205 KOps/s | 2.9109 KOps/s | $\color{#35bf28}+0.33\\%$ |
| test_keys | 25.3310μs | 4.4011μs | 227.2155 KOps/s | 226.7923 KOps/s | $\color{#35bf28}+0.19\\%$ |
| test_keys_nested | 0.1000ms | 71.6102μs | 13.9645 KOps/s | 14.2691 KOps/s | $\color{#d91a1a}-2.13\\%$ |
| test_keys_nested_locked | 2.6041ms | 77.3463μs | 12.9289 KOps/s | 13.0961 KOps/s | $\color{#d91a1a}-1.28\\%$ |
| test_keys_nested_leaf | 89.7320μs | 62.5237μs | 15.9939 KOps/s | 16.4926 KOps/s | $\color{#d91a1a}-3.02\\%$ |
| test_keys_stack_nested | 0.1000ms | 71.4796μs | 13.9900 KOps/s | 14.2101 KOps/s | $\color{#d91a1a}-1.55\\%$ |
| test_keys_stack_nested_leaf | 87.7510μs | 62.4593μs | 16.0104 KOps/s | 16.8350 KOps/s | $\color{#d91a1a}-4.90\\%$ |
| test_keys_stack_nested_locked | 0.1036ms | 76.8457μs | 13.0131 KOps/s | 13.1908 KOps/s | $\color{#d91a1a}-1.35\\%$ |
| test_values | 14.3837μs | 1.8092μs | 552.7361 KOps/s | 552.4149 KOps/s | $\color{#35bf28}+0.06\\%$ |
| test_values_nested | 61.7810μs | 35.9679μs | 27.8026 KOps/s | 28.2553 KOps/s | $\color{#d91a1a}-1.60\\%$ |
| test_values_nested_locked | 67.7010μs | 37.7581μs | 26.4844 KOps/s | 26.8193 KOps/s | $\color{#d91a1a}-1.25\\%$ |
| test_values_nested_leaf | 47.6710μs | 32.0375μs | 31.2134 KOps/s | 31.6756 KOps/s | $\color{#d91a1a}-1.46\\%$ |
| test_values_stack_nested | 62.3400μs | 36.6881μs | 27.2568 KOps/s | 27.6861 KOps/s | $\color{#d91a1a}-1.55\\%$ |
| test_values_stack_nested_leaf | 63.8910μs | 32.8471μs | 30.4441 KOps/s | 31.1335 KOps/s | $\color{#d91a1a}-2.21\\%$ |
| test_values_stack_nested_locked | 64.2020μs | 38.7178μs | 25.8279 KOps/s | 26.3333 KOps/s | $\color{#d91a1a}-1.92\\%$ |
| test_membership | 3.7786μs | 0.7393μs | 1.3526 MOps/s | 1.4235 MOps/s | $\color{#d91a1a}-4.98\\%$ |
| test_membership_nested | 20.6800μs | 2.6773μs | 373.5057 KOps/s | 382.4920 KOps/s | $\color{#d91a1a}-2.35\\%$ |
| test_membership_nested_leaf | 29.9300μs | 2.6577μs | 376.2599 KOps/s | 383.0281 KOps/s | $\color{#d91a1a}-1.77\\%$ |
| test_membership_stacked_nested | 24.3110μs | 2.6088μs | 383.3165 KOps/s | 380.8919 KOps/s | $\color{#35bf28}+0.64\\%$ |
| test_membership_stacked_nested_leaf | 22.9410μs | 2.6455μs | 377.9947 KOps/s | 385.4738 KOps/s | $\color{#d91a1a}-1.94\\%$ |
| test_membership_nested_last | 46.7610μs | 3.2324μs | 309.3696 KOps/s | 320.6686 KOps/s | $\color{#d91a1a}-3.52\\%$ |
| test_membership_nested_leaf_last | 33.2000μs | 3.1878μs | 313.6922 KOps/s | 319.3595 KOps/s | $\color{#d91a1a}-1.77\\%$ |
| test_membership_stacked_nested_last | 24.3210μs | 3.1711μs | 315.3503 KOps/s | 316.4929 KOps/s | $\color{#d91a1a}-0.36\\%$ |
| test_membership_stacked_nested_leaf_last | 26.8110μs | 3.1912μs | 313.3631 KOps/s | 318.4042 KOps/s | $\color{#d91a1a}-1.58\\%$ |
| test_nested_getleaf | 37.6710μs | 8.4815μs | 117.9041 KOps/s | 119.8530 KOps/s | $\color{#d91a1a}-1.63\\%$ |
| test_nested_get | 29.0000μs | 7.9026μs | 126.5405 KOps/s | 127.4950 KOps/s | $\color{#d91a1a}-0.75\\%$ |
| test_stacked_getleaf | 25.3710μs | 8.4217μs | 118.7415 KOps/s | 119.7226 KOps/s | $\color{#d91a1a}-0.82\\%$ |
| test_stacked_get | 37.0810μs | 7.9080μs | 126.4542 KOps/s | 126.8240 KOps/s | $\color{#d91a1a}-0.29\\%$ |
| test_nested_getitemleaf | 23.8790μs | 8.6456μs | 115.6664 KOps/s | 116.7450 KOps/s | $\color{#d91a1a}-0.92\\%$ |
| test_nested_getitem | 28.8110μs | 8.1548μs | 122.6272 KOps/s | 124.4115 KOps/s | $\color{#d91a1a}-1.43\\%$ |
| test_stacked_getitemleaf | 34.2800μs | 8.5805μs | 116.5433 KOps/s | 116.7925 KOps/s | $\color{#d91a1a}-0.21\\%$ |
| test_stacked_getitem | 24.4310μs | 8.0669μs | 123.9638 KOps/s | 123.6757 KOps/s | $\color{#35bf28}+0.23\\%$ |
| test_lock_nested | 58.7218ms | 0.4012ms | 2.4927 KOps/s | 2.4865 KOps/s | $\color{#35bf28}+0.25\\%$ |
| test_lock_stack_nested | 0.3480ms | 0.2957ms | 3.3817 KOps/s | 3.2867 KOps/s | $\color{#35bf28}+2.89\\%$ |
| test_unlock_nested | 60.9149ms | 0.4033ms | 2.4794 KOps/s | 2.4558 KOps/s | $\color{#35bf28}+0.96\\%$ |
| test_unlock_stack_nested | 0.3592ms | 0.3047ms | 3.2816 KOps/s | 3.2033 KOps/s | $\color{#35bf28}+2.45\\%$ |
| test_flatten_speed | 0.3762ms | 0.1029ms | 9.7200 KOps/s | 9.8773 KOps/s | $\color{#d91a1a}-1.59\\%$ |
| test_unflatten_speed | 0.3486ms | 0.2945ms | 3.3960 KOps/s | 3.4166 KOps/s | $\color{#d91a1a}-0.60\\%$ |
| test_common_ops | 1.0437ms | 0.5784ms | 1.7289 KOps/s | 1.6458 KOps/s | $\textbf{\color{#35bf28}+5.05\\%}$ |
| test_creation | 27.0500μs | 1.6613μs | 601.9400 KOps/s | 616.1116 KOps/s | $\color{#d91a1a}-2.30\\%$ |
| test_creation_empty | 26.3800μs | 8.0658μs | 123.9803 KOps/s | 106.8967 KOps/s | $\textbf{\color{#35bf28}+15.98\\%}$ |
| test_creation_nested_1 | 26.6410μs | 9.8336μs | 101.6925 KOps/s | 89.8939 KOps/s | $\textbf{\color{#35bf28}+13.13\\%}$ |
| test_creation_nested_2 | 40.2510μs | 12.0232μs | 83.1725 KOps/s | 75.2748 KOps/s | $\textbf{\color{#35bf28}+10.49\\%}$ |
| test_clone | 66.9010μs | 11.6848μs | 85.5810 KOps/s | 83.6803 KOps/s | $\color{#35bf28}+2.27\\%$ |
| test_getitem[int] | 26.7310μs | 10.7408μs | 93.1033 KOps/s | 92.8184 KOps/s | $\color{#35bf28}+0.31\\%$ |
| test_getitem[slice_int] | 65.9020μs | 20.8210μs | 48.0283 KOps/s | 46.7376 KOps/s | $\color{#35bf28}+2.76\\%$ |
| test_getitem[range] | 67.6010μs | 48.1168μs | 20.7828 KOps/s | 19.1824 KOps/s | $\textbf{\color{#35bf28}+8.34\\%}$ |
| test_getitem[tuple] | 41.5910μs | 18.5966μs | 53.7732 KOps/s | 52.4425 KOps/s | $\color{#35bf28}+2.54\\%$ |
| test_getitem[list] | 0.1542ms | 33.6336μs | 29.7322 KOps/s | 28.9395 KOps/s | $\color{#35bf28}+2.74\\%$ |
| test_setitem_dim[int] | 65.3010μs | 26.7173μs | 37.4289 KOps/s | 35.2625 KOps/s | $\textbf{\color{#35bf28}+6.14\\%}$ |
| test_setitem_dim[slice_int] | 78.5110μs | 47.2672μs | 21.1563 KOps/s | 19.7210 KOps/s | $\textbf{\color{#35bf28}+7.28\\%}$ |
| test_setitem_dim[range] | 0.1026ms | 65.8386μs | 15.1886 KOps/s | 14.5121 KOps/s | $\color{#35bf28}+4.66\\%$ |
| test_setitem_dim[tuple] | 71.3520μs | 42.0789μs | 23.7649 KOps/s | 22.6364 KOps/s | $\color{#35bf28}+4.99\\%$ |
| test_setitem | 44.8310μs | 16.0038μs | 62.4850 KOps/s | 58.1084 KOps/s | $\textbf{\color{#35bf28}+7.53\\%}$ |
| test_set | 54.5010μs | 15.2234μs | 65.6882 KOps/s | 60.5788 KOps/s | $\textbf{\color{#35bf28}+8.43\\%}$ |
| test_set_shared | 1.6930ms | 99.7999μs | 10.0200 KOps/s | 9.8766 KOps/s | $\color{#35bf28}+1.45\\%$ |
| test_update | 62.9710μs | 17.9667μs | 55.6584 KOps/s | 51.8068 KOps/s | $\textbf{\color{#35bf28}+7.43\\%}$ |
| test_update_nested | 62.8110μs | 23.4359μs | 42.6695 KOps/s | 40.1349 KOps/s | $\textbf{\color{#35bf28}+6.32\\%}$ |
| test_update__nested | 62.2610μs | 22.1748μs | 45.0961 KOps/s | 43.7365 KOps/s | $\color{#35bf28}+3.11\\%$ |
| test_set_nested | 49.1110μs | 16.2183μs | 61.6588 KOps/s | 57.8267 KOps/s | $\textbf{\color{#35bf28}+6.63\\%}$ |
| test_set_nested_new | 71.3020μs | 19.1057μs | 52.3403 KOps/s | 49.4257 KOps/s | $\textbf{\color{#35bf28}+5.90\\%}$ |
| test_select | 68.0020μs | 31.6155μs | 31.6300 KOps/s | 29.8521 KOps/s | $\textbf{\color{#35bf28}+5.96\\%}$ |
| test_select_nested | 0.7850ms | 53.4337μs | 18.7148 KOps/s | 19.4124 KOps/s | $\color{#d91a1a}-3.59\\%$ |
| test_exclude_nested | 0.1577ms | 0.1113ms | 8.9881 KOps/s | 9.3611 KOps/s | $\color{#d91a1a}-3.99\\%$ |
| test_empty[True] | 0.4157ms | 0.3486ms | 2.8682 KOps/s | 2.9202 KOps/s | $\color{#d91a1a}-1.78\\%$ |
| test_empty[False] | 2.6141μs | 0.8355μs | 1.1969 MOps/s | 1.2449 MOps/s | $\color{#d91a1a}-3.86\\%$ |
| test_to | 91.3010μs | 58.9700μs | 16.9578 KOps/s | 16.5755 KOps/s | $\color{#35bf28}+2.31\\%$ |
| test_to_nonblocking | 87.5420μs | 35.4702μs | 28.1927 KOps/s | 26.7602 KOps/s | $\textbf{\color{#35bf28}+5.35\\%}$ |
| test_unbind_speed | 0.9104ms | 0.2555ms | 3.9138 KOps/s | 3.7691 KOps/s | $\color{#35bf28}+3.84\\%$ |
| test_unbind_speed_stack0 | 0.3190ms | 0.2574ms | 3.8848 KOps/s | 3.8064 KOps/s | $\color{#35bf28}+2.06\\%$ |
| test_unbind_speed_stack1 | 77.3689ms | 0.7834ms | 1.2764 KOps/s | 1.2655 KOps/s | $\color{#35bf28}+0.86\\%$ |
| test_split | 76.2189ms | 1.6348ms | 611.6772 Ops/s | 577.2157 Ops/s | $\textbf{\color{#35bf28}+5.97\\%}$ |
| test_chunk | 76.2292ms | 1.6420ms | 609.0240 Ops/s | 630.0154 Ops/s | $\color{#d91a1a}-3.33\\%$ |
| test_creation[device0] | 0.1248ms | 58.8131μs | 17.0030 KOps/s | 16.9950 KOps/s | $\color{#35bf28}+0.05\\%$ |
| test_creation_from_tensor | 0.1236ms | 54.5333μs | 18.3374 KOps/s | 18.4876 KOps/s | $\color{#d91a1a}-0.81\\%$ |
| test_add_one[memmap_tensor0] | 97.5120μs | 7.0912μs | 141.0196 KOps/s | 130.0235 KOps/s | $\textbf{\color{#35bf28}+8.46\\%}$ |
| test_contiguous[memmap_tensor0] | 9.4800μs | 0.6684μs | 1.4960 MOps/s | 1.4660 MOps/s | $\color{#35bf28}+2.05\\%$ |
| test_stack[memmap_tensor0] | 34.8010μs | 4.7464μs | 210.6863 KOps/s | 194.1663 KOps/s | $\textbf{\color{#35bf28}+8.51\\%}$ |
| test_memmaptd_index | 1.1570ms | 0.2799ms | 3.5725 KOps/s | 3.4527 KOps/s | $\color{#35bf28}+3.47\\%$ |
| test_memmaptd_index_astensor | 0.6843ms | 0.3381ms | 2.9577 KOps/s | 2.6250 KOps/s | $\textbf{\color{#35bf28}+12.68\\%}$ |
| test_memmaptd_index_op | 1.0566ms | 0.6248ms | 1.6005 KOps/s | 1.4541 KOps/s | $\textbf{\color{#35bf28}+10.07\\%}$ |
| test_serialize_model | 93.0669ms | 90.0852ms | 11.1006 Ops/s | 10.4733 Ops/s | $\textbf{\color{#35bf28}+5.99\\%}$ |
| test_serialize_model_pickle | 1.3670s | 1.2384s | 0.8075 Ops/s | 0.8062 Ops/s | $\color{#35bf28}+0.16\\%$ |
| test_serialize_weights | 93.8568ms | 89.1809ms | 11.2132 Ops/s | 10.6053 Ops/s | $\textbf{\color{#35bf28}+5.73\\%}$ |
| test_serialize_weights_returnearly | 0.2654s | 77.1207ms | 12.9667 Ops/s | 13.1958 Ops/s | $\color{#d91a1a}-1.74\\%$ |
| test_serialize_weights_pickle | 1.3515s | 1.2488s | 0.8008 Ops/s | 0.8090 Ops/s | $\color{#d91a1a}-1.02\\%$ |
| test_reshape_pytree | 57.2410μs | 26.1475μs | 38.2445 KOps/s | 37.1275 KOps/s | $\color{#35bf28}+3.01\\%$ |
| test_reshape_td | 62.3410μs | 33.3695μs | 29.9675 KOps/s | 31.0355 KOps/s | $\color{#d91a1a}-3.44\\%$ |
| test_view_pytree | 0.2124ms | 26.0340μs | 38.4113 KOps/s | 37.9105 KOps/s | $\color{#35bf28}+1.32\\%$ |
| test_view_td | 62.8210μs | 36.1330μs | 27.6755 KOps/s | 26.4218 KOps/s | $\color{#35bf28}+4.75\\%$ |
| test_unbind_pytree | 0.2374ms | 32.2071μs | 31.0490 KOps/s | 30.4506 KOps/s | $\color{#35bf28}+1.97\\%$ |
| test_unbind_td | 0.4481ms | 39.7887μs | 25.1328 KOps/s | 24.7673 KOps/s | $\color{#35bf28}+1.48\\%$ |
| test_split_pytree | 60.1420μs | 34.8682μs | 28.6795 KOps/s | 27.5881 KOps/s | $\color{#35bf28}+3.96\\%$ |
| test_split_td | 0.5010ms | 40.6634μs | 24.5922 KOps/s | 25.0180 KOps/s | $\color{#d91a1a}-1.70\\%$ |
| test_add_pytree | 0.2449ms | 38.4132μs | 26.0327 KOps/s | 24.5716 KOps/s | $\textbf{\color{#35bf28}+5.95\\%}$ |
| test_add_td | 90.8210μs | 54.2046μs | 18.4486 KOps/s | 17.7732 KOps/s | $\color{#35bf28}+3.80\\%$ |
| test_distributed | 0.2280ms | 71.0428μs | 14.0760 KOps/s | 14.4410 KOps/s | $\color{#d91a1a}-2.53\\%$ |
| test_tdmodule | 83.7320μs | 15.1024μs | 66.2148 KOps/s | 64.0319 KOps/s | $\color{#35bf28}+3.41\\%$ |
| test_tdmodule_dispatch | 47.4810μs | 28.5548μs | 35.0204 KOps/s | 32.7936 KOps/s | $\textbf{\color{#35bf28}+6.79\\%}$ |
| test_tdseq | 31.8110μs | 16.4018μs | 60.9691 KOps/s | 57.7889 KOps/s | $\textbf{\color{#35bf28}+5.50\\%}$ |
| test_tdseq_dispatch | 52.3810μs | 31.6241μs | 31.6215 KOps/s | 30.1897 KOps/s | $\color{#35bf28}+4.74\\%$ |
| test_instantiation_functorch | 1.5532ms | 1.4108ms | 708.8370 Ops/s | 706.5913 Ops/s | $\color{#35bf28}+0.32\\%$ |
| test_instantiation_td | 80.5962ms | 1.0808ms | 925.2020 Ops/s | 925.9221 Ops/s | $\color{#d91a1a}-0.08\\%$ |
| test_exec_functorch | 0.1824ms | 0.1483ms | 6.7424 KOps/s | 6.5813 KOps/s | $\color{#35bf28}+2.45\\%$ |
| test_exec_functional_call | 0.1922ms | 0.1398ms | 7.1513 KOps/s | 6.9485 KOps/s | $\color{#35bf28}+2.92\\%$ |
| test_exec_td | 0.1973ms | 0.1360ms | 7.3531 KOps/s | 7.0525 KOps/s | $\color{#35bf28}+4.26\\%$ |
| test_exec_td_decorator | 0.3205ms | 0.2104ms | 4.7525 KOps/s | 4.7069 KOps/s | $\color{#35bf28}+0.97\\%$ |
| test_vmap_mlp_speed[True-True] | 0.6787ms | 0.5740ms | 1.7422 KOps/s | 1.7021 KOps/s | $\color{#35bf28}+2.35\\%$ |
| test_vmap_mlp_speed[True-False] | 0.6681ms | 0.5754ms | 1.7380 KOps/s | 1.7045 KOps/s | $\color{#35bf28}+1.96\\%$ |
| test_vmap_mlp_speed[False-True] | 0.5621ms | 0.5079ms | 1.9689 KOps/s | 1.9554 KOps/s | $\color{#35bf28}+0.69\\%$ |
| test_vmap_mlp_speed[False-False] | 0.5996ms | 0.5069ms | 1.9726 KOps/s | 1.8713 KOps/s | $\textbf{\color{#35bf28}+5.42\\%}$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.3753ms | 0.6432ms | 1.5547 KOps/s | 1.2481 KOps/s | $\textbf{\color{#35bf28}+24.56\\%}$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.7683ms | 0.6340ms | 1.5772 KOps/s | 1.5501 KOps/s | $\color{#35bf28}+1.75\\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.6829ms | 0.5658ms | 1.7673 KOps/s | 1.7553 KOps/s | $\color{#35bf28}+0.69\\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.7020ms | 0.5674ms | 1.7624 KOps/s | 1.7550 KOps/s | $\color{#35bf28}+0.43\\%$ |
| test_vmap_transformer_speed[True-True] | 8.2454ms | 7.7152ms | 129.6142 Ops/s | 129.9454 Ops/s | $\color{#d91a1a}-0.25\\%$ |
| test_vmap_transformer_speed[True-False] | 7.9952ms | 7.6593ms | 130.5600 Ops/s | 130.2056 Ops/s | $\color{#35bf28}+0.27\\%$ |
| test_vmap_transformer_speed[False-True] | 7.7380ms | 7.5495ms | 132.4589 Ops/s | 131.5313 Ops/s | $\color{#35bf28}+0.71\\%$ |
| test_vmap_transformer_speed[False-False] | 7.6524ms | 7.5419ms | 132.5922 Ops/s | 131.5047 Ops/s | $\color{#35bf28}+0.83\\%$ |
| test_vmap_transformer_speed_decorator[True-True] | 19.6062ms | 18.5241ms | 53.9836 Ops/s | 53.5856 Ops/s | $\color{#35bf28}+0.74\\%$ |
| test_vmap_transformer_speed_decorator[True-False] | 18.9552ms | 18.5910ms | 53.7895 Ops/s | 53.7730 Ops/s | $\color{#35bf28}+0.03\\%$ |
| test_vmap_transformer_speed_decorator[False-True] | 19.7911ms | 18.5309ms | 53.9640 Ops/s | 54.0735 Ops/s | $\color{#d91a1a}-0.20\\%$ |
| test_vmap_transformer_speed_decorator[False-False] | 18.6737ms | 18.4334ms | 54.2494 Ops/s | 54.0269 Ops/s | $\color{#35bf28}+0.41\\%$ |
| test_to_module_speed[True] | 2.1611ms | 1.5077ms | 663.2655 Ops/s | 668.8355 Ops/s | $\color{#d91a1a}-0.83\\%$ |
| test_to_module_speed[False] | 1.5749ms | 1.4662ms | 682.0562 Ops/s | 673.3323 Ops/s | $\color{#35bf28}+1.30\\%$ |
| test_tc_init | 90.1920μs | 50.4082μs | 19.8380 KOps/s | 18.7394 KOps/s | $\textbf{\color{#35bf28}+5.86\\%}$ |
| test_tc_init_nested | 0.1590ms | 99.5660μs | 10.0436 KOps/s | 9.8759 KOps/s | $\color{#35bf28}+1.70\\%$ |
| test_tc_first_layer_tensor | 18.1300μs | 3.8648μs | 258.7436 KOps/s | 261.6362 KOps/s | $\color{#d91a1a}-1.11\\%$ |
| test_tc_first_layer_nontensor | 19.3200μs | 3.9135μs | 255.5275 KOps/s | 270.3918 KOps/s | $\textbf{\color{#d91a1a}-5.50\\%}$ |
| test_tc_second_layer_tensor | 5.5050μs | 1.2480μs | 801.2590 KOps/s | 778.7677 KOps/s | $\color{#35bf28}+2.89\\%$ |
| test_tc_second_layer_nontensor | 21.4710μs | 4.4635μs | 224.0371 KOps/s | 230.6451 KOps/s | $\color{#d91a1a}-2.87\\%$ |
| test_unbind | 0.1094s | 13.7768ms | 72.5858 Ops/s | 74.5284 Ops/s | $\color{#d91a1a}-2.61\\%$ |
| test_full_like | 13.9661ms | 13.6029ms | 73.5138 Ops/s | 105.2127 Ops/s | $\textbf{\color{#d91a1a}-30.13\\%}$ |
| test_zeros_like | 8.2825ms | 8.0008ms | 124.9878 Ops/s | 123.7499 Ops/s | $\color{#35bf28}+1.00\\%$ |
| test_ones_like | 8.1518ms | 7.9803ms | 125.3089 Ops/s | 124.4342 Ops/s | $\color{#35bf28}+0.70\\%$ |
| test_clone | 9.7665ms | 9.6049ms | 104.1132 Ops/s | 102.6436 Ops/s | $\color{#35bf28}+1.43\\%$ |
| test_squeeze | 62.1710μs | 10.7217μs | 93.2688 KOps/s | 93.7287 KOps/s | $\color{#d91a1a}-0.49\\%$ |
| test_unsqueeze | 0.1603ms | 86.4200μs | 11.5714 KOps/s | 11.5332 KOps/s | $\color{#35bf28}+0.33\\%$ |
| test_split | 3.5038ms | 3.1830ms | 314.1713 Ops/s | 318.4357 Ops/s | $\color{#d91a1a}-1.34\\%$ |
| test_permute | 0.2993ms | 0.2031ms | 4.9237 KOps/s | 4.8584 KOps/s | $\color{#35bf28}+1.34\\%$ |
| test_stack | 27.9183ms | 27.5362ms | 36.3158 Ops/s | 35.8208 Ops/s | $\color{#35bf28}+1.38\\%$ |
| test_cat | 27.3764ms | 27.2497ms | 36.6977 Ops/s | 35.9471 Ops/s | $\color{#35bf28}+2.09\\%$ |

github-actions[bot] commented 4 months ago

$\color{#D29922}\textsf{\Large\⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}25$.

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 58.9000μs | 17.1057μs | 58.4602 KOps/s | 63.7008 KOps/s | $\textbf{\color{#d91a1a}-8.23\\%}$ |
| test_plain_set_stack_nested | 42.3190μs | 17.2721μs | 57.8969 KOps/s | 63.3677 KOps/s | $\textbf{\color{#d91a1a}-8.63\\%}$ |
| test_plain_set_nested_inplace | 65.8840μs | 19.6232μs | 50.9602 KOps/s | 55.7525 KOps/s | $\textbf{\color{#d91a1a}-8.60\\%}$ |
| test_plain_set_stack_nested_inplace | 59.3010μs | 19.4781μs | 51.3396 KOps/s | 55.7285 KOps/s | $\textbf{\color{#d91a1a}-7.88\\%}$ |
| test_items | 43.9020μs | 2.6470μs | 377.7803 KOps/s | 390.3512 KOps/s | $\color{#d91a1a}-3.22\\%$ |
| test_items_nested | 0.5119ms | 0.2749ms | 3.6373 KOps/s | 3.6032 KOps/s | $\color{#35bf28}+0.95\\%$ |
| test_items_nested_locked | 0.4711ms | 0.2754ms | 3.6307 KOps/s | 3.5192 KOps/s | $\color{#35bf28}+3.17\\%$ |
| test_items_nested_leaf | 0.1624ms | 79.9692μs | 12.5048 KOps/s | 12.5520 KOps/s | $\color{#d91a1a}-0.38\\%$ |
| test_items_stack_nested | 0.3789ms | 0.2756ms | 3.6290 KOps/s | 3.5227 KOps/s | $\color{#35bf28}+3.02\\%$ |
| test_items_stack_nested_leaf | 0.1542ms | 78.2781μs | 12.7750 KOps/s | 12.3324 KOps/s | $\color{#35bf28}+3.59\\%$ |
| test_items_stack_nested_locked | 0.5139ms | 0.2746ms | 3.6413 KOps/s | 3.5433 KOps/s | $\color{#35bf28}+2.77\\%$ |
| test_keys | 25.3370μs | 3.8345μs | 260.7872 KOps/s | 248.4669 KOps/s | $\color{#35bf28}+4.96\\%$ |
| test_keys_nested | 0.3004ms | 0.1407ms | 7.1057 KOps/s | 7.2455 KOps/s | $\color{#d91a1a}-1.93\\%$ |
| test_keys_nested_locked | 0.7808ms | 0.1458ms | 6.8606 KOps/s | 6.9597 KOps/s | $\color{#d91a1a}-1.42\\%$ |
| test_keys_nested_leaf | 0.3089ms | 0.1211ms | 8.2577 KOps/s | 8.5205 KOps/s | $\color{#d91a1a}-3.08\\%$ |
| test_keys_stack_nested | 0.2735ms | 0.1370ms | 7.2975 KOps/s | 7.1705 KOps/s | $\color{#35bf28}+1.77\\%$ |
| test_keys_stack_nested_leaf | 0.2340ms | 0.1164ms | 8.5893 KOps/s | 8.6371 KOps/s | $\color{#d91a1a}-0.55\\%$ |
| test_keys_stack_nested_locked | 0.2638ms | 0.1414ms | 7.0702 KOps/s | 6.9973 KOps/s | $\color{#35bf28}+1.04\\%$ |
| test_values | 12.5185μs | 1.1494μs | 870.0091 KOps/s | 855.3076 KOps/s | $\color{#35bf28}+1.72\\%$ |
| test_values_nested | 0.1039ms | 50.9384μs | 19.6316 KOps/s | 19.4091 KOps/s | $\color{#35bf28}+1.15\\%$ |
| test_values_nested_locked | 0.1044ms | 51.2236μs | 19.5223 KOps/s | 19.3444 KOps/s | $\color{#35bf28}+0.92\\%$ |
| test_values_nested_leaf | 93.3450μs | 46.2199μs | 21.6357 KOps/s | 21.5739 KOps/s | $\color{#35bf28}+0.29\\%$ |
| test_values_stack_nested | 0.1515ms | 51.6446μs | 19.3631 KOps/s | 19.3680 KOps/s | $\color{#d91a1a}-0.03\\%$ |
| test_values_stack_nested_leaf | 93.9460μs | 45.4326μs | 22.0106 KOps/s | 21.6113 KOps/s | $\color{#35bf28}+1.85\\%$ |
| test_values_stack_nested_locked | 0.1319ms | 51.9403μs | 19.2529 KOps/s | 19.4385 KOps/s | $\color{#d91a1a}-0.95\\%$ |
| test_membership | 51.4570μs | 1.3857μs | 721.6734 KOps/s | 741.3625 KOps/s | $\color{#d91a1a}-2.66\\%$ |
| test_membership_nested | 44.6330μs | 3.4380μs | 290.8688 KOps/s | 286.3512 KOps/s | $\color{#35bf28}+1.58\\%$ |
| test_membership_nested_leaf | 46.1070μs | 3.4248μs | 291.9857 KOps/s | 286.7711 KOps/s | $\color{#35bf28}+1.82\\%$ |
| test_membership_stacked_nested | 35.3670μs | 3.4536μs | 289.5520 KOps/s | 290.9552 KOps/s | $\color{#d91a1a}-0.48\\%$ |
| test_membership_stacked_nested_leaf | 34.5550μs | 3.4367μs | 290.9773 KOps/s | 287.6806 KOps/s | $\color{#35bf28}+1.15\\%$ |
| test_membership_nested_last | 52.4280μs | 4.3133μs | 231.8413 KOps/s | 234.3451 KOps/s | $\color{#d91a1a}-1.07\\%$ |
| test_membership_nested_leaf_last | 36.7800μs | 4.2433μs | 235.6636 KOps/s | 233.8542 KOps/s | $\color{#35bf28}+0.77\\%$ |
| test_membership_stacked_nested_last | 32.7010μs | 13.5675μs | 73.7057 KOps/s | 236.9840 KOps/s | $\textbf{\color{#d91a1a}-68.90\\%}$ |
| test_membership_stacked_nested_leaf_last | 36.0580μs | 13.5298μs | 73.9109 KOps/s | 233.7861 KOps/s | $\textbf{\color{#d91a1a}-68.39\\%}$ |
| test_nested_getleaf | 49.3730μs | 10.9402μs | 91.4059 KOps/s | 94.6463 KOps/s | $\color{#d91a1a}-3.42\\%$ |
| test_nested_get | 44.1540μs | 10.3342μs | 96.7664 KOps/s | 99.6234 KOps/s | $\color{#d91a1a}-2.87\\%$ |
| test_stacked_getleaf | 51.6370μs | 10.7039μs | 93.4237 KOps/s | 94.2719 KOps/s | $\color{#d91a1a}-0.90\\%$ |
| test_stacked_get | 37.2700μs | 10.0764μs | 99.2416 KOps/s | 100.1364 KOps/s | $\color{#d91a1a}-0.89\\%$ |
| test_nested_getitemleaf | 53.8010μs | 11.4173μs | 87.5862 KOps/s | 88.4572 KOps/s | $\color{#d91a1a}-0.98\\%$ |
| test_nested_getitem | 49.8840μs | 10.5097μs | 95.1506 KOps/s | 95.8483 KOps/s | $\color{#d91a1a}-0.73\\%$ |
| test_stacked_getitemleaf | 36.8090μs | 11.1374μs | 89.7879 KOps/s | 89.2341 KOps/s | $\color{#35bf28}+0.62\\%$ |
| test_stacked_getitem | 42.4590μs | 10.3654μs | 96.4746 KOps/s | 96.1323 KOps/s | $\color{#35bf28}+0.36\\%$ |
| test_lock_nested | 0.9839ms | 0.3430ms | 2.9153 KOps/s | 2.8147 KOps/s | $\color{#35bf28}+3.57\\%$ |
| test_lock_stack_nested | 0.5683ms | 0.2968ms | 3.3697 KOps/s | 3.1561 KOps/s | $\textbf{\color{#35bf28}+6.77\\%}$ |
| test_unlock_nested | 0.8595ms | 0.3542ms | 2.8229 KOps/s | 2.7743 KOps/s | $\color{#35bf28}+1.75\\%$ |
| test_unlock_stack_nested | 0.4546ms | 0.3064ms | 3.2642 KOps/s | 3.1444 KOps/s | $\color{#35bf28}+3.81\\%$ |
| test_flatten_speed | 0.6043ms | 98.8052μs | 10.1209 KOps/s | 10.1087 KOps/s | $\color{#35bf28}+0.12\\%$ |
| test_unflatten_speed | 0.7208ms | 0.4144ms | 2.4132 KOps/s | 2.4077 KOps/s | $\color{#35bf28}+0.23\\%$ |
| test_common_ops | 3.7448ms | 0.7221ms | 1.3848 KOps/s | 1.4417 KOps/s | $\color{#d91a1a}-3.95\\%$ |
| test_creation | 19.2570μs | 1.9473μs | 513.5198 KOps/s | 518.7527 KOps/s | $\color{#d91a1a}-1.01\\%$ |
| test_creation_empty | 42.6400μs | 10.7213μs | 93.2725 KOps/s | 123.3467 KOps/s | $\textbf{\color{#d91a1a}-24.38\\%}$ |
| test_creation_nested_1 | 61.6460μs | 13.2966μs | 75.2073 KOps/s | 93.8562 KOps/s | $\textbf{\color{#d91a1a}-19.87\\%}$ |
| test_creation_nested_2 | 66.1550μs | 16.6806μs | 59.9499 KOps/s | 71.0710 KOps/s | $\textbf{\color{#d91a1a}-15.65\\%}$ |
| test_clone | 1.2750ms | 13.1228μs | 76.2035 KOps/s | 74.9810 KOps/s | $\color{#35bf28}+1.63\\%$ |
| test_getitem[int] | 65.3830μs | 11.5989μs | 86.2154 KOps/s | 86.4798 KOps/s | $\color{#d91a1a}-0.31\\%$ |
| test_getitem[slice_int] | 70.9440μs | 22.2206μs | 45.0032 KOps/s | 43.1796 KOps/s | $\color{#35bf28}+4.22\\%$ |
| test_getitem[range] | 83.8380μs | 60.6558μs | 16.4865 KOps/s | 16.8829 KOps/s | $\color{#d91a1a}-2.35\\%$ |
| test_getitem[tuple] | 73.5080μs | 18.6124μs | 53.7275 KOps/s | 53.8367 KOps/s | $\color{#d91a1a}-0.20\\%$ |
| test_getitem[list] | 0.1726ms | 41.6085μs | 24.0335 KOps/s | 24.3868 KOps/s | $\color{#d91a1a}-1.45\\%$ |
| test_setitem_dim[int] | 72.4960μs | 34.0027μs | 29.4094 KOps/s | 32.8092 KOps/s | $\textbf{\color{#d91a1a}-10.36\\%}$ |
| test_setitem_dim[slice_int] | 94.0070μs | 61.5389μs | 16.2499 KOps/s | 17.0330 KOps/s | $\color{#d91a1a}-4.60\\%$ |
| test_setitem_dim[range] | 0.1498ms | 85.8370μs | 11.6500 KOps/s | 12.2866 KOps/s | $\textbf{\color{#d91a1a}-5.18\\%}$ |
| test_setitem_dim[tuple] | 0.1148ms | 51.2995μs | 19.4934 KOps/s | 21.6858 KOps/s | $\textbf{\color{#d91a1a}-10.11\\%}$ |
| test_setitem | 65.8740μs | 19.5713μs | 51.0952 KOps/s | 52.9776 KOps/s | $\color{#d91a1a}-3.55\\%$ |
| test_set | 65.8530μs | 19.0443μs | 52.5091 KOps/s | 55.9312 KOps/s | $\textbf{\color{#d91a1a}-6.12\\%}$ |
| test_set_shared | 76.8877ms | 0.1693ms | 5.9082 KOps/s | 6.7713 KOps/s | $\textbf{\color{#d91a1a}-12.75\\%}$ |
| test_update | 0.1691ms | 22.4156μs | 44.6118 KOps/s | 52.1519 KOps/s | $\textbf{\color{#d91a1a}-14.46\\%}$ |
| test_update_nested | 0.1161ms | 30.5285μs | 32.7562 KOps/s | 36.1973 KOps/s | $\textbf{\color{#d91a1a}-9.51\\%}$ |
| test_update__nested | 58.4800μs | 24.6076μs | 40.6378 KOps/s | 39.6609 KOps/s | $\color{#35bf28}+2.46\\%$ |
| test_set_nested | 88.5560μs | 21.0675μs | 47.4664 KOps/s | 50.6438 KOps/s | $\textbf{\color{#d91a1a}-6.27\\%}$ |
| test_set_nested_new | 84.8590μs | 25.0078μs | 39.9876 KOps/s | 41.5468 KOps/s | $\color{#d91a1a}-3.75\\%$ |
| test_select | 0.1180ms | 40.3079μs | 24.8090 KOps/s | 25.1883 KOps/s | $\color{#d91a1a}-1.51\\%$ |
| test_select_nested | 0.1382ms | 58.6807μs | 17.0414 KOps/s | 17.4770 KOps/s | $\color{#d91a1a}-2.49\\%$ |
| test_exclude_nested | 0.2280ms | 0.1211ms | 8.2573 KOps/s | 8.4930 KOps/s | $\color{#d91a1a}-2.77\\%$ |
| test_empty[True] | 0.5151ms | 0.3992ms | 2.5051 KOps/s | 2.5413 KOps/s | $\color{#d91a1a}-1.42\\%$ |
| test_empty[False] | 6.7226μs | 1.0631μs | 940.6130 KOps/s | 915.4568 KOps/s | $\color{#35bf28}+2.75\\%$ |
| test_unbind_speed | 1.7892ms | 0.2558ms | 3.9091 KOps/s | 4.0649 KOps/s | $\color{#d91a1a}-3.83\\%$ |
| test_unbind_speed_stack0 | 0.3262ms | 0.2426ms | 4.1223 KOps/s | 4.0950 KOps/s | $\color{#35bf28}+0.67\\%$ |
| test_unbind_speed_stack1 | 80.1716ms | 0.7213ms | 1.3864 KOps/s | 1.3522 KOps/s | $\color{#35bf28}+2.53\\%$ |
| test_split | 78.6365ms | 1.6275ms | 614.4520 Ops/s | 611.8575 Ops/s | $\color{#35bf28}+0.42\\%$ |
| test_chunk | 79.8825ms | 1.6682ms | 599.4636 Ops/s | 591.5005 Ops/s | $\color{#35bf28}+1.35\\%$ |
| test_creation[device0] | 3.9867ms | 86.9954μs | 11.4949 KOps/s | 11.6291 KOps/s | $\color{#d91a1a}-1.15\\%$ |
| test_creation_from_tensor | 0.4218ms | 86.5545μs | 11.5534 KOps/s | 11.4282 KOps/s | $\color{#35bf28}+1.10\\%$ |
| test_add_one[memmap_tensor0] | 0.1142ms | 5.2540μs | 190.3308 KOps/s | 186.8112 KOps/s | $\color{#35bf28}+1.88\\%$ |
| test_contiguous[memmap_tensor0] | 14.0960μs | 0.6360μs | 1.5724 MOps/s | 1.5830 MOps/s | $\color{#d91a1a}-0.67\\%$ |
| test_stack[memmap_tensor0] | 29.3250μs | 3.5296μs | 283.3182 KOps/s | 280.4247 KOps/s | $\color{#35bf28}+1.03\\%$ |
| test_memmaptd_index | 1.3858ms | 0.2600ms | 3.8468 KOps/s | 3.9232 KOps/s | $\color{#d91a1a}-1.95\\%$ |
| test_memmaptd_index_astensor | 0.7701ms | 0.3310ms | 3.0211 KOps/s | 3.0065 KOps/s | $\color{#35bf28}+0.48\\%$ |
| test_memmaptd_index_op | 1.2096ms | 0.6180ms | 1.6181 KOps/s | 1.7360 KOps/s | $\textbf{\color{#d91a1a}-6.79\\%}$ |
| test_serialize_model | 0.1815s | 0.1086s | 9.2047 Ops/s | 8.6872 Ops/s | $\textbf{\color{#35bf28}+5.96\\%}$ |
| test_serialize_model_pickle | 0.4510s | 0.3825s | 2.6143 Ops/s | 2.6671 Ops/s | $\color{#d91a1a}-1.98\\%$ |
| test_serialize_weights | 0.1053s | 0.1007s | 9.9332 Ops/s | 10.0519 Ops/s | $\color{#d91a1a}-1.18\\%$ |
| test_serialize_weights_returnearly | 0.1997s | 0.1304s | 7.6664 Ops/s | 8.0284 Ops/s | $\color{#d91a1a}-4.51\\%$ |
| test_serialize_weights_pickle | 0.8967s | 0.6095s | 1.6407 Ops/s | 2.4708 Ops/s | $\textbf{\color{#d91a1a}-33.60\\%}$ |
| test_serialize_weights_filesystem | 98.2693ms | 93.9948ms | 10.6389 Ops/s | 10.0972 Ops/s | $\textbf{\color{#35bf28}+5.37\\%}$ |
| test_serialize_model_filesystem | 0.1033s | 95.2308ms | 10.5008 Ops/s | 9.1142 Ops/s | $\textbf{\color{#35bf28}+15.21\\%}$ |
| test_reshape_pytree | 62.4770μs | 25.8629μs | 38.6654 KOps/s | 38.7562 KOps/s | $\color{#d91a1a}-0.23\\%$ |
| test_reshape_td | 0.1022ms | 33.6953μs | 29.6777 KOps/s | 29.5964 KOps/s | $\color{#35bf28}+0.27\\%$ |
| test_view_pytree | 66.9160μs | 25.7784μs | 38.7921 KOps/s | 39.2944 KOps/s | $\color{#d91a1a}-1.28\\%$ |
| test_view_td | 87.5840μs | 38.7494μs | 25.8069 KOps/s | 25.5258 KOps/s | $\color{#35bf28}+1.10\\%$ |
| test_unbind_pytree | 60.9850μs | 29.6387μs | 33.7397 KOps/s | 34.2231 KOps/s | $\color{#d91a1a}-1.41\\%$ |
| test_unbind_td | 0.4460ms | 36.5309μs | 27.3741 KOps/s | 27.7361 KOps/s | $\color{#d91a1a}-1.31\\%$ |
| test_split_pytree | 81.1020μs | 30.0356μs | 33.2939 KOps/s | 33.6495 KOps/s | $\color{#d91a1a}-1.06\\%$ |
| test_split_td | 0.1259ms | 39.6551μs | 25.2174 KOps/s | 25.2330 KOps/s | $\color{#d91a1a}-0.06\\%$ |
| test_add_pytree | 73.7380μs | 35.1559μs | 28.4447 KOps/s | 28.7098 KOps/s | $\color{#d91a1a}-0.92\\%$ |
| test_add_td | 0.1254ms | 54.7245μs | 18.2734 KOps/s | 20.1343 KOps/s | $\textbf{\color{#d91a1a}-9.24\\%}$ |
| test_distributed | 0.2143ms | 0.1033ms | 9.6764 KOps/s | 9.5254 KOps/s | $\color{#35bf28}+1.59\\%$ |
| test_tdmodule | 87.6640μs | 18.1105μs | 55.2166 KOps/s | 58.7234 KOps/s | $\textbf{\color{#d91a1a}-5.97\\%}$ |
| test_tdmodule_dispatch | 64.6110μs | 35.8488μs | 27.8950 KOps/s | 30.1762 KOps/s | $\textbf{\color{#d91a1a}-7.56\\%}$ |
| test_tdseq | 36.0470μs | 21.3289μs | 46.8847 KOps/s | 51.4446 KOps/s | $\textbf{\color{#d91a1a}-8.86\\%}$ |
| test_tdseq_dispatch | 84.7890μs | 42.7518μs | 23.3908 KOps/s | 26.4961 KOps/s | $\textbf{\color{#d91a1a}-11.72\\%}$ |
| test_instantiation_functorch | 1.8233ms | 1.3153ms | 760.2891 Ops/s | 757.9602 Ops/s | $\color{#35bf28}+0.31\\%$ |
| test_instantiation_td | 2.0263ms | 1.0211ms | 979.3751 Ops/s | 985.2984 Ops/s | $\color{#d91a1a}-0.60\\%$ |
| test_exec_functorch | 0.2928ms | 0.1628ms | 6.1412 KOps/s | 6.0225 KOps/s | $\color{#35bf28}+1.97\\%$ |
| test_exec_functional_call | 0.8753ms | 0.1582ms | 6.3230 KOps/s | 6.6014 KOps/s | $\color{#d91a1a}-4.22\\%$ |
| test_exec_td | 0.2794ms | 0.1486ms | 6.7308 KOps/s | 6.6925 KOps/s | $\color{#35bf28}+0.57\\%$ |
| test_exec_td_decorator | 0.9285ms | 0.2263ms | 4.4191 KOps/s | 4.3949 KOps/s | $\color{#35bf28}+0.55\\%$ |
| test_vmap_mlp_speed[True-True] | 0.6446ms | 0.4839ms | 2.0664 KOps/s | 2.0336 KOps/s | $\color{#35bf28}+1.61\\%$ |
| test_vmap_mlp_speed[True-False] | 0.6228ms | 0.4824ms | 2.0728 KOps/s | 2.0686 KOps/s | $\color{#35bf28}+0.20\\%$ |
| test_vmap_mlp_speed[False-True] | 0.6401ms | 0.3936ms | 2.5405 KOps/s | 2.4980 KOps/s | $\color{#35bf28}+1.70\\%$ |
| test_vmap_mlp_speed[False-False] | 0.5367ms | 0.3926ms | 2.5474 KOps/s | 2.5189 KOps/s | $\color{#35bf28}+1.13\\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.1908ms | 0.5644ms | 1.7719 KOps/s | 1.7933 KOps/s | $\color{#d91a1a}-1.19\\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.8452ms | 0.5569ms | 1.7956 KOps/s | 1.8118 KOps/s | $\color{#d91a1a}-0.89\\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.8087ms | 0.4593ms | 2.1771 KOps/s | 2.1819 KOps/s | $\color{#d91a1a}-0.22\\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.8107ms | 0.4602ms | 2.1730 KOps/s | 2.1446 KOps/s | $\color{#35bf28}+1.33\\%$ |
| test_to_module_speed[True] | 3.4995ms | 1.7264ms | 579.2266 Ops/s | 587.6549 Ops/s | $\color{#d91a1a}-1.43\\%$ |
| test_to_module_speed[False] | 1.9225ms | 1.7021ms | 587.5222 Ops/s | 597.5226 Ops/s | $\color{#d91a1a}-1.67\\%$ |
| test_tc_init | 0.1140ms | 54.5429μs | 18.3342 KOps/s | 19.2815 KOps/s | $\color{#d91a1a}-4.91\\%$ |
| test_tc_init_nested | 0.2116ms | 0.1115ms | 8.9664 KOps/s | 10.0286 KOps/s | $\textbf{\color{#d91a1a}-10.59\\%}$ |
| test_tc_first_layer_tensor | 34.5850μs | 8.5122μs | 117.4789 KOps/s | 120.2152 KOps/s | $\color{#d91a1a}-2.28\\%$ |
| test_tc_first_layer_nontensor | 35.8280μs | 8.5780μs | 116.5771 KOps/s | 121.4015 KOps/s | $\color{#d91a1a}-3.97\\%$ |
| test_tc_second_layer_tensor | 29.7050μs | 2.5887μs | 386.2896 KOps/s | 397.7623 KOps/s | $\color{#d91a1a}-2.88\\%$ |
| test_tc_second_layer_nontensor | 41.6580μs | 9.5620μs | 104.5809 KOps/s | 107.1160 KOps/s | $\color{#d91a1a}-2.37\\%$ |
| test_unbind | 89.3256ms | 14.9400ms | 66.9342 Ops/s | 61.0002 Ops/s | $\textbf{\color{#35bf28}+9.73\\%}$ |
| test_full_like | 15.9554ms | 12.6801ms | 78.8640 Ops/s | 70.4421 Ops/s | $\textbf{\color{#35bf28}+11.96\\%}$ |
| test_zeros_like | 13.0824ms | 6.6688ms | 149.9524 Ops/s | 149.5659 Ops/s | $\color{#35bf28}+0.26\\%$ |
| test_ones_like | 13.0072ms | 7.1384ms | 140.0881 Ops/s | 137.2055 Ops/s | $\color{#35bf28}+2.10\\%$ |
| test_clone | 12.6697ms | 9.3429ms | 107.0330 Ops/s | 102.9261 Ops/s | $\color{#35bf28}+3.99\\%$ |
| test_squeeze | 76.5230μs | 12.8636μs | 77.7385 KOps/s | 79.4915 KOps/s | $\color{#d91a1a}-2.21\\%$ |
| test_unsqueeze | 0.1934ms | 99.5863μs | 10.0415 KOps/s | 10.1390 KOps/s | $\color{#d91a1a}-0.96\\%$ |
| test_split | 0.5608ms | 0.2771ms | 3.6083 KOps/s | 3.4945 KOps/s | $\color{#35bf28}+3.26\\%$ |
| test_permute | 0.4537ms | 0.2236ms | 4.4719 KOps/s | 4.4105 KOps/s | $\color{#35bf28}+1.39\\%$ |
| test_stack | 35.5322ms | 24.5878ms | 40.6706 Ops/s | 37.7817 Ops/s | $\textbf{\color{#35bf28}+7.65\\%}$ |
| test_cat | 26.4019ms | 23.9361ms | 41.7779 Ops/s | 39.4699 Ops/s | $\textbf{\color{#35bf28}+5.85\\%}$ |