pytorch / tensordict

TensorDict is a PyTorch-dedicated tensor container.
MIT License

[Doc] Fix symbolic trace reference in doc #918

Status: Closed (closed by vmoens 3 months ago)

github-actions[bot] commented 3 months ago

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 213. Improved: 13. Worsened: 12.

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 52.9700μs | 22.1158μs | 45.2165 KOps/s | 47.2252 KOps/s | -4.25% |
| test_plain_set_stack_nested | 52.2370μs | 22.0031μs | 45.4482 KOps/s | 46.1818 KOps/s | -1.59% |
| test_plain_set_nested_inplace | 59.4720μs | 24.0309μs | 41.6131 KOps/s | 42.6284 KOps/s | -2.38% |
| test_plain_set_stack_nested_inplace | 55.9350μs | 24.0007μs | 41.6655 KOps/s | 42.9632 KOps/s | -3.02% |
| test_items | 24.5060μs | 2.6302μs | 380.2051 KOps/s | 351.5449 KOps/s | **+8.15%** |
| test_items_nested | 0.3944ms | 0.3358ms | 2.9779 KOps/s | 2.9330 KOps/s | +1.53% |
| test_items_nested_locked | 0.6271ms | 0.3375ms | 2.9626 KOps/s | 2.9358 KOps/s | +0.91% |
| test_items_nested_leaf | 0.1469ms | 87.5532μs | 11.4216 KOps/s | 11.8274 KOps/s | -3.43% |
| test_items_stack_nested | 2.2591ms | 0.3387ms | 2.9528 KOps/s | 2.9478 KOps/s | +0.17% |
| test_items_stack_nested_leaf | 0.1952ms | 88.6204μs | 11.2841 KOps/s | 12.0055 KOps/s | **-6.01%** |
| test_items_stack_nested_locked | 0.4251ms | 0.3405ms | 2.9370 KOps/s | 2.9313 KOps/s | +0.19% |
| test_keys | 24.1850μs | 3.8846μs | 257.4242 KOps/s | 246.4953 KOps/s | +4.43% |
| test_keys_nested | 0.2535ms | 0.1442ms | 6.9333 KOps/s | 6.8009 KOps/s | +1.95% |
| test_keys_nested_locked | 0.7407ms | 0.1498ms | 6.6742 KOps/s | 6.5520 KOps/s | +1.86% |
| test_keys_nested_leaf | 0.2109ms | 0.1248ms | 8.0145 KOps/s | 7.8943 KOps/s | +1.52% |
| test_keys_stack_nested | 0.2425ms | 0.1437ms | 6.9597 KOps/s | 6.9470 KOps/s | +0.18% |
| test_keys_stack_nested_leaf | 0.2118ms | 0.1221ms | 8.1911 KOps/s | 8.0394 KOps/s | +1.89% |
| test_keys_stack_nested_locked | 0.3974ms | 0.1486ms | 6.7294 KOps/s | 6.6983 KOps/s | +0.46% |
| test_values | 7.9800μs | 1.0575μs | 945.6168 KOps/s | 848.0727 KOps/s | **+11.50%** |
| test_values_nested | 95.8100μs | 50.4398μs | 19.8256 KOps/s | 19.6433 KOps/s | +0.93% |
| test_values_nested_locked | 0.1001ms | 49.8534μs | 20.0588 KOps/s | 19.5741 KOps/s | +2.48% |
| test_values_nested_leaf | 82.9560μs | 45.0591μs | 22.1931 KOps/s | 21.8215 KOps/s | +1.70% |
| test_values_stack_nested | 0.1791ms | 51.4819μs | 19.4243 KOps/s | 18.9848 KOps/s | +2.31% |
| test_values_stack_nested_leaf | 89.2770μs | 44.9128μs | 22.2654 KOps/s | 22.2126 KOps/s | +0.24% |
| test_values_stack_nested_locked | 0.1001ms | 51.1286μs | 19.5585 KOps/s | 19.3600 KOps/s | +1.03% |
| test_membership | 2.8734μs | 0.7346μs | 1.3613 MOps/s | 1.0795 MOps/s | **+26.10%** |
| test_membership_nested | 25.0670μs | 2.5883μs | 386.3576 KOps/s | 376.0086 KOps/s | +2.75% |
| test_membership_nested_leaf | 19.2460μs | 2.5929μs | 385.6724 KOps/s | 370.0933 KOps/s | +4.21% |
| test_membership_stacked_nested | 30.1760μs | 2.6464μs | 377.8670 KOps/s | 376.3419 KOps/s | +0.41% |
| test_membership_stacked_nested_leaf | 29.1640μs | 2.6361μs | 379.3480 KOps/s | 379.4488 KOps/s | -0.03% |
| test_membership_nested_last | 27.6320μs | 3.9674μs | 252.0566 KOps/s | 250.3802 KOps/s | +0.67% |
| test_membership_nested_leaf_last | 39.4540μs | 4.0017μs | 249.8911 KOps/s | 251.1824 KOps/s | -0.51% |
| test_membership_stacked_nested_last | 53.5600μs | 9.3908μs | 106.4870 KOps/s | 78.4726 KOps/s | **+35.70%** |
| test_membership_stacked_nested_leaf_last | 41.5280μs | 9.3893μs | 106.5038 KOps/s | 78.8362 KOps/s | **+35.10%** |
| test_nested_getleaf | 39.7340μs | 10.6912μs | 93.5350 KOps/s | 92.3845 KOps/s | +1.25% |
| test_nested_get | 35.4770μs | 10.1605μs | 98.4203 KOps/s | 99.7919 KOps/s | -1.37% |
| test_stacked_getleaf | 37.5200μs | 10.3813μs | 96.3268 KOps/s | 95.9154 KOps/s | +0.43% |
| test_stacked_get | 30.1470μs | 9.9412μs | 100.5920 KOps/s | 102.1613 KOps/s | -1.54% |
| test_nested_getitemleaf | 35.5160μs | 11.0956μs | 90.1257 KOps/s | 89.9219 KOps/s | +0.23% |
| test_nested_getitem | 40.9570μs | 10.1977μs | 98.0618 KOps/s | 98.6190 KOps/s | -0.56% |
| test_stacked_getitemleaf | 40.5160μs | 11.0911μs | 90.1622 KOps/s | 91.0164 KOps/s | -0.94% |
| test_stacked_getitem | 47.7600μs | 10.2872μs | 97.2079 KOps/s | 99.1878 KOps/s | -2.00% |
| test_lock_nested | 6.9129ms | 0.5005ms | 1.9979 KOps/s | 1.9710 KOps/s | +1.37% |
| test_lock_stack_nested | 0.6915ms | 0.4519ms | 2.2128 KOps/s | 2.2487 KOps/s | -1.59% |
| test_unlock_nested | 90.7510ms | 0.5054ms | 1.9785 KOps/s | 2.3734 KOps/s | **-16.64%** |
| test_unlock_stack_nested | 0.4581ms | 0.3662ms | 2.7304 KOps/s | 2.7773 KOps/s | -1.69% |
| test_flatten_speed | 0.6252ms | 0.1068ms | 9.3650 KOps/s | 9.6922 KOps/s | -3.38% |
| test_unflatten_speed | 0.9773ms | 0.4318ms | 2.3157 KOps/s | 2.3215 KOps/s | -0.25% |
| test_common_ops | 4.2508ms | 1.0793ms | 926.5096 Ops/s | 934.7374 Ops/s | -0.88% |
| test_creation | 14.3370μs | 2.0328μs | 491.9365 KOps/s | 495.3525 KOps/s | -0.69% |
| test_creation_empty | 48.9010μs | 18.0492μs | 55.4041 KOps/s | 57.7402 KOps/s | -4.05% |
| test_creation_nested_1 | 60.7740μs | 21.3067μs | 46.9336 KOps/s | 49.1052 KOps/s | -4.42% |
| test_creation_nested_2 | 85.9110μs | 24.6867μs | 40.5077 KOps/s | 42.1069 KOps/s | -3.80% |
| test_clone | 0.1125ms | 17.4473μs | 57.3154 KOps/s | 59.5533 KOps/s | -3.76% |
| test_getitem[int] | 1.1822ms | 16.8432μs | 59.3712 KOps/s | 61.5940 KOps/s | -3.61% |
| test_getitem[slice_int] | 0.1273ms | 31.5690μs | 31.6766 KOps/s | 33.2282 KOps/s | -4.67% |
| test_getitem[range] | 0.3301ms | 58.2921μs | 17.1550 KOps/s | 17.7949 KOps/s | -3.60% |
| test_getitem[tuple] | 0.1498ms | 26.9893μs | 37.0518 KOps/s | 40.4465 KOps/s | **-8.39%** |
| test_getitem[list] | 0.2422ms | 53.0066μs | 18.8656 KOps/s | 19.0406 KOps/s | -0.92% |
| test_setitem_dim[int] | 74.5300μs | 39.4771μs | 25.3311 KOps/s | 25.4885 KOps/s | -0.62% |
| test_setitem_dim[slice_int] | 0.1312ms | 70.1076μs | 14.2638 KOps/s | 14.5780 KOps/s | -2.16% |
| test_setitem_dim[range] | 0.1324ms | 93.3085μs | 10.7171 KOps/s | 10.7995 KOps/s | -0.76% |
| test_setitem_dim[tuple] | 99.7570μs | 57.0539μs | 17.5273 KOps/s | 17.7904 KOps/s | -1.48% |
| test_setitem | 0.1022ms | 29.3254μs | 34.1002 KOps/s | 35.6094 KOps/s | -4.24% |
| test_set | 0.1087ms | 28.4998μs | 35.0880 KOps/s | 36.5836 KOps/s | -4.09% |
| test_set_shared | 3.9517ms | 0.2162ms | 4.6259 KOps/s | 4.6743 KOps/s | -1.04% |
| test_update | 0.1810ms | 35.0716μs | 28.5131 KOps/s | 29.4769 KOps/s | -3.27% |
| test_update_nested | 0.1462ms | 44.8943μs | 22.2745 KOps/s | 22.6820 KOps/s | -1.80% |
| test_update__nested | 0.1358ms | 34.7162μs | 28.8050 KOps/s | 29.2684 KOps/s | -1.58% |
| test_set_nested | 0.1166ms | 30.9885μs | 32.2700 KOps/s | 33.6298 KOps/s | -4.04% |
| test_set_nested_new | 0.1576ms | 35.7146μs | 27.9998 KOps/s | 29.0568 KOps/s | -3.64% |
| test_select | 0.2074ms | 52.8401μs | 18.9250 KOps/s | 19.6765 KOps/s | -3.82% |
| test_select_nested | 0.1130ms | 59.0501μs | 16.9348 KOps/s | 16.7791 KOps/s | +0.93% |
| test_exclude_nested | 0.1451ms | 76.7117μs | 13.0358 KOps/s | 12.6885 KOps/s | +2.74% |
| test_empty[True] | 0.5547ms | 0.3199ms | 3.1255 KOps/s | 3.1130 KOps/s | +0.40% |
| test_empty[False] | 9.4196μs | 1.1454μs | 873.0875 KOps/s | 844.7930 KOps/s | +3.35% |
| test_unbind_speed | 0.4993ms | 0.2993ms | 3.3411 KOps/s | 3.2071 KOps/s | +4.18% |
| test_unbind_speed_stack0 | 0.4263ms | 0.2937ms | 3.4053 KOps/s | 3.4411 KOps/s | -1.04% |
| test_unbind_speed_stack1 | 0.1044s | 0.7895ms | 1.2667 KOps/s | 1.4069 KOps/s | **-9.96%** |
| test_split | 0.1049s | 2.2331ms | 447.8059 Ops/s | 471.7569 Ops/s | **-5.08%** |
| test_chunk | 0.1014s | 2.2241ms | 449.6191 Ops/s | 469.4144 Ops/s | -4.22% |
| test_creation[device0] | 0.2211ms | 0.1186ms | 8.4290 KOps/s | 8.4419 KOps/s | -0.15% |
| test_creation_from_tensor | 4.9286ms | 0.1210ms | 8.2622 KOps/s | 8.2306 KOps/s | +0.38% |
| test_add_one[memmap_tensor0] | 0.2570ms | 7.7462μs | 129.0959 KOps/s | 128.5209 KOps/s | +0.45% |
| test_contiguous[memmap_tensor0] | 43.9830μs | 2.0226μs | 494.4063 KOps/s | 491.6332 KOps/s | +0.56% |
| test_stack[memmap_tensor0] | 70.3320μs | 5.7930μs | 172.6228 KOps/s | 175.4322 KOps/s | -1.60% |
| test_memmaptd_index | 1.0691ms | 0.4090ms | 2.4451 KOps/s | 2.4588 KOps/s | -0.55% |
| test_memmaptd_index_astensor | 1.1342ms | 0.4878ms | 2.0498 KOps/s | 2.0639 KOps/s | -0.68% |
| test_memmaptd_index_op | 1.4231ms | 1.0345ms | 966.6693 Ops/s | 976.7053 Ops/s | -1.03% |
| test_serialize_model | 0.1330s | 0.1279s | 7.8168 Ops/s | 6.8433 Ops/s | **+14.23%** |
| test_serialize_model_pickle | 0.5019s | 0.4056s | 2.4658 Ops/s | 2.4424 Ops/s | +0.96% |
| test_serialize_weights | 0.2150s | 0.1397s | 7.1583 Ops/s | 7.8904 Ops/s | **-9.28%** |
| test_serialize_weights_returnearly | 0.1865s | 0.1705s | 5.8650 Ops/s | 5.9274 Ops/s | -1.05% |
| test_serialize_weights_pickle | 0.4900s | 0.3895s | 2.5677 Ops/s | 2.5389 Ops/s | +1.13% |
| test_serialize_weights_filesystem | 0.1513s | 0.1429s | 6.9982 Ops/s | 6.5161 Ops/s | **+7.40%** |
| test_serialize_model_filesystem | 0.2293s | 0.1648s | 6.0686 Ops/s | 6.4338 Ops/s | **-5.68%** |
| test_reshape_pytree | 89.2880μs | 39.5368μs | 25.2929 KOps/s | 25.4949 KOps/s | -0.79% |
| test_reshape_td | 0.1052ms | 46.1955μs | 21.6471 KOps/s | 21.8086 KOps/s | -0.74% |
| test_view_pytree | 0.1025ms | 39.7239μs | 25.1738 KOps/s | 25.4968 KOps/s | -1.27% |
| test_view_td | 0.1138ms | 52.5226μs | 19.0394 KOps/s | 19.0752 KOps/s | -0.19% |
| test_unbind_pytree | 85.5610μs | 37.3614μs | 26.7656 KOps/s | 27.3514 KOps/s | -2.14% |
| test_unbind_td | 0.4251ms | 45.3209μs | 22.0649 KOps/s | 21.9813 KOps/s | +0.38% |
| test_split_pytree | 0.1007ms | 39.7820μs | 25.1370 KOps/s | 25.5432 KOps/s | -1.59% |
| test_split_td | 0.5572ms | 58.5116μs | 17.0906 KOps/s | 17.6592 KOps/s | -3.22% |
| test_add_pytree | 98.3650μs | 45.9463μs | 21.7645 KOps/s | 21.5666 KOps/s | +0.92% |
| test_add_td | 0.2216ms | 80.8699μs | 12.3655 KOps/s | 12.6337 KOps/s | -2.12% |
| test_compile_add_one_nested[tensordict-compile] | 0.1030ms | 53.6979μs | 18.6227 KOps/s | 18.0114 KOps/s | +3.39% |
| test_compile_add_one_nested[tensordict-eager] | 0.4021ms | 0.1862ms | 5.3704 KOps/s | 5.3822 KOps/s | -0.22% |
| test_compile_add_one_nested[pytree-compile] | 0.2012ms | 54.3950μs | 18.3840 KOps/s | 18.2948 KOps/s | +0.49% |
| test_compile_add_one_nested[pytree-eager] | 0.3016ms | 0.1459ms | 6.8540 KOps/s | 6.8541 KOps/s | -0.00% |
| test_compile_copy_nested[tensordict-compile] | 50.5440μs | 20.4045μs | 49.0088 KOps/s | 46.5352 KOps/s | **+5.32%** |
| test_compile_copy_nested[tensordict-eager] | 0.1701ms | 62.9611μs | 15.8828 KOps/s | 15.6524 KOps/s | +1.47% |
| test_compile_copy_nested[pytree-compile] | 0.1711ms | 79.0419μs | 12.6515 KOps/s | 12.6631 KOps/s | -0.09% |
| test_compile_copy_nested[pytree-eager] | 0.1328ms | 71.7303μs | 13.9411 KOps/s | 13.8659 KOps/s | +0.54% |
| test_compile_add_one_flat[tensordict-compile] | 0.2874ms | 0.1733ms | 5.7706 KOps/s | 5.7752 KOps/s | -0.08% |
| test_compile_add_one_flat[tensordict-eager] | 0.4044ms | 0.1954ms | 5.1182 KOps/s | 5.1377 KOps/s | -0.38% |
| test_compile_add_one_flat[tensorclass-compile] | 0.1215ms | 39.2218μs | 25.4961 KOps/s | 24.5270 KOps/s | +3.95% |
| test_compile_add_one_flat[tensorclass-eager] | 1.4921ms | 68.1786μs | 14.6674 KOps/s | 14.6508 KOps/s | +0.11% |
| test_compile_add_one_flat[pytree-compile] | 0.2794ms | 0.1721ms | 5.8096 KOps/s | 5.7512 KOps/s | +1.02% |
| test_compile_add_one_flat[pytree-eager] | 0.5083ms | 0.3023ms | 3.3081 KOps/s | 3.3419 KOps/s | -1.01% |
| test_compile_add_self_flat[tensordict-eager] | 0.4279ms | 0.2088ms | 4.7896 KOps/s | 4.7224 KOps/s | +1.42% |
| test_compile_add_self_flat[tensordict-compile] | 0.3302ms | 0.1830ms | 5.4635 KOps/s | 5.6755 KOps/s | -3.74% |
| test_compile_add_self_flat[tensorclass-eager] | 0.1727ms | 62.5863μs | 15.9779 KOps/s | 16.1636 KOps/s | -1.15% |
| test_compile_add_self_flat[tensorclass-compile] | 82.1640μs | 39.7098μs | 25.1827 KOps/s | 24.9593 KOps/s | +0.89% |
| test_compile_add_self_flat[pytree-eager] | 0.5319ms | 0.2469ms | 4.0494 KOps/s | 4.1269 KOps/s | -1.88% |
| test_compile_add_self_flat[pytree-compile] | 0.3432ms | 0.1739ms | 5.7496 KOps/s | 5.8187 KOps/s | -1.19% |
| test_compile_copy_flat[tensordict-compile] | 0.1965ms | 0.1076ms | 9.2899 KOps/s | 9.3818 KOps/s | -0.98% |
| test_compile_copy_flat[tensordict-eager] | 0.1238ms | 55.8418μs | 17.9077 KOps/s | 17.3460 KOps/s | +3.24% |
| test_compile_copy_flat[pytree-compile] | 0.1683ms | 82.2672μs | 12.1555 KOps/s | 12.7137 KOps/s | -4.39% |
| test_compile_copy_flat[pytree-eager] | 0.1500ms | 71.9285μs | 13.9027 KOps/s | 13.9399 KOps/s | -0.27% |
| test_compile_assign_and_add[tensordict-compile] | 0.2850ms | 0.1902ms | 5.2572 KOps/s | 5.3490 KOps/s | -1.72% |
| test_compile_assign_and_add[tensordict-eager] | 2.8751ms | 1.8654ms | 536.0653 Ops/s | 606.5114 Ops/s | **-11.61%** |
| test_compile_assign_and_add[pytree-compile] | 0.4238ms | 0.1910ms | 5.2367 KOps/s | 5.2480 KOps/s | -0.22% |
| test_compile_assign_and_add[pytree-eager] | 1.3831ms | 1.1002ms | 908.9523 Ops/s | 906.6653 Ops/s | +0.25% |
| test_compile_assign_and_add_stack[compile] | 0.7418ms | 0.4250ms | 2.3528 KOps/s | 2.3461 KOps/s | +0.29% |
| test_compile_assign_and_add_stack[eager] | 4.0987ms | 3.8131ms | 262.2530 Ops/s | 269.2811 Ops/s | -2.61% |
| test_compile_indexing[tensor-tensordict-compile] | 87.8940μs | 33.5780μs | 29.7814 KOps/s | 30.6657 KOps/s | -2.88% |
| test_compile_indexing[tensor-tensordict-eager] | 0.7794ms | 48.9824μs | 20.4155 KOps/s | 20.7107 KOps/s | -1.43% |
| test_compile_indexing[tensor-tensorclass-compile] | 67.9370μs | 28.8982μs | 34.6042 KOps/s | 36.1692 KOps/s | -4.33% |
| test_compile_indexing[tensor-tensorclass-eager] | 83.5260μs | 29.7261μs | 33.6405 KOps/s | 34.1237 KOps/s | -1.42% |
| test_compile_indexing[tensor-pytree-compile] | 0.1169ms | 28.8823μs | 34.6233 KOps/s | 35.7654 KOps/s | -3.19% |
| test_compile_indexing[tensor-pytree-eager] | 72.2850μs | 30.0855μs | 33.2386 KOps/s | 33.6801 KOps/s | -1.31% |
| test_compile_indexing[slice-tensordict-compile] | 0.1600ms | 71.0117μs | 14.0822 KOps/s | 14.0836 KOps/s | -0.01% |
| test_compile_indexing[slice-tensordict-eager] | 0.3856ms | 27.6191μs | 36.2068 KOps/s | 37.5238 KOps/s | -3.51% |
| test_compile_indexing[slice-tensorclass-compile] | 0.1246ms | 66.5966μs | 15.0158 KOps/s | 14.7046 KOps/s | +2.12% |
| test_compile_indexing[slice-tensorclass-eager] | 75.8530μs | 25.6367μs | 39.0065 KOps/s | 42.0845 KOps/s | **-7.31%** |
| test_compile_indexing[slice-pytree-compile] | 0.1268ms | 67.1880μs | 14.8836 KOps/s | 14.4907 KOps/s | +2.71% |
| test_compile_indexing[slice-pytree-eager] | 62.8180μs | 25.0780μs | 39.8755 KOps/s | 42.0978 KOps/s | **-5.28%** |
| test_compile_indexing[int-tensordict-compile] | 0.1574ms | 71.1025μs | 14.0642 KOps/s | 14.0073 KOps/s | +0.41% |
| test_compile_indexing[int-tensordict-eager] | 0.7618ms | 27.6139μs | 36.2137 KOps/s | 37.4683 KOps/s | -3.35% |
| test_compile_indexing[int-tensorclass-compile] | 0.2610ms | 68.9139μs | 14.5109 KOps/s | 14.7203 KOps/s | -1.42% |
| test_compile_indexing[int-tensorclass-eager] | 79.3890μs | 24.3626μs | 41.0465 KOps/s | 42.1443 KOps/s | -2.60% |
| test_compile_indexing[int-pytree-compile] | 0.1357ms | 66.8305μs | 14.9632 KOps/s | 14.8066 KOps/s | +1.06% |
| test_compile_indexing[int-pytree-eager] | 70.0710μs | 24.2757μs | 41.1934 KOps/s | 43.0249 KOps/s | -4.26% |
| test_mod_add[eager] | 95.9290μs | 24.0197μs | 41.6325 KOps/s | 42.1721 KOps/s | -1.28% |
| test_mod_add[compile] | 0.1353ms | 37.3798μs | 26.7524 KOps/s | 27.3139 KOps/s | -2.06% |
| test_mod_add[compile-overhead] | 0.1356ms | 36.9146μs | 27.0895 KOps/s | 26.8843 KOps/s | +0.76% |
| test_mod_wrap[eager] | 0.3634ms | 0.2077ms | 4.8154 KOps/s | 4.6586 KOps/s | +3.37% |
| test_mod_wrap[compile] | 1.7596ms | 0.2259ms | 4.4263 KOps/s | 4.3326 KOps/s | +2.16% |
| test_mod_wrap[compile-overhead] | 0.4481ms | 0.2238ms | 4.4679 KOps/s | 4.3733 KOps/s | +2.16% |
| test_mod_wrap_and_backward[eager] | 12.8100ms | 11.1978ms | 89.3031 Ops/s | 89.9365 Ops/s | -0.70% |
| test_mod_wrap_and_backward[compile] | 16.9214ms | 11.8675ms | 84.2638 Ops/s | 86.4230 Ops/s | -2.50% |
| test_mod_wrap_and_backward[compile-overhead] | 14.2113ms | 11.7661ms | 84.9900 Ops/s | 89.7048 Ops/s | **-5.26%** |
| test_seq_add[eager] | 0.2128ms | 85.5268μs | 11.6922 KOps/s | 11.5782 KOps/s | +0.98% |
| test_seq_add[compile] | 0.1512ms | 59.0128μs | 16.9455 KOps/s | 16.4274 KOps/s | +3.15% |
| test_seq_add[compile-overhead] | 0.1440ms | 58.6545μs | 17.0490 KOps/s | 16.7485 KOps/s | +1.79% |
| test_seq_wrap[eager] | 0.5629ms | 0.3719ms | 2.6891 KOps/s | 2.6439 KOps/s | +1.71% |
| test_seq_wrap[compile] | 0.4492ms | 0.2589ms | 3.8622 KOps/s | 3.6970 KOps/s | +4.47% |
| test_seq_wrap[compile-overhead] | 0.4059ms | 0.2574ms | 3.8857 KOps/s | 3.6618 KOps/s | **+6.11%** |
| test_func_call_runtime[False-eager] | 0.8773ms | 0.5225ms | 1.9138 KOps/s | 1.8025 KOps/s | **+6.18%** |
| test_func_call_runtime[False-compile] | 0.6122ms | 0.4941ms | 2.0241 KOps/s | 1.9622 KOps/s | +3.15% |
| test_func_call_runtime[False-compile-overhead] | 0.6086ms | 0.4928ms | 2.0290 KOps/s | 1.9881 KOps/s | +2.06% |
| test_func_call_runtime[True-eager] | 0.9843ms | 0.8273ms | 1.2088 KOps/s | 1.1600 KOps/s | +4.21% |
| test_func_call_runtime[True-compile] | 0.8614ms | 0.5130ms | 1.9492 KOps/s | 1.8984 KOps/s | +2.68% |
| test_func_call_runtime[True-compile-overhead] | 1.0233ms | 0.5146ms | 1.9434 KOps/s | 1.8944 KOps/s | +2.58% |
| test_distributed | 0.2264ms | 0.1326ms | 7.5421 KOps/s | 7.2918 KOps/s | +3.43% |
| test_tdmodule | 42.4100μs | 16.9791μs | 58.8960 KOps/s | 58.5985 KOps/s | +0.51% |
| test_tdmodule_dispatch | 56.3560μs | 35.6022μs | 28.0882 KOps/s | 28.4818 KOps/s | -1.38% |
| test_tdseq | 41.1770μs | 18.8530μs | 53.0421 KOps/s | 53.2613 KOps/s | -0.41% |
| test_tdseq_dispatch | 70.0710μs | 39.4690μs | 25.3364 KOps/s | 25.8517 KOps/s | -1.99% |
| test_instantiation_functorch | 1.8525ms | 1.6454ms | 607.7680 Ops/s | 608.3978 Ops/s | -0.10% |
| test_instantiation_td | 2.1611ms | 1.1945ms | 837.1354 Ops/s | 843.7049 Ops/s | -0.78% |
| test_exec_functorch | 0.3202ms | 0.1783ms | 5.6088 KOps/s | 5.5133 KOps/s | +1.73% |
| test_exec_functional_call | 0.3202ms | 0.1660ms | 6.0258 KOps/s | 5.6237 KOps/s | **+7.15%** |
| test_exec_td | 0.2818ms | 0.1680ms | 5.9541 KOps/s | 5.6082 KOps/s | **+6.17%** |
| test_exec_td_decorator | 0.7327ms | 0.2497ms | 4.0044 KOps/s | 3.8397 KOps/s | +4.29% |
| test_vmap_mlp_speed[True-True] | 0.8348ms | 0.5991ms | 1.6692 KOps/s | 1.6684 KOps/s | +0.05% |
| test_vmap_mlp_speed[True-False] | 1.0067ms | 0.5966ms | 1.6762 KOps/s | 1.6745 KOps/s | +0.10% |
| test_vmap_mlp_speed[False-True] | 0.8839ms | 0.4976ms | 2.0095 KOps/s | 2.0285 KOps/s | -0.94% |
| test_vmap_mlp_speed[False-False] | 0.7869ms | 0.5005ms | 1.9979 KOps/s | 2.0013 KOps/s | -0.17% |
| test_vmap_mlp_speed_decorator[True-True] | 1.7719ms | 0.7280ms | 1.3736 KOps/s | 1.4454 KOps/s | -4.97% |
| test_vmap_mlp_speed_decorator[True-False] | 1.0276ms | 0.6939ms | 1.4411 KOps/s | 1.4475 KOps/s | -0.44% |
| test_vmap_mlp_speed_decorator[False-True] | 0.8795ms | 0.5736ms | 1.7434 KOps/s | 1.7433 KOps/s | +0.00% |
| test_vmap_mlp_speed_decorator[False-False] | 0.9994ms | 0.5776ms | 1.7314 KOps/s | 1.7176 KOps/s | +0.80% |
| test_to_module_speed[True] | 2.0803ms | 1.8014ms | 555.1143 Ops/s | 554.7924 Ops/s | +0.06% |
| test_to_module_speed[False] | 2.8209ms | 1.7785ms | 562.2565 Ops/s | 567.6108 Ops/s | -0.94% |
| test_tc_init | 83.5860μs | 45.2805μs | 22.0846 KOps/s | 22.1062 KOps/s | -0.10% |
| test_tc_init_nested | 0.1571ms | 91.2672μs | 10.9568 KOps/s | 10.8568 KOps/s | +0.92% |
| test_tc_first_layer_tensor | 13.1750μs | 1.4864μs | 672.7507 KOps/s | 680.9036 KOps/s | -1.20% |
| test_tc_first_layer_nontensor | 17.8530μs | 4.3051μs | 232.2851 KOps/s | 236.6953 KOps/s | -1.86% |
| test_tc_second_layer_tensor | 37.2990μs | 2.8792μs | 347.3144 KOps/s | 367.5702 KOps/s | **-5.51%** |
| test_tc_second_layer_nontensor | 35.0360μs | 5.7075μs | 175.2072 KOps/s | 180.1781 KOps/s | -2.76% |
| test_unbind | 0.4567s | 14.3209ms | 69.8282 Ops/s | 68.5868 Ops/s | +1.81% |
| test_full_like | 19.7278ms | 12.4876ms | 80.0796 Ops/s | 74.7684 Ops/s | **+7.10%** |
| test_zeros_like | 15.2120ms | 7.6970ms | 129.9211 Ops/s | 131.6680 Ops/s | -1.33% |
| test_ones_like | 14.0503ms | 7.5477ms | 132.4905 Ops/s | 126.4428 Ops/s | +4.78% |
| test_clone | 14.9428ms | 9.0678ms | 110.2803 Ops/s | 106.4526 Ops/s | +3.60% |
| test_squeeze | 84.4880μs | 12.9782μs | 77.0524 KOps/s | 76.9814 KOps/s | +0.09% |
| test_unsqueeze | 0.1674ms | 91.5177μs | 10.9268 KOps/s | 10.5888 KOps/s | +3.19% |
| test_split | 0.4673ms | 0.1997ms | 5.0072 KOps/s | 5.0264 KOps/s | -0.38% |
| test_permute | 0.4417ms | 0.2167ms | 4.6154 KOps/s | 4.5501 KOps/s | +1.43% |
| test_stack | 32.4954ms | 25.3688ms | 39.4185 Ops/s | 38.7712 Ops/s | +1.67% |
| test_cat | 30.7810ms | 25.1222ms | 39.8055 Ops/s | 39.0688 Ops/s | +1.89% |

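
The Change column appears to be the relative difference between Ops and Ops on Repo `HEAD`, and the Improved/Worsened totals match the number of rows whose change exceeds 5% in magnitude. The sketch below reproduces a few rows under those two assumptions. The 5% threshold and the helper names `relative_change` and `classify` are inferred from the table for illustration; they are not taken from the benchmark workflow itself.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the HEAD baseline.

    Both arguments must use the same unit (e.g. both in KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on HEAD) pairs copied from the CPU table above, in KOps/s.
rows = {
    "test_plain_set_nested": (45.2165, 47.2252),
    "test_items": (380.2051, 351.5449),
    "test_unlock_nested": (1.9785, 2.3734),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions, small negative changes such as `test_plain_set_nested` stay within the noise band, while `test_unlock_nested` is flagged as a regression.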
github-actions[bot] commented 3 months ago

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 219. Improved: 5. Worsened: 33.

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 0.1533ms | 17.4666μs | 57.2520 KOps/s | 58.7191 KOps/s | -2.50% |
| test_plain_set_stack_nested | 37.5410μs | 17.8379μs | 56.0603 KOps/s | 58.4477 KOps/s | -4.08% |
| test_plain_set_nested_inplace | 0.2011ms | 18.7289μs | 53.3934 KOps/s | 54.8116 KOps/s | -2.59% |
| test_plain_set_stack_nested_inplace | 41.7600μs | 18.9113μs | 52.8785 KOps/s | 54.7737 KOps/s | -3.46% |
| test_items | 22.3010μs | 4.7489μs | 210.5754 KOps/s | 211.3709 KOps/s | -0.38% |
| test_items_nested | 0.4384ms | 0.3602ms | 2.7761 KOps/s | 2.7041 KOps/s | +2.66% |
| test_items_nested_locked | 0.4167ms | 0.3663ms | 2.7303 KOps/s | 2.6952 KOps/s | +1.30% |
| test_items_nested_leaf | 0.1552ms | 84.4093μs | 11.8470 KOps/s | 11.8102 KOps/s | +0.31% |
| test_items_stack_nested | 0.4568ms | 0.3667ms | 2.7268 KOps/s | 2.7288 KOps/s | -0.07% |
| test_items_stack_nested_leaf | 0.1142ms | 84.4103μs | 11.8469 KOps/s | 11.8024 KOps/s | +0.38% |
| test_items_stack_nested_locked | 0.4383ms | 0.3670ms | 2.7249 KOps/s | 2.6991 KOps/s | +0.96% |
| test_keys | 23.0400μs | 4.4016μs | 227.1895 KOps/s | 227.9489 KOps/s | -0.33% |
| test_keys_nested | 94.6710μs | 65.3105μs | 15.3115 KOps/s | 15.1745 KOps/s | +0.90% |
| test_keys_nested_locked | 0.6598ms | 71.4294μs | 13.9998 KOps/s | 13.6987 KOps/s | +2.20% |
| test_keys_nested_leaf | 0.1321ms | 55.5583μs | 17.9991 KOps/s | 17.1668 KOps/s | +4.85% |
| test_keys_stack_nested | 94.3310μs | 66.6308μs | 15.0081 KOps/s | 15.1841 KOps/s | -1.16% |
| test_keys_stack_nested_leaf | 83.9210μs | 57.5908μs | 17.3639 KOps/s | 17.3881 KOps/s | -0.14% |
| test_keys_stack_nested_locked | 98.0910μs | 71.6795μs | 13.9510 KOps/s | 13.7669 KOps/s | +1.34% |
| test_values | 7.0970μs | 1.7571μs | 569.1118 KOps/s | 564.0378 KOps/s | +0.90% |
| test_values_nested | 51.7620μs | 33.8433μs | 29.5480 KOps/s | 29.7443 KOps/s | -0.66% |
| test_values_nested_locked | 0.1464ms | 35.7390μs | 27.9807 KOps/s | 28.0975 KOps/s | -0.42% |
| test_values_nested_leaf | 0.1933ms | 30.4629μs | 32.8268 KOps/s | 33.2105 KOps/s | -1.16% |
| test_values_stack_nested | 59.6520μs | 34.6415μs | 28.8671 KOps/s | 28.7594 KOps/s | +0.37% |
| test_values_stack_nested_leaf | 59.3110μs | 31.1342μs | 32.1191 KOps/s | 32.1861 KOps/s | -0.21% |
| test_values_stack_nested_locked | 64.6320μs | 36.6089μs | 27.3158 KOps/s | 27.3340 KOps/s | -0.07% |
| test_membership | 1.5180μs | 0.5512μs | 1.8141 MOps/s | 1.8274 MOps/s | -0.73% |
| test_membership_nested | 12.8750μs | 1.9235μs | 519.8862 KOps/s | 511.3732 KOps/s | +1.66% |
| test_membership_nested_leaf | 10.4705μs | 1.9300μs | 518.1371 KOps/s | 515.0401 KOps/s | +0.60% |
| test_membership_stacked_nested | 23.2190μs | 2.0221μs | 494.5367 KOps/s | 493.3609 KOps/s | +0.24% |
| test_membership_stacked_nested_leaf | 16.6790μs | 2.0000μs | 500.0077 KOps/s | 493.0384 KOps/s | +1.41% |
| test_membership_nested_last | 51.7210μs | 2.9175μs | 342.7607 KOps/s | 338.6853 KOps/s | +1.20% |
| test_membership_nested_leaf_last | 0.1750ms | 2.9328μs | 340.9655 KOps/s | 338.2057 KOps/s | +0.82% |
| test_membership_stacked_nested_last | 0.1953ms | 4.3334μs | 230.7647 KOps/s | 248.3980 KOps/s | **-7.10%** |
| test_membership_stacked_nested_leaf_last | 0.2025ms | 4.3164μs | 231.6754 KOps/s | 248.6319 KOps/s | **-6.82%** |
| test_nested_getleaf | 0.1917ms | 7.9751μs | 125.3903 KOps/s | 125.9944 KOps/s | -0.48% |
| test_nested_get | 22.0610μs | 7.4475μs | 134.2726 KOps/s | 134.1858 KOps/s | +0.06% |
| test_stacked_getleaf | 0.1968ms | 8.0595μs | 124.0774 KOps/s | 124.5474 KOps/s | -0.38% |
| test_stacked_get | 0.2086ms | 7.4811μs | 133.6708 KOps/s | 133.4762 KOps/s | +0.15% |
| test_nested_getitemleaf | 0.1847ms | 8.1027μs | 123.4149 KOps/s | 123.5252 KOps/s | -0.09% |
| test_nested_getitem | 30.3410μs | 7.6812μs | 130.1886 KOps/s | 131.3264 KOps/s | -0.87% |
| test_stacked_getitemleaf | 60.8900μs | 8.1171μs | 123.1964 KOps/s | 123.1673 KOps/s | +0.02% |
| test_stacked_getitem | 22.1610μs | 7.6504μs | 130.7124 KOps/s | 131.0770 KOps/s | -0.28% |
| test_lock_nested | 10.2961ms | 0.4874ms | 2.0517 KOps/s | 2.1209 KOps/s | -3.26% |
| test_lock_stack_nested | 0.4868ms | 0.4375ms | 2.2856 KOps/s | 2.3142 KOps/s | -1.23% |
| test_unlock_nested | 0.9117ms | 0.3957ms | 2.5272 KOps/s | 2.5390 KOps/s | -0.46% |
| test_unlock_stack_nested | 0.4346ms | 0.3552ms | 2.8152 KOps/s | 2.8328 KOps/s | -0.62% |
| test_flatten_speed | 0.4734ms | 0.1055ms | 9.4803 KOps/s | 9.5181 KOps/s | -0.40% |
| test_unflatten_speed | 0.4034ms | 0.2898ms | 3.4506 KOps/s | 3.4576 KOps/s | -0.20% |
| test_common_ops | 1.6216ms | 1.3844ms | 722.3286 Ops/s | 735.1604 Ops/s | -1.75% |
| test_creation | 16.3600μs | 1.6890μs | 592.0826 KOps/s | 601.3341 KOps/s | -1.54% |
| test_creation_empty | 45.8000μs | 18.5577μs | 53.8860 KOps/s | 57.2367 KOps/s | **-5.85%** |
| test_creation_nested_1 | 40.1610μs | 20.4378μs | 48.9289 KOps/s | 51.5544 KOps/s | **-5.09%** |
| test_creation_nested_2 | 54.6720μs | 23.1407μs | 43.2138 KOps/s | 46.2714 KOps/s | **-6.61%** |
| test_clone | 0.1834ms | 30.5466μs | 32.7369 KOps/s | 32.6494 KOps/s | +0.27% |
| test_getitem[int] | 1.2506ms | 17.6580μs | 56.6315 KOps/s | 58.3835 KOps/s | -3.00% |
| test_getitem[slice_int] | 0.1627ms | 30.7200μs | 32.5521 KOps/s | 33.8352 KOps/s | -3.79% |
| test_getitem[range] | 0.2637ms | 0.1155ms | 8.6553 KOps/s | 8.6306 KOps/s | +0.29% |
| test_getitem[tuple] | 91.3795ms | 32.4966μs | 30.7725 KOps/s | 39.4181 KOps/s | **-21.93%** |
| test_getitem[list] | 0.2814ms | 0.1073ms | 9.3231 KOps/s | 9.0074 KOps/s | +3.50% |
| test_setitem_dim[int] | 0.2539ms | 61.1609μs | 16.3503 KOps/s | 18.3913 KOps/s | **-11.10%** |
| test_setitem_dim[slice_int] | 0.1082ms | 86.4198μs | 11.5714 KOps/s | 12.5850 KOps/s | **-8.05%** |
| test_setitem_dim[range] | 0.3005ms | 0.1519ms | 6.5812 KOps/s | 6.8317 KOps/s | -3.67% |
| test_setitem_dim[tuple] | 0.2271ms | 78.5217μs | 12.7353 KOps/s | 13.7238 KOps/s | **-7.20%** |
| test_setitem | 0.2259ms | 47.5742μs | 21.0198 KOps/s | 22.7922 KOps/s | **-7.78%** |
| test_set | 0.2373ms | 47.2933μs | 21.1447 KOps/s | 21.7576 KOps/s | -2.82% |
| test_set_shared | 0.3875ms | 54.4025μs | 18.3815 KOps/s | 18.6835 KOps/s | -1.62% |
| test_update | 0.2367ms | 56.2920μs | 17.7645 KOps/s | 19.2048 KOps/s | **-7.50%** |
| test_update_nested | 0.2454ms | 64.2482μs | 15.5646 KOps/s | 15.8850 KOps/s | -2.02% |
| test_update__nested | 0.2397ms | 62.2359μs | 16.0679 KOps/s | 15.0909 KOps/s | **+6.47%** |
| test_set_nested | 0.2136ms | 47.2220μs | 21.1765 KOps/s | 21.7655 KOps/s | -2.71% |
| test_set_nested_new | 0.2241ms | 52.6854μs | 18.9806 KOps/s | 20.2901 KOps/s | **-6.45%** |
| test_select | 0.2518ms | 67.2362μs | 14.8729 KOps/s | 15.7432 KOps/s | **-5.53%** |
| test_select_nested | 0.3544ms | 51.3903μs | 19.4589 KOps/s | 19.2198 KOps/s | +1.24% |
| test_exclude_nested | 99.2930μs | 69.3189μs | 14.4261 KOps/s | 14.0593 KOps/s | +2.61% |
| test_empty[True] | 0.3764ms | 0.2813ms | 3.5551 KOps/s | 3.5227 KOps/s | +0.92% |
| test_empty[False] | 2.5310μs | 0.8738μs | 1.1444 MOps/s | 1.1115 MOps/s | +2.96% |
| test_to | 0.1461ms | 39.0608μs | 25.6011 KOps/s | 26.6857 KOps/s | -4.06% |
| test_to_nonblocking | 0.2110ms | 24.0198μs | 41.6323 KOps/s | 42.5500 KOps/s | -2.16% |
| test_unbind_speed | 0.3471ms | 0.3068ms | 3.2597 KOps/s | 3.2911 KOps/s | -0.96% |
| test_unbind_speed_stack0 | 0.4925ms | 0.3021ms | 3.3099 KOps/s | 3.2893 KOps/s | +0.63% |
| test_unbind_speed_stack1 | 89.9226ms | 0.7718ms | 1.2957 KOps/s | 1.3035 KOps/s | -0.60% |
| test_split | 92.4666ms | 2.3587ms | 423.9645 Ops/s | 433.9268 Ops/s | -2.30% |
| test_chunk | 91.9040ms | 2.3651ms | 422.8216 Ops/s | 431.4869 Ops/s | -2.01% |
| test_creation[device0] | 0.2196ms | 0.1053ms | 9.5007 KOps/s | 9.2367 KOps/s | +2.86% |
| test_creation_from_tensor | 0.3050ms | 0.1025ms | 9.7546 KOps/s | 9.8833 KOps/s | -1.30% |
| test_add_one[memmap_tensor0] | 55.5810μs | 9.3443μs | 107.0176 KOps/s | 105.4340 KOps/s | +1.50% |
| test_contiguous[memmap_tensor0] | 14.4410μs | 2.2232μs | 449.8105 KOps/s | 454.4823 KOps/s | -1.03% |
| test_stack[memmap_tensor0] | 24.0010μs | 6.8548μs | 145.8821 KOps/s | 145.7256 KOps/s | +0.11% |
| test_memmaptd_index | 1.1939ms | 0.4534ms | 2.2055 KOps/s | 2.2789 KOps/s | -3.22% |
| test_memmaptd_index_astensor | 0.7977ms | 0.5231ms | 1.9116 KOps/s | 1.9905 KOps/s | -3.97% |
| test_memmaptd_index_op | 1.5511ms | 1.1206ms | 892.4125 Ops/s | 913.8752 Ops/s | -2.35% |
| test_serialize_model | 0.1020s | 96.4903ms | 10.3637 Ops/s | 10.0750 Ops/s | +2.87% |
| test_serialize_model_pickle | 1.3506s | 1.2364s | 0.8088 Ops/s | 0.8078 Ops/s | +0.13% |
| test_serialize_weights | 0.1905s | 0.1031s | 9.6982 Ops/s | 10.2481 Ops/s | **-5.37%** |
| test_serialize_weights_returnearly | 82.8818ms | 72.3194ms | 13.8275 Ops/s | 11.2965 Ops/s | **+22.41%** |
| test_serialize_weights_pickle | 1.3473s | 1.2360s | 0.8091 Ops/s | 0.8031 Ops/s | +0.74% |
| test_reshape_pytree | 0.1598ms | 39.3238μs | 25.4299 KOps/s | 25.9697 KOps/s | -2.08% |
| test_reshape_td | 0.1252ms | 44.6637μs | 22.3895 KOps/s | 22.9992 KOps/s | -2.65% |
| test_view_pytree | 0.1673ms | 38.3999μs | 26.0417 KOps/s | 26.3963 KOps/s | -1.34% |
| test_view_td | 0.1432ms | 51.1968μs | 19.5325 KOps/s | 19.6550 KOps/s | -0.62% |
| test_unbind_pytree | 0.1587ms | 37.5314μs | 26.6443 KOps/s | 26.8177 KOps/s | -0.65% |
| test_unbind_td | 0.3990ms | 46.3213μs | 21.5883 KOps/s | 21.9268 KOps/s | -1.54% |
| test_split_pytree | 0.1815ms | 51.9565μs | 19.2469 KOps/s | 19.4131 KOps/s | -0.86% |
| test_split_td | 0.4613ms | 67.6265μs | 14.7871 KOps/s | 16.5373 KOps/s | **-10.58%** |
| test_add_pytree | 0.2518ms | 65.6385μs | 15.2350 KOps/s | 16.2881 KOps/s | 
$\textbf{\color{#d91a1a}-6.47\\%}$ | | test_add_td | 0.2831ms | 0.1054ms | 9.4845 KOps/s | 10.4696 KOps/s | $\textbf{\color{#d91a1a}-9.41\\%}$ | | test_compile_add_one_nested[tensordict-compile] | 0.4119ms | 0.2071ms | 4.8274 KOps/s | 4.7525 KOps/s | $\color{#35bf28}+1.58\\%$ | | test_compile_add_one_nested[tensordict-eager] | 0.3254ms | 0.1739ms | 5.7506 KOps/s | 5.8247 KOps/s | $\color{#d91a1a}-1.27\\%$ | | test_compile_add_one_nested[pytree-compile] | 0.2968ms | 0.1462ms | 6.8405 KOps/s | 6.7943 KOps/s | $\color{#35bf28}+0.68\\%$ | | test_compile_add_one_nested[pytree-eager] | 0.3854ms | 0.2136ms | 4.6810 KOps/s | 5.0690 KOps/s | $\textbf{\color{#d91a1a}-7.65\\%}$ | | test_compile_copy_nested[tensordict-compile] | 0.1012ms | 22.1966μs | 45.0519 KOps/s | 45.0828 KOps/s | $\color{#d91a1a}-0.07\\%$ | | test_compile_copy_nested[tensordict-eager] | 86.7710μs | 49.1122μs | 20.3615 KOps/s | 20.6296 KOps/s | $\color{#d91a1a}-1.30\\%$ | | test_compile_copy_nested[pytree-compile] | 0.1665ms | 73.3163μs | 13.6395 KOps/s | 13.7356 KOps/s | $\color{#d91a1a}-0.70\\%$ | | test_compile_copy_nested[pytree-eager] | 83.2410μs | 59.8690μs | 16.7031 KOps/s | 16.8437 KOps/s | $\color{#d91a1a}-0.83\\%$ | | test_compile_add_one_flat[tensordict-compile] | 0.4589ms | 0.3287ms | 3.0423 KOps/s | 3.0576 KOps/s | $\color{#d91a1a}-0.50\\%$ | | test_compile_add_one_flat[tensordict-eager] | 0.3648ms | 0.2215ms | 4.5146 KOps/s | 4.5388 KOps/s | $\color{#d91a1a}-0.53\\%$ | | test_compile_add_one_flat[tensorclass-compile] | 0.2803ms | 0.1304ms | 7.6672 KOps/s | 7.6697 KOps/s | $\color{#d91a1a}-0.03\\%$ | | test_compile_add_one_flat[tensorclass-eager] | 0.2455ms | 62.9788μs | 15.8783 KOps/s | 16.2107 KOps/s | $\color{#d91a1a}-2.05\\%$ | | test_compile_add_one_flat[pytree-compile] | 0.4479ms | 0.3263ms | 3.0650 KOps/s | 3.0474 KOps/s | $\color{#35bf28}+0.58\\%$ | | test_compile_add_one_flat[pytree-eager] | 0.8965ms | 0.6957ms | 1.4373 KOps/s | 1.5603 KOps/s | $\textbf{\color{#d91a1a}-7.88\\%}$ | | 
test_compile_add_self_flat[tensordict-eager] | 0.4130ms | 0.2722ms | 3.6744 KOps/s | 3.6844 KOps/s | $\color{#d91a1a}-0.27\\%$ | | test_compile_add_self_flat[tensordict-compile] | 0.4772ms | 0.3290ms | 3.0394 KOps/s | 3.0155 KOps/s | $\color{#35bf28}+0.79\\%$ | | test_compile_add_self_flat[tensorclass-eager] | 0.2516ms | 79.4513μs | 12.5863 KOps/s | 13.3017 KOps/s | $\textbf{\color{#d91a1a}-5.38\\%}$ | | test_compile_add_self_flat[tensorclass-compile] | 0.2956ms | 0.1377ms | 7.2619 KOps/s | 7.6248 KOps/s | $\color{#d91a1a}-4.76\\%$ | | test_compile_add_self_flat[pytree-eager] | 0.7549ms | 0.5868ms | 1.7043 KOps/s | 1.8341 KOps/s | $\textbf{\color{#d91a1a}-7.08\\%}$ | | test_compile_add_self_flat[pytree-compile] | 0.4746ms | 0.3274ms | 3.0540 KOps/s | 3.0635 KOps/s | $\color{#d91a1a}-0.31\\%$ | | test_compile_copy_flat[tensordict-compile] | 0.2086ms | 20.3928μs | 49.0369 KOps/s | 52.2699 KOps/s | $\textbf{\color{#d91a1a}-6.19\\%}$ | | test_compile_copy_flat[tensordict-eager] | 0.2195ms | 34.2245μs | 29.2189 KOps/s | 29.6463 KOps/s | $\color{#d91a1a}-1.44\\%$ | | test_compile_copy_flat[pytree-compile] | 0.2755ms | 77.1040μs | 12.9695 KOps/s | 13.0502 KOps/s | $\color{#d91a1a}-0.62\\%$ | | test_compile_copy_flat[pytree-eager] | 89.5320μs | 60.7158μs | 16.4702 KOps/s | 16.5482 KOps/s | $\color{#d91a1a}-0.47\\%$ | | test_compile_assign_and_add[tensordict-compile] | 2.5563ms | 0.9331ms | 1.0717 KOps/s | 1.0711 KOps/s | $\color{#35bf28}+0.06\\%$ | | test_compile_assign_and_add[tensordict-eager] | 3.5655ms | 3.3466ms | 298.8066 Ops/s | 296.5985 Ops/s | $\color{#35bf28}+0.74\\%$ | | test_compile_assign_and_add[pytree-compile] | 2.5279ms | 0.9239ms | 1.0824 KOps/s | 1.0874 KOps/s | $\color{#d91a1a}-0.46\\%$ | | test_compile_assign_and_add[pytree-eager] | 3.6793ms | 3.3840ms | 295.5045 Ops/s | 298.5145 Ops/s | $\color{#d91a1a}-1.01\\%$ | | test_compile_indexing[tensor-tensordict-compile] | 0.2419ms | 0.1104ms | 9.0584 KOps/s | 9.0582 KOps/s | $+0.00\\%$ | | 
test_compile_indexing[tensor-tensordict-eager] | 0.2641ms | 67.8937μs | 14.7289 KOps/s | 15.2562 KOps/s | $\color{#d91a1a}-3.46\\%$ | | test_compile_indexing[tensor-tensorclass-compile] | 0.2549ms | 0.1031ms | 9.7009 KOps/s | 9.6520 KOps/s | $\color{#35bf28}+0.51\\%$ | | test_compile_indexing[tensor-tensorclass-eager] | 0.2205ms | 49.1099μs | 20.3625 KOps/s | 20.5067 KOps/s | $\color{#d91a1a}-0.70\\%$ | | test_compile_indexing[tensor-pytree-compile] | 0.2804ms | 0.1085ms | 9.2189 KOps/s | 9.2593 KOps/s | $\color{#d91a1a}-0.44\\%$ | | test_compile_indexing[tensor-pytree-eager] | 0.2246ms | 49.5299μs | 20.1898 KOps/s | 20.5334 KOps/s | $\color{#d91a1a}-1.67\\%$ | | test_compile_indexing[slice-tensordict-compile] | 0.2754ms | 0.1398ms | 7.1547 KOps/s | 6.9950 KOps/s | $\color{#35bf28}+2.28\\%$ | | test_compile_indexing[slice-tensordict-eager] | 0.3037ms | 33.4289μs | 29.9142 KOps/s | 37.5567 KOps/s | $\textbf{\color{#d91a1a}-20.35\\%}$ | | test_compile_indexing[slice-tensorclass-compile] | 0.2951ms | 0.1314ms | 7.6108 KOps/s | 7.6630 KOps/s | $\color{#d91a1a}-0.68\\%$ | | test_compile_indexing[slice-tensorclass-eager] | 0.1912ms | 24.6241μs | 40.6105 KOps/s | 43.4758 KOps/s | $\textbf{\color{#d91a1a}-6.59\\%}$ | | test_compile_indexing[slice-pytree-compile] | 0.3207ms | 0.1372ms | 7.2896 KOps/s | 7.6468 KOps/s | $\color{#d91a1a}-4.67\\%$ | | test_compile_indexing[slice-pytree-eager] | 0.1238ms | 25.4162μs | 39.3450 KOps/s | 44.1052 KOps/s | $\textbf{\color{#d91a1a}-10.79\\%}$ | | test_compile_indexing[int-tensordict-compile] | 0.3399ms | 0.1452ms | 6.8869 KOps/s | 7.2338 KOps/s | $\color{#d91a1a}-4.79\\%$ | | test_compile_indexing[int-tensordict-eager] | 0.4928ms | 28.9402μs | 34.5540 KOps/s | 39.3673 KOps/s | $\textbf{\color{#d91a1a}-12.23\\%}$ | | test_compile_indexing[int-tensorclass-compile] | 0.3230ms | 0.1369ms | 7.3037 KOps/s | 7.5048 KOps/s | $\color{#d91a1a}-2.68\\%$ | | test_compile_indexing[int-tensorclass-eager] | 87.6820μs | 25.8916μs | 38.6226 KOps/s | 
44.1339 KOps/s | $\textbf{\color{#d91a1a}-12.49\\%}$ | | test_compile_indexing[int-pytree-compile] | 0.3237ms | 0.1371ms | 7.2962 KOps/s | 7.4418 KOps/s | $\color{#d91a1a}-1.96\\%$ | | test_compile_indexing[int-pytree-eager] | 0.1854ms | 25.6936μs | 38.9202 KOps/s | 43.8622 KOps/s | $\textbf{\color{#d91a1a}-11.27\\%}$ | | test_mod_add[eager] | 0.1890ms | 40.3760μs | 24.7672 KOps/s | 26.0630 KOps/s | $\color{#d91a1a}-4.97\\%$ | | test_mod_add[compile] | 0.2357ms | 71.0253μs | 14.0795 KOps/s | 13.6058 KOps/s | $\color{#35bf28}+3.48\\%$ | | test_mod_add[compile-overhead] | 0.2613ms | 0.1472ms | 6.7935 KOps/s | 6.5906 KOps/s | $\color{#35bf28}+3.08\\%$ | | test_mod_wrap[eager] | 0.4494ms | 0.2575ms | 3.8837 KOps/s | 3.6620 KOps/s | $\textbf{\color{#35bf28}+6.05\\%}$ | | test_mod_wrap[compile] | 0.4520ms | 0.2967ms | 3.3707 KOps/s | 3.2210 KOps/s | $\color{#35bf28}+4.65\\%$ | | test_mod_wrap[compile-overhead] | 8.1783ms | 4.3328ms | 230.7996 Ops/s | 238.1981 Ops/s | $\color{#d91a1a}-3.11\\%$ | | test_mod_wrap_and_backward[eager] | 1.6902ms | 1.4674ms | 681.4798 Ops/s | 683.9009 Ops/s | $\color{#d91a1a}-0.35\\%$ | | test_mod_wrap_and_backward[compile] | 2.0412ms | 1.4705ms | 680.0398 Ops/s | 690.1132 Ops/s | $\color{#d91a1a}-1.46\\%$ | | test_mod_wrap_and_backward[compile-overhead] | 1.5129ms | 1.0478ms | 954.3974 Ops/s | 998.4733 Ops/s | $\color{#d91a1a}-4.41\\%$ | | test_seq_add[eager] | 0.2634ms | 0.1165ms | 8.5815 KOps/s | 8.9314 KOps/s | $\color{#d91a1a}-3.92\\%$ | | test_seq_add[compile] | 0.2366ms | 87.4619μs | 11.4336 KOps/s | 11.6069 KOps/s | $\color{#d91a1a}-1.49\\%$ | | test_seq_add[compile-overhead] | 0.2695ms | 0.1233ms | 8.1124 KOps/s | 8.1763 KOps/s | $\color{#d91a1a}-0.78\\%$ | | test_seq_wrap[eager] | 0.5982ms | 0.4348ms | 2.2998 KOps/s | 2.3369 KOps/s | $\color{#d91a1a}-1.59\\%$ | | test_seq_wrap[compile] | 0.5258ms | 0.3418ms | 2.9260 KOps/s | 3.0692 KOps/s | $\color{#d91a1a}-4.67\\%$ | | test_seq_wrap[compile-overhead] | 0.3064s | 0.1466s | 6.8226 
Ops/s | 6.7415 Ops/s | $\color{#35bf28}+1.20\\%$ | | test_func_call_runtime[False-eager] | 0.9151ms | 0.7604ms | 1.3152 KOps/s | 1.2417 KOps/s | $\textbf{\color{#35bf28}+5.92\\%}$ | | test_func_call_runtime[False-compile] | 1.0340ms | 0.8306ms | 1.2039 KOps/s | 1.2410 KOps/s | $\color{#d91a1a}-2.99\\%$ | | test_func_call_runtime[False-compile-overhead] | 0.5229ms | 0.3696ms | 2.7057 KOps/s | 2.7207 KOps/s | $\color{#d91a1a}-0.55\\%$ | | test_func_call_runtime[True-eager] | 1.1847ms | 1.0128ms | 987.3654 Ops/s | 998.7201 Ops/s | $\color{#d91a1a}-1.14\\%$ | | test_func_call_runtime[True-compile] | 1.0663ms | 0.8936ms | 1.1191 KOps/s | 1.1519 KOps/s | $\color{#d91a1a}-2.85\\%$ | | test_func_call_runtime[True-compile-overhead] | 0.6171ms | 0.4126ms | 2.4238 KOps/s | 2.4643 KOps/s | $\color{#d91a1a}-1.64\\%$ | | test_distributed | 0.3072ms | 72.8436μs | 13.7280 KOps/s | 11.2698 KOps/s | $\textbf{\color{#35bf28}+21.81\\%}$ | | test_tdmodule | 93.7430μs | 17.8358μs | 56.0669 KOps/s | 61.9535 KOps/s | $\textbf{\color{#d91a1a}-9.50\\%}$ | | test_tdmodule_dispatch | 56.4920μs | 34.9412μs | 28.6195 KOps/s | 30.9324 KOps/s | $\textbf{\color{#d91a1a}-7.48\\%}$ | | test_tdseq | 33.9210μs | 18.1192μs | 55.1901 KOps/s | 58.7369 KOps/s | $\textbf{\color{#d91a1a}-6.04\\%}$ | | test_tdseq_dispatch | 54.8610μs | 37.2123μs | 26.8728 KOps/s | 28.6896 KOps/s | $\textbf{\color{#d91a1a}-6.33\\%}$ | | test_instantiation_functorch | 2.2186ms | 2.0219ms | 494.5821 Ops/s | 496.2358 Ops/s | $\color{#d91a1a}-0.33\\%$ | | test_instantiation_td | 2.1187ms | 1.3053ms | 766.0937 Ops/s | 766.7728 Ops/s | $\color{#d91a1a}-0.09\\%$ | | test_exec_functorch | 0.3823ms | 0.2314ms | 4.3215 KOps/s | 4.3784 KOps/s | $\color{#d91a1a}-1.30\\%$ | | test_exec_functional_call | 0.3924ms | 0.2235ms | 4.4733 KOps/s | 4.5256 KOps/s | $\color{#d91a1a}-1.16\\%$ | | test_exec_td | 0.3395ms | 0.2227ms | 4.4910 KOps/s | 4.5500 KOps/s | $\color{#d91a1a}-1.30\\%$ | | test_exec_td_decorator | 0.4832ms | 0.2928ms | 3.4158 
KOps/s | 3.4112 KOps/s | $\color{#35bf28}+0.13\\%$ | | test_vmap_mlp_speed[True-True] | 0.8484ms | 0.6763ms | 1.4786 KOps/s | 1.4854 KOps/s | $\color{#d91a1a}-0.46\\%$ | | test_vmap_mlp_speed[True-False] | 0.8245ms | 0.6725ms | 1.4869 KOps/s | 1.4851 KOps/s | $\color{#35bf28}+0.12\\%$ | | test_vmap_mlp_speed[False-True] | 0.7950ms | 0.6104ms | 1.6384 KOps/s | 1.6941 KOps/s | $\color{#d91a1a}-3.29\\%$ | | test_vmap_mlp_speed[False-False] | 0.7319ms | 0.5862ms | 1.7059 KOps/s | 1.6905 KOps/s | $\color{#35bf28}+0.91\\%$ | | test_vmap_mlp_speed_decorator[True-True] | 1.4455ms | 0.7546ms | 1.3252 KOps/s | 1.3217 KOps/s | $\color{#35bf28}+0.26\\%$ | | test_vmap_mlp_speed_decorator[True-False] | 0.9662ms | 0.7565ms | 1.3219 KOps/s | 1.3360 KOps/s | $\color{#d91a1a}-1.06\\%$ | | test_vmap_mlp_speed_decorator[False-True] | 0.8635ms | 0.6549ms | 1.5270 KOps/s | 1.5329 KOps/s | $\color{#d91a1a}-0.39\\%$ | | test_vmap_mlp_speed_decorator[False-False] | 0.8753ms | 0.6714ms | 1.4894 KOps/s | 1.5283 KOps/s | $\color{#d91a1a}-2.54\\%$ | | test_vmap_transformer_speed[True-True] | 9.0161ms | 8.8441ms | 113.0692 Ops/s | 112.8424 Ops/s | $\color{#35bf28}+0.20\\%$ | | test_vmap_transformer_speed[True-False] | 9.0276ms | 8.8478ms | 113.0228 Ops/s | 113.2265 Ops/s | $\color{#d91a1a}-0.18\\%$ | | test_vmap_transformer_speed[False-True] | 9.0096ms | 8.7447ms | 114.3554 Ops/s | 113.9865 Ops/s | $\color{#35bf28}+0.32\\%$ | | test_vmap_transformer_speed[False-False] | 8.9234ms | 8.7492ms | 114.2957 Ops/s | 113.9762 Ops/s | $\color{#35bf28}+0.28\\%$ | | test_vmap_transformer_speed_decorator[True-True] | 21.3060ms | 21.1540ms | 47.2723 Ops/s | 47.3550 Ops/s | $\color{#d91a1a}-0.17\\%$ | | test_vmap_transformer_speed_decorator[True-False] | 21.7883ms | 21.0110ms | 47.5941 Ops/s | 47.4003 Ops/s | $\color{#35bf28}+0.41\\%$ | | test_vmap_transformer_speed_decorator[False-True] | 21.7800ms | 20.9186ms | 47.8043 Ops/s | 48.0636 Ops/s | $\color{#d91a1a}-0.54\\%$ | | 
test_vmap_transformer_speed_decorator[False-False] | 21.0937ms | 20.8958ms | 47.8565 Ops/s | 47.9366 Ops/s | $\color{#d91a1a}-0.17\\%$ | | test_to_module_speed[True] | 1.5934ms | 1.4747ms | 678.1198 Ops/s | 684.1132 Ops/s | $\color{#d91a1a}-0.88\\%$ | | test_to_module_speed[False] | 1.5751ms | 1.4660ms | 682.1376 Ops/s | 709.2438 Ops/s | $\color{#d91a1a}-3.82\\%$ | | test_tc_init | 60.7920μs | 41.7024μs | 23.9795 KOps/s | 25.0292 KOps/s | $\color{#d91a1a}-4.19\\%$ | | test_tc_init_nested | 0.1126ms | 84.1799μs | 11.8793 KOps/s | 13.0490 KOps/s | $\textbf{\color{#d91a1a}-8.96\\%}$ | | test_tc_first_layer_tensor | 12.5737μs | 0.7941μs | 1.2593 MOps/s | 1.2982 MOps/s | $\color{#d91a1a}-3.00\\%$ | | test_tc_first_layer_nontensor | 18.1400μs | 2.5551μs | 391.3724 KOps/s | 394.0249 KOps/s | $\color{#d91a1a}-0.67\\%$ | | test_tc_second_layer_tensor | 7.1800μs | 1.5923μs | 628.0135 KOps/s | 614.2557 KOps/s | $\color{#35bf28}+2.24\\%$ | | test_tc_second_layer_nontensor | 18.3610μs | 3.3610μs | 297.5308 KOps/s | 298.5549 KOps/s | $\color{#d91a1a}-0.34\\%$ | | test_unbind | 0.3197s | 13.0300ms | 76.7457 Ops/s | 80.6878 Ops/s | $\color{#d91a1a}-4.89\\%$ | | test_full_like | 0.7500ms | 0.5791ms | 1.7269 KOps/s | 1.7241 KOps/s | $\color{#35bf28}+0.16\\%$ | | test_zeros_like | 0.3442ms | 0.1979ms | 5.0523 KOps/s | 5.0558 KOps/s | $\color{#d91a1a}-0.07\\%$ | | test_ones_like | 0.3707ms | 0.1978ms | 5.0557 KOps/s | 5.0604 KOps/s | $\color{#d91a1a}-0.09\\%$ | | test_clone | 0.6021ms | 0.4142ms | 2.4143 KOps/s | 2.4084 KOps/s | $\color{#35bf28}+0.25\\%$ | | test_squeeze | 40.4710μs | 11.0903μs | 90.1685 KOps/s | 91.9713 KOps/s | $\color{#d91a1a}-1.96\\%$ | | test_unsqueeze | 0.2475ms | 82.7559μs | 12.0837 KOps/s | 12.6613 KOps/s | $\color{#d91a1a}-4.56\\%$ | | test_split | 0.4482ms | 0.1771ms | 5.6478 KOps/s | 5.7525 KOps/s | $\color{#d91a1a}-1.82\\%$ | | test_permute | 0.3401ms | 0.1894ms | 5.2792 KOps/s | 5.2417 KOps/s | $\color{#35bf28}+0.72\\%$ | | test_stack | 1.3217ms | 
0.8971ms | 1.1147 KOps/s | 1.1016 KOps/s | $\color{#35bf28}+1.19\\%$ | | test_cat | 1.3725ms | 1.2317ms | 811.8639 Ops/s | 811.6479 Ops/s | $\color{#35bf28}+0.03\\%$ |