pytorch / tensordict

TensorDict is a PyTorch-dedicated tensor container.
MIT License

[Formatting] Lint revamp #890

Status: Closed (vmoens closed 1 month ago)

github-actions[bot] commented 1 month ago

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 219. Improved: $\large\color{#35bf28}24$. Worsened: $\large\color{#d91a1a}24$.
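
As a reading aid for the summary above: the Change percentages in the detailed results appear to be the relative difference between Ops and Ops on Repo `HEAD`, and for the CPU run the Improved and Worsened totals match the number of rows whose change exceeds 5% in either direction (the rows set in bold). The Python sketch below reproduces that arithmetic. The function names and the 5% threshold are assumptions inferred from the reported numbers, not taken from the benchmark workflow itself.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the HEAD baseline."""
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    # Assumed rule (inferred from which rows are bold): only changes
    # strictly beyond +/- threshold percent are counted.
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on HEAD) in KOps/s, copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (44.9838, 47.5065),
    "test_items": (379.9767, 362.3290),
    "test_membership_stacked_nested_last": (262.1938, 176.3959),
}

for name, (ops, ops_head) in rows.items():
    pct = relative_change(ops, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Under these assumptions, `test_plain_set_nested` comes out at roughly -5.31% and is counted as worsened, while `test_items` at roughly +4.87% stays below the threshold and is not counted as improved.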

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 51.9170μs | 22.2302μs | 44.9838 KOps/s | 47.5065 KOps/s | $\textbf{\color{#d91a1a}-5.31\\%}$ |
| test_plain_set_stack_nested | 75.0500μs | 22.2160μs | 45.0126 KOps/s | 47.7308 KOps/s | $\textbf{\color{#d91a1a}-5.69\\%}$ |
| test_plain_set_nested_inplace | 73.1870μs | 23.7671μs | 42.0750 KOps/s | 43.3261 KOps/s | $\color{#d91a1a}-2.89\\%$ |
| test_plain_set_stack_nested_inplace | 57.5580μs | 23.8153μs | 41.9898 KOps/s | 43.8723 KOps/s | $\color{#d91a1a}-4.29\\%$ |
| test_items | 27.3810μs | 2.6317μs | 379.9767 KOps/s | 362.3290 KOps/s | $\color{#35bf28}+4.87\\%$ |
| test_items_nested | 0.6212ms | 0.3364ms | 2.9724 KOps/s | 2.9103 KOps/s | $\color{#35bf28}+2.13\\%$ |
| test_items_nested_locked | 3.7514ms | 0.3382ms | 2.9566 KOps/s | 2.9186 KOps/s | $\color{#35bf28}+1.30\\%$ |
| test_items_nested_leaf | 0.1636ms | 84.7656μs | 11.7972 KOps/s | 11.4593 KOps/s | $\color{#35bf28}+2.95\\%$ |
| test_items_stack_nested | 0.4627ms | 0.3381ms | 2.9578 KOps/s | 2.8775 KOps/s | $\color{#35bf28}+2.79\\%$ |
| test_items_stack_nested_leaf | 0.1656ms | 86.9345μs | 11.5029 KOps/s | 11.2611 KOps/s | $\color{#35bf28}+2.15\\%$ |
| test_items_stack_nested_locked | 0.6110ms | 0.3399ms | 2.9417 KOps/s | 2.8664 KOps/s | $\color{#35bf28}+2.63\\%$ |
| test_keys | 27.6220μs | 3.8777μs | 257.8828 KOps/s | 246.4218 KOps/s | $\color{#35bf28}+4.65\\%$ |
| test_keys_nested | 0.2725ms | 0.1426ms | 7.0131 KOps/s | 6.7321 KOps/s | $\color{#35bf28}+4.17\\%$ |
| test_keys_nested_locked | 0.6868ms | 0.1495ms | 6.6904 KOps/s | 6.4929 KOps/s | $\color{#35bf28}+3.04\\%$ |
| test_keys_nested_leaf | 0.2443ms | 0.1235ms | 8.0983 KOps/s | 7.7488 KOps/s | $\color{#35bf28}+4.51\\%$ |
| test_keys_stack_nested | 0.2465ms | 0.1453ms | 6.8837 KOps/s | 6.6985 KOps/s | $\color{#35bf28}+2.77\\%$ |
| test_keys_stack_nested_leaf | 0.4011ms | 0.1227ms | 8.1516 KOps/s | 7.7773 KOps/s | $\color{#35bf28}+4.81\\%$ |
| test_keys_stack_nested_locked | 0.2565ms | 0.1479ms | 6.7612 KOps/s | 6.4624 KOps/s | $\color{#35bf28}+4.62\\%$ |
| test_values | 8.0600μs | 1.2014μs | 832.3294 KOps/s | 818.3503 KOps/s | $\color{#35bf28}+1.71\\%$ |
| test_values_nested | 92.8830μs | 50.4347μs | 19.8276 KOps/s | 19.0584 KOps/s | $\color{#35bf28}+4.04\\%$ |
| test_values_nested_locked | 0.1144ms | 50.1189μs | 19.9526 KOps/s | 19.1491 KOps/s | $\color{#35bf28}+4.20\\%$ |
| test_values_nested_leaf | 83.7770μs | 44.9517μs | 22.2461 KOps/s | 21.2515 KOps/s | $\color{#35bf28}+4.68\\%$ |
| test_values_stack_nested | 0.1011ms | 50.9917μs | 19.6110 KOps/s | 17.9424 KOps/s | $\textbf{\color{#35bf28}+9.30\\%}$ |
| test_values_stack_nested_leaf | 86.7220μs | 45.0286μs | 22.2081 KOps/s | 21.2023 KOps/s | $\color{#35bf28}+4.74\\%$ |
| test_values_stack_nested_locked | 96.5200μs | 50.5183μs | 19.7948 KOps/s | 18.8525 KOps/s | $\color{#35bf28}+5.00\\%$ |
| test_membership | 29.7250μs | 0.9111μs | 1.0976 MOps/s | 1.3344 MOps/s | $\textbf{\color{#d91a1a}-17.75\\%}$ |
| test_membership_nested | 46.2770μs | 2.5700μs | 389.0983 KOps/s | 364.8200 KOps/s | $\textbf{\color{#35bf28}+6.65\\%}$ |
| test_membership_nested_leaf | 25.7690μs | 2.5877μs | 386.4384 KOps/s | 366.6853 KOps/s | $\textbf{\color{#35bf28}+5.39\\%}$ |
| test_membership_stacked_nested | 21.0590μs | 2.5435μs | 393.1518 KOps/s | 369.0570 KOps/s | $\textbf{\color{#35bf28}+6.53\\%}$ |
| test_membership_stacked_nested_leaf | 26.1390μs | 2.5783μs | 387.8493 KOps/s | 367.3321 KOps/s | $\textbf{\color{#35bf28}+5.59\\%}$ |
| test_membership_nested_last | 36.9990μs | 3.8418μs | 260.2930 KOps/s | 236.8635 KOps/s | $\textbf{\color{#35bf28}+9.89\\%}$ |
| test_membership_nested_leaf_last | 26.8400μs | 3.8608μs | 259.0125 KOps/s | 242.4193 KOps/s | $\textbf{\color{#35bf28}+6.84\\%}$ |
| test_membership_stacked_nested_last | 21.7510μs | 3.8140μs | 262.1938 KOps/s | 176.3959 KOps/s | $\textbf{\color{#35bf28}+48.64\\%}$ |
| test_membership_stacked_nested_leaf_last | 24.4260μs | 3.8331μs | 260.8884 KOps/s | 174.6704 KOps/s | $\textbf{\color{#35bf28}+49.36\\%}$ |
| test_nested_getleaf | 66.9550μs | 10.4013μs | 96.1415 KOps/s | 93.8801 KOps/s | $\color{#35bf28}+2.41\\%$ |
| test_nested_get | 30.6370μs | 9.8781μs | 101.2345 KOps/s | 100.5112 KOps/s | $\color{#35bf28}+0.72\\%$ |
| test_stacked_getleaf | 42.5090μs | 10.4883μs | 95.3448 KOps/s | 94.5101 KOps/s | $\color{#35bf28}+0.88\\%$ |
| test_stacked_get | 51.3760μs | 9.8040μs | 101.9994 KOps/s | 99.1108 KOps/s | $\color{#35bf28}+2.91\\%$ |
| test_nested_getitemleaf | 34.6150μs | 10.9573μs | 91.2630 KOps/s | 90.4227 KOps/s | $\color{#35bf28}+0.93\\%$ |
| test_nested_getitem | 34.4650μs | 10.1396μs | 98.6235 KOps/s | 96.1791 KOps/s | $\color{#35bf28}+2.54\\%$ |
| test_stacked_getitemleaf | 35.6460μs | 10.8277μs | 92.3559 KOps/s | 90.9170 KOps/s | $\color{#35bf28}+1.58\\%$ |
| test_stacked_getitem | 35.3460μs | 9.9426μs | 100.5775 KOps/s | 97.6100 KOps/s | $\color{#35bf28}+3.04\\%$ |
| test_lock_nested | 79.2668ms | 0.5627ms | 1.7770 KOps/s | 1.9867 KOps/s | $\textbf{\color{#d91a1a}-10.55\\%}$ |
| test_lock_stack_nested | 1.0204ms | 0.4646ms | 2.1523 KOps/s | 2.1306 KOps/s | $\color{#35bf28}+1.02\\%$ |
| test_unlock_nested | 91.1892ms | 0.5037ms | 1.9852 KOps/s | 2.3641 KOps/s | $\textbf{\color{#d91a1a}-16.03\\%}$ |
| test_unlock_stack_nested | 0.6866ms | 0.3811ms | 2.6241 KOps/s | 2.5241 KOps/s | $\color{#35bf28}+3.96\\%$ |
| test_flatten_speed | 0.5659ms | 0.1036ms | 9.6509 KOps/s | 9.4507 KOps/s | $\color{#35bf28}+2.12\\%$ |
| test_unflatten_speed | 0.7617ms | 0.4241ms | 2.3578 KOps/s | 2.2995 KOps/s | $\color{#35bf28}+2.54\\%$ |
| test_common_ops | 3.7585ms | 1.1046ms | 905.2727 Ops/s | 940.4817 Ops/s | $\color{#d91a1a}-3.74\\%$ |
| test_creation | 26.9900μs | 2.0275μs | 493.2078 KOps/s | 478.8413 KOps/s | $\color{#35bf28}+3.00\\%$ |
| test_creation_empty | 57.9090μs | 18.4642μs | 54.1587 KOps/s | 62.3036 KOps/s | $\textbf{\color{#d91a1a}-13.07\\%}$ |
| test_creation_nested_1 | 59.2510μs | 21.7146μs | 46.0520 KOps/s | 51.4818 KOps/s | $\textbf{\color{#d91a1a}-10.55\\%}$ |
| test_creation_nested_2 | 74.2390μs | 25.2057μs | 39.6736 KOps/s | 44.0385 KOps/s | $\textbf{\color{#d91a1a}-9.91\\%}$ |
| test_clone | 96.7410μs | 17.3924μs | 57.4965 KOps/s | 58.5282 KOps/s | $\color{#d91a1a}-1.76\\%$ |
| test_getitem[int] | 1.2931ms | 17.8365μs | 56.0647 KOps/s | 60.0708 KOps/s | $\textbf{\color{#d91a1a}-6.67\\%}$ |
| test_getitem[slice_int] | 0.1249ms | 32.0218μs | 31.2287 KOps/s | 31.8239 KOps/s | $\color{#d91a1a}-1.87\\%$ |
| test_getitem[range] | 0.1546ms | 57.0312μs | 17.5343 KOps/s | 17.3355 KOps/s | $\color{#35bf28}+1.15\\%$ |
| test_getitem[tuple] | 0.1203ms | 26.4995μs | 37.7365 KOps/s | 39.8509 KOps/s | $\textbf{\color{#d91a1a}-5.31\\%}$ |
| test_getitem[list] | 0.2716ms | 52.0391μs | 19.2163 KOps/s | 19.1038 KOps/s | $\color{#35bf28}+0.59\\%$ |
| test_setitem_dim[int] | 93.1940μs | 41.1957μs | 24.2744 KOps/s | 26.0840 KOps/s | $\textbf{\color{#d91a1a}-6.94\\%}$ |
| test_setitem_dim[slice_int] | 0.1143ms | 72.7875μs | 13.7386 KOps/s | 14.6791 KOps/s | $\textbf{\color{#d91a1a}-6.41\\%}$ |
| test_setitem_dim[range] | 0.1568ms | 91.9861μs | 10.8712 KOps/s | 10.9545 KOps/s | $\color{#d91a1a}-0.76\\%$ |
| test_setitem_dim[tuple] | 0.1065ms | 59.6315μs | 16.7697 KOps/s | 17.7453 KOps/s | $\textbf{\color{#d91a1a}-5.50\\%}$ |
| test_setitem | 0.1400ms | 29.3009μs | 34.1286 KOps/s | 35.6150 KOps/s | $\color{#d91a1a}-4.17\\%$ |
| test_set | 0.1271ms | 28.9073μs | 34.5934 KOps/s | 36.7233 KOps/s | $\textbf{\color{#d91a1a}-5.80\\%}$ |
| test_set_shared | 4.5651ms | 0.2154ms | 4.6417 KOps/s | 4.5439 KOps/s | $\color{#35bf28}+2.15\\%$ |
| test_update | 0.1574ms | 35.1387μs | 28.4587 KOps/s | 30.0151 KOps/s | $\textbf{\color{#d91a1a}-5.19\\%}$ |
| test_update_nested | 0.1459ms | 46.2129μs | 21.6390 KOps/s | 22.6261 KOps/s | $\color{#d91a1a}-4.36\\%$ |
| test_update__nested | 0.1379ms | 34.5208μs | 28.9680 KOps/s | 27.8098 KOps/s | $\color{#35bf28}+4.16\\%$ |
| test_set_nested | 0.2024ms | 30.7500μs | 32.5204 KOps/s | 32.4722 KOps/s | $\color{#35bf28}+0.15\\%$ |
| test_set_nested_new | 0.1422ms | 35.7106μs | 28.0029 KOps/s | 28.4930 KOps/s | $\color{#d91a1a}-1.72\\%$ |
| test_select | 0.1412ms | 53.5498μs | 18.6742 KOps/s | 18.8682 KOps/s | $\color{#d91a1a}-1.03\\%$ |
| test_select_nested | 0.1145ms | 60.1062μs | 16.6372 KOps/s | 16.8219 KOps/s | $\color{#d91a1a}-1.10\\%$ |
| test_exclude_nested | 0.1447ms | 76.7432μs | 13.0305 KOps/s | 12.7498 KOps/s | $\color{#35bf28}+2.20\\%$ |
| test_empty[True] | 0.7137ms | 0.3202ms | 3.1227 KOps/s | 3.0167 KOps/s | $\color{#35bf28}+3.51\\%$ |
| test_empty[False] | 9.1812μs | 1.1674μs | 856.5968 KOps/s | 838.7343 KOps/s | $\color{#35bf28}+2.13\\%$ |
| test_unbind_speed | 0.6504ms | 0.3039ms | 3.2902 KOps/s | 3.2170 KOps/s | $\color{#35bf28}+2.27\\%$ |
| test_unbind_speed_stack0 | 0.4455ms | 0.2997ms | 3.3372 KOps/s | 3.3360 KOps/s | $\color{#35bf28}+0.03\\%$ |
| test_unbind_speed_stack1 | 89.6110ms | 0.7915ms | 1.2635 KOps/s | 1.3833 KOps/s | $\textbf{\color{#d91a1a}-8.66\\%}$ |
| test_split | 88.1828ms | 2.1549ms | 464.0621 Ops/s | 464.0866 Ops/s | $-0.01\\%$ |
| test_chunk | 95.1062ms | 2.1859ms | 457.4836 Ops/s | 464.0971 Ops/s | $\color{#d91a1a}-1.43\\%$ |
| test_creation[device0] | 0.2155ms | 0.1168ms | 8.5589 KOps/s | 8.2048 KOps/s | $\color{#35bf28}+4.32\\%$ |
| test_creation_from_tensor | 4.4703ms | 0.1203ms | 8.3116 KOps/s | 8.3691 KOps/s | $\color{#d91a1a}-0.69\\%$ |
| test_add_one[memmap_tensor0] | 0.1667ms | 8.1450μs | 122.7746 KOps/s | 124.5626 KOps/s | $\color{#d91a1a}-1.44\\%$ |
| test_contiguous[memmap_tensor0] | 18.5340μs | 2.0177μs | 495.6214 KOps/s | 493.6206 KOps/s | $\color{#35bf28}+0.41\\%$ |
| test_stack[memmap_tensor0] | 53.6900μs | 5.7079μs | 175.1967 KOps/s | 172.3754 KOps/s | $\color{#35bf28}+1.64\\%$ |
| test_memmaptd_index | 1.1161ms | 0.4081ms | 2.4506 KOps/s | 2.4564 KOps/s | $\color{#d91a1a}-0.24\\%$ |
| test_memmaptd_index_astensor | 0.7524ms | 0.4878ms | 2.0500 KOps/s | 1.9977 KOps/s | $\color{#35bf28}+2.62\\%$ |
| test_memmaptd_index_op | 3.0862ms | 1.0572ms | 945.8967 Ops/s | 979.3310 Ops/s | $\color{#d91a1a}-3.41\\%$ |
| test_serialize_model | 0.1244s | 0.1196s | 8.3636 Ops/s | 8.4390 Ops/s | $\color{#d91a1a}-0.89\\%$ |
| test_serialize_model_pickle | 0.4298s | 0.3918s | 2.5520 Ops/s | 2.4871 Ops/s | $\color{#35bf28}+2.61\\%$ |
| test_serialize_weights | 0.1250s | 0.1153s | 8.6742 Ops/s | 7.4177 Ops/s | $\textbf{\color{#35bf28}+16.94\\%}$ |
| test_serialize_weights_returnearly | 0.1673s | 0.1580s | 6.3311 Ops/s | 6.3918 Ops/s | $\color{#d91a1a}-0.95\\%$ |
| test_serialize_weights_pickle | 0.4867s | 0.4126s | 2.4238 Ops/s | 2.4860 Ops/s | $\color{#d91a1a}-2.50\\%$ |
| test_serialize_weights_filesystem | 0.2400s | 0.1514s | 6.6033 Ops/s | 7.1273 Ops/s | $\textbf{\color{#d91a1a}-7.35\\%}$ |
| test_serialize_model_filesystem | 0.1542s | 0.1450s | 6.8978 Ops/s | 6.6993 Ops/s | $\color{#35bf28}+2.96\\%$ |
| test_reshape_pytree | 82.7950μs | 39.1691μs | 25.5303 KOps/s | 24.4974 KOps/s | $\color{#35bf28}+4.22\\%$ |
| test_reshape_td | 0.1069ms | 46.7695μs | 21.3814 KOps/s | 20.9394 KOps/s | $\color{#35bf28}+2.11\\%$ |
| test_view_pytree | 0.1050ms | 39.9654μs | 25.0217 KOps/s | 25.2348 KOps/s | $\color{#d91a1a}-0.84\\%$ |
| test_view_td | 0.1078ms | 54.2109μs | 18.4465 KOps/s | 18.9111 KOps/s | $\color{#d91a1a}-2.46\\%$ |
| test_unbind_pytree | 76.7640μs | 36.7781μs | 27.1901 KOps/s | 26.6976 KOps/s | $\color{#35bf28}+1.84\\%$ |
| test_unbind_td | 0.3212ms | 45.4655μs | 21.9947 KOps/s | 21.8904 KOps/s | $\color{#35bf28}+0.48\\%$ |
| test_split_pytree | 94.9980μs | 39.5754μs | 25.2682 KOps/s | 24.3452 KOps/s | $\color{#35bf28}+3.79\\%$ |
| test_split_td | 0.5377ms | 58.2939μs | 17.1545 KOps/s | 17.0630 KOps/s | $\color{#35bf28}+0.54\\%$ |
| test_add_pytree | 0.1154ms | 46.5770μs | 21.4698 KOps/s | 21.5166 KOps/s | $\color{#d91a1a}-0.22\\%$ |
| test_add_td | 0.1819ms | 79.5402μs | 12.5723 KOps/s | 12.1537 KOps/s | $\color{#35bf28}+3.44\\%$ |
| test_compile_add_one_nested[tensordict-compile] | 0.1185ms | 55.7589μs | 17.9344 KOps/s | 18.3443 KOps/s | $\color{#d91a1a}-2.23\\%$ |
| test_compile_add_one_nested[tensordict-eager] | 0.3180ms | 0.1843ms | 5.4257 KOps/s | 5.1187 KOps/s | $\textbf{\color{#35bf28}+6.00\\%}$ |
| test_compile_add_one_nested[pytree-compile] | 0.1440ms | 55.4698μs | 18.0278 KOps/s | 18.2481 KOps/s | $\color{#d91a1a}-1.21\\%$ |
| test_compile_add_one_nested[pytree-eager] | 0.6573ms | 0.1488ms | 6.7198 KOps/s | 6.5662 KOps/s | $\color{#35bf28}+2.34\\%$ |
| test_compile_copy_nested[tensordict-compile] | 68.9690μs | 20.5070μs | 48.7639 KOps/s | 47.3720 KOps/s | $\color{#35bf28}+2.94\\%$ |
| test_compile_copy_nested[tensordict-eager] | 0.1248ms | 63.7831μs | 15.6781 KOps/s | 15.1954 KOps/s | $\color{#35bf28}+3.18\\%$ |
| test_compile_copy_nested[pytree-compile] | 0.1532ms | 79.4313μs | 12.5895 KOps/s | 12.5172 KOps/s | $\color{#35bf28}+0.58\\%$ |
| test_compile_copy_nested[pytree-eager] | 0.2694ms | 72.1414μs | 13.8617 KOps/s | 13.5911 KOps/s | $\color{#35bf28}+1.99\\%$ |
| test_compile_add_one_flat[tensordict-compile] | 0.3676ms | 0.1753ms | 5.7050 KOps/s | 5.6935 KOps/s | $\color{#35bf28}+0.20\\%$ |
| test_compile_add_one_flat[tensordict-eager] | 0.4230ms | 0.1917ms | 5.2158 KOps/s | 5.0662 KOps/s | $\color{#35bf28}+2.95\\%$ |
| test_compile_add_one_flat[tensorclass-compile] | 0.1090ms | 39.6628μs | 25.2125 KOps/s | 25.2386 KOps/s | $\color{#d91a1a}-0.10\\%$ |
| test_compile_add_one_flat[tensorclass-eager] | 0.4721ms | 69.5276μs | 14.3828 KOps/s | 14.2409 KOps/s | $\color{#35bf28}+1.00\\%$ |
| test_compile_add_one_flat[pytree-compile] | 0.3389ms | 0.1752ms | 5.7064 KOps/s | 5.7647 KOps/s | $\color{#d91a1a}-1.01\\%$ |
| test_compile_add_one_flat[pytree-eager] | 0.5258ms | 0.2986ms | 3.3488 KOps/s | 3.3455 KOps/s | $\color{#35bf28}+0.10\\%$ |
| test_compile_add_self_flat[tensordict-eager] | 0.4828ms | 0.2070ms | 4.8303 KOps/s | 4.6615 KOps/s | $\color{#35bf28}+3.62\\%$ |
| test_compile_add_self_flat[tensordict-compile] | 0.2520ms | 0.1733ms | 5.7691 KOps/s | 5.6776 KOps/s | $\color{#35bf28}+1.61\\%$ |
| test_compile_add_self_flat[tensorclass-eager] | 0.2753ms | 61.7008μs | 16.2072 KOps/s | 15.6269 KOps/s | $\color{#35bf28}+3.71\\%$ |
| test_compile_add_self_flat[tensorclass-compile] | 0.1171ms | 40.4590μs | 24.7164 KOps/s | 23.8690 KOps/s | $\color{#35bf28}+3.55\\%$ |
| test_compile_add_self_flat[pytree-eager] | 0.3402ms | 0.2415ms | 4.1400 KOps/s | 4.0263 KOps/s | $\color{#35bf28}+2.82\\%$ |
| test_compile_add_self_flat[pytree-compile] | 0.2711ms | 0.1720ms | 5.8138 KOps/s | 5.7024 KOps/s | $\color{#35bf28}+1.95\\%$ |
| test_compile_copy_flat[tensordict-compile] | 0.1915ms | 0.1070ms | 9.3492 KOps/s | 9.0825 KOps/s | $\color{#35bf28}+2.94\\%$ |
| test_compile_copy_flat[tensordict-eager] | 0.1138ms | 55.0483μs | 18.1659 KOps/s | 16.9990 KOps/s | $\textbf{\color{#35bf28}+6.86\\%}$ |
| test_compile_copy_flat[pytree-compile] | 0.1552ms | 78.2456μs | 12.7803 KOps/s | 12.0898 KOps/s | $\textbf{\color{#35bf28}+5.71\\%}$ |
| test_compile_copy_flat[pytree-eager] | 0.1333ms | 70.3908μs | 14.2064 KOps/s | 13.4669 KOps/s | $\textbf{\color{#35bf28}+5.49\\%}$ |
| test_compile_assign_and_add[tensordict-compile] | 0.2764ms | 0.1966ms | 5.0857 KOps/s | 5.2762 KOps/s | $\color{#d91a1a}-3.61\\%$ |
| test_compile_assign_and_add[tensordict-eager] | 1.8455ms | 1.5921ms | 628.1134 Ops/s | 586.1264 Ops/s | $\textbf{\color{#35bf28}+7.16\\%}$ |
| test_compile_assign_and_add[pytree-compile] | 0.3074ms | 0.1926ms | 5.1926 KOps/s | 5.1342 KOps/s | $\color{#35bf28}+1.14\\%$ |
| test_compile_assign_and_add[pytree-eager] | 1.8304ms | 1.1105ms | 900.5154 Ops/s | 888.6729 Ops/s | $\color{#35bf28}+1.33\\%$ |
| test_compile_assign_and_add_stack[compile] | 0.6587ms | 0.4211ms | 2.3750 KOps/s | 2.3531 KOps/s | $\color{#35bf28}+0.93\\%$ |
| test_compile_assign_and_add_stack[eager] | 5.0871ms | 3.7813ms | 264.4626 Ops/s | 267.9703 Ops/s | $\color{#d91a1a}-1.31\\%$ |
| test_compile_indexing[tensor-tensordict-compile] | 89.7880μs | 33.9595μs | 29.4469 KOps/s | 30.0942 KOps/s | $\color{#d91a1a}-2.15\\%$ |
| test_compile_indexing[tensor-tensordict-eager] | 1.1767ms | 49.7305μs | 20.1084 KOps/s | 20.6256 KOps/s | $\color{#d91a1a}-2.51\\%$ |
| test_compile_indexing[tensor-tensorclass-compile] | 80.6010μs | 30.0038μs | 33.3291 KOps/s | 33.9611 KOps/s | $\color{#d91a1a}-1.86\\%$ |
| test_compile_indexing[tensor-tensorclass-eager] | 76.0720μs | 30.9896μs | 32.2689 KOps/s | 31.6407 KOps/s | $\color{#35bf28}+1.99\\%$ |
| test_compile_indexing[tensor-pytree-compile] | 0.1246ms | 29.7264μs | 33.6401 KOps/s | 32.4398 KOps/s | $\color{#35bf28}+3.70\\%$ |
| test_compile_indexing[tensor-pytree-eager] | 96.2300μs | 31.9529μs | 31.2961 KOps/s | 32.9508 KOps/s | $\textbf{\color{#d91a1a}-5.02\\%}$ |
| test_compile_indexing[slice-tensordict-compile] | 0.1480ms | 73.1510μs | 13.6704 KOps/s | 13.8546 KOps/s | $\color{#d91a1a}-1.33\\%$ |
| test_compile_indexing[slice-tensordict-eager] | 0.5306ms | 27.8276μs | 35.9356 KOps/s | 36.0577 KOps/s | $\color{#d91a1a}-0.34\\%$ |
| test_compile_indexing[slice-tensorclass-compile] | 0.1639ms | 67.8613μs | 14.7359 KOps/s | 14.8535 KOps/s | $\color{#d91a1a}-0.79\\%$ |
| test_compile_indexing[slice-tensorclass-eager] | 76.8230μs | 25.0061μs | 39.9902 KOps/s | 39.6574 KOps/s | $\color{#35bf28}+0.84\\%$ |
| test_compile_indexing[slice-pytree-compile] | 0.1414ms | 67.7809μs | 14.7534 KOps/s | 15.0386 KOps/s | $\color{#d91a1a}-1.90\\%$ |
| test_compile_indexing[slice-pytree-eager] | 77.5550μs | 24.6148μs | 40.6260 KOps/s | 41.0831 KOps/s | $\color{#d91a1a}-1.11\\%$ |
| test_compile_indexing[int-tensordict-compile] | 0.1411ms | 72.1420μs | 13.8616 KOps/s | 14.0304 KOps/s | $\color{#d91a1a}-1.20\\%$ |
| test_compile_indexing[int-tensordict-eager] | 1.0276ms | 27.9218μs | 35.8143 KOps/s | 35.5906 KOps/s | $\color{#35bf28}+0.63\\%$ |
| test_compile_indexing[int-tensorclass-compile] | 0.1454ms | 67.7161μs | 14.7675 KOps/s | 15.1133 KOps/s | $\color{#d91a1a}-2.29\\%$ |
| test_compile_indexing[int-tensorclass-eager] | 68.8180μs | 25.1132μs | 39.8196 KOps/s | 41.6847 KOps/s | $\color{#d91a1a}-4.47\\%$ |
| test_compile_indexing[int-pytree-compile] | 0.1533ms | 67.8562μs | 14.7371 KOps/s | 15.1844 KOps/s | $\color{#d91a1a}-2.95\\%$ |
| test_compile_indexing[int-pytree-eager] | 0.5001ms | 24.8928μs | 40.1723 KOps/s | 40.8316 KOps/s | $\color{#d91a1a}-1.61\\%$ |
| test_mod_add[eager] | 90.7930μs | 24.4674μs | 40.8707 KOps/s | 41.8768 KOps/s | $\color{#d91a1a}-2.40\\%$ |
| test_mod_add[compile] | 89.1470μs | 38.4940μs | 25.9780 KOps/s | 25.7309 KOps/s | $\color{#35bf28}+0.96\\%$ |
| test_mod_add[compile-overhead] | 0.1059ms | 38.7781μs | 25.7878 KOps/s | 26.2779 KOps/s | $\color{#d91a1a}-1.87\\%$ |
| test_mod_wrap[eager] | 0.4102ms | 0.2030ms | 4.9258 KOps/s | 4.9513 KOps/s | $\color{#d91a1a}-0.51\\%$ |
| test_mod_wrap[compile] | 1.5711ms | 0.2261ms | 4.4226 KOps/s | 4.2929 KOps/s | $\color{#35bf28}+3.02\\%$ |
| test_mod_wrap[compile-overhead] | 0.4286ms | 0.2218ms | 4.5080 KOps/s | 4.3369 KOps/s | $\color{#35bf28}+3.95\\%$ |
| test_mod_wrap_and_backward[eager] | 12.5199ms | 11.0335ms | 90.6329 Ops/s | 83.2980 Ops/s | $\textbf{\color{#35bf28}+8.81\\%}$ |
| test_mod_wrap_and_backward[compile] | 14.4707ms | 11.5380ms | 86.6703 Ops/s | 84.0433 Ops/s | $\color{#35bf28}+3.13\\%$ |
| test_mod_wrap_and_backward[compile-overhead] | 16.4656ms | 11.9658ms | 83.5718 Ops/s | 78.6951 Ops/s | $\textbf{\color{#35bf28}+6.20\\%}$ |
| test_seq_add[eager] | 0.1985ms | 87.2336μs | 11.4635 KOps/s | 11.9316 KOps/s | $\color{#d91a1a}-3.92\\%$ |
| test_seq_add[compile] | 0.1575ms | 60.7460μs | 16.4620 KOps/s | 16.1001 KOps/s | $\color{#35bf28}+2.25\\%$ |
| test_seq_add[compile-overhead] | 0.2855ms | 63.5725μs | 15.7301 KOps/s | 16.0530 KOps/s | $\color{#d91a1a}-2.01\\%$ |
| test_seq_wrap[eager] | 0.6563ms | 0.3789ms | 2.6395 KOps/s | 2.7336 KOps/s | $\color{#d91a1a}-3.44\\%$ |
| test_seq_wrap[compile] | 0.6605ms | 0.2654ms | 3.7680 KOps/s | 3.7322 KOps/s | $\color{#35bf28}+0.96\\%$ |
| test_seq_wrap[compile-overhead] | 0.4669ms | 0.2602ms | 3.8431 KOps/s | 3.7343 KOps/s | $\color{#35bf28}+2.91\\%$ |
| test_func_call_runtime[False-eager] | 0.9958ms | 0.5249ms | 1.9050 KOps/s | 1.9276 KOps/s | $\color{#d91a1a}-1.17\\%$ |
| test_func_call_runtime[False-compile] | 0.7863ms | 0.4974ms | 2.0104 KOps/s | 1.9769 KOps/s | $\color{#35bf28}+1.70\\%$ |
| test_func_call_runtime[False-compile-overhead] | 0.9006ms | 0.4972ms | 2.0114 KOps/s | 1.9614 KOps/s | $\color{#35bf28}+2.55\\%$ |
| test_func_call_runtime[True-eager] | 1.3441ms | 0.7428ms | 1.3462 KOps/s | 1.3241 KOps/s | $\color{#35bf28}+1.67\\%$ |
| test_func_call_runtime[True-compile] | 0.9432ms | 0.5087ms | 1.9658 KOps/s | 1.9222 KOps/s | $\color{#35bf28}+2.27\\%$ |
| test_func_call_runtime[True-compile-overhead] | 0.6812ms | 0.5075ms | 1.9706 KOps/s | 1.9237 KOps/s | $\color{#35bf28}+2.43\\%$ |
| test_func_call_cm_runtime[False-eager] | 0.8467ms | 0.5203ms | 1.9219 KOps/s | 1.9578 KOps/s | $\color{#d91a1a}-1.84\\%$ |
| test_func_call_cm_runtime[False-compile] | 0.8164ms | 0.4917ms | 2.0340 KOps/s | 1.9822 KOps/s | $\color{#35bf28}+2.61\\%$ |
| test_func_call_cm_runtime[False-compile-overhead] | 0.6128ms | 0.4948ms | 2.0211 KOps/s | 1.9773 KOps/s | $\color{#35bf28}+2.21\\%$ |
| test_func_call_cm_runtime[True-eager] | 1.4864ms | 0.8916ms | 1.1216 KOps/s | 1.1166 KOps/s | $\color{#35bf28}+0.45\\%$ |
| test_func_call_cm_runtime[True-compile] | 0.9690ms | 0.8279ms | 1.2079 KOps/s | 1.1917 KOps/s | $\color{#35bf28}+1.36\\%$ |
| test_func_call_cm_runtime[True-compile-overhead] | 1.3249ms | 0.8289ms | 1.2064 KOps/s | 1.2074 KOps/s | $\color{#d91a1a}-0.08\\%$ |
| test_distributed | 0.2800ms | 0.1290ms | 7.7537 KOps/s | 7.6199 KOps/s | $\color{#35bf28}+1.75\\%$ |
| test_tdmodule | 78.7370μs | 17.6701μs | 56.5928 KOps/s | 63.3632 KOps/s | $\textbf{\color{#d91a1a}-10.69\\%}$ |
| test_tdmodule_dispatch | 59.9620μs | 35.9816μs | 27.7920 KOps/s | 29.8217 KOps/s | $\textbf{\color{#d91a1a}-6.81\\%}$ |
| test_tdseq | 35.1660μs | 19.6820μs | 50.8079 KOps/s | 54.7690 KOps/s | $\textbf{\color{#d91a1a}-7.23\\%}$ |
| test_tdseq_dispatch | 0.1259ms | 40.2088μs | 24.8702 KOps/s | 26.5392 KOps/s | $\textbf{\color{#d91a1a}-6.29\\%}$ |
| test_instantiation_functorch | 3.8269ms | 1.6680ms | 599.5138 Ops/s | 604.0178 Ops/s | $\color{#d91a1a}-0.75\\%$ |
| test_instantiation_td | 1.8238ms | 1.1631ms | 859.7720 Ops/s | 824.6138 Ops/s | $\color{#35bf28}+4.26\\%$ |
| test_exec_functorch | 0.3434ms | 0.1831ms | 5.4609 KOps/s | 5.5572 KOps/s | $\color{#d91a1a}-1.73\\%$ |
| test_exec_functional_call | 0.4311ms | 0.1724ms | 5.8003 KOps/s | 6.0548 KOps/s | $\color{#d91a1a}-4.20\\%$ |
| test_exec_td | 0.4189ms | 0.1754ms | 5.7016 KOps/s | 5.9636 KOps/s | $\color{#d91a1a}-4.39\\%$ |
| test_exec_td_decorator | 1.0853ms | 0.2278ms | 4.3893 KOps/s | 4.4735 KOps/s | $\color{#d91a1a}-1.88\\%$ |
| test_vmap_mlp_speed[True-True] | 0.8594ms | 0.6115ms | 1.6354 KOps/s | 1.6851 KOps/s | $\color{#d91a1a}-2.95\\%$ |
| test_vmap_mlp_speed[True-False] | 0.8199ms | 0.5982ms | 1.6716 KOps/s | 1.7071 KOps/s | $\color{#d91a1a}-2.08\\%$ |
| test_vmap_mlp_speed[False-True] | 0.9387ms | 0.4945ms | 2.0221 KOps/s | 2.0435 KOps/s | $\color{#d91a1a}-1.04\\%$ |
| test_vmap_mlp_speed[False-False] | 0.8002ms | 0.4954ms | 2.0185 KOps/s | 1.9983 KOps/s | $\color{#35bf28}+1.01\\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.4652ms | 0.6495ms | 1.5397 KOps/s | 1.5505 KOps/s | $\color{#d91a1a}-0.70\\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.8507ms | 0.6489ms | 1.5412 KOps/s | 1.5558 KOps/s | $\color{#d91a1a}-0.94\\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.8426ms | 0.5384ms | 1.8575 KOps/s | 1.8482 KOps/s | $\color{#35bf28}+0.50\\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.7598ms | 0.5390ms | 1.8551 KOps/s | 1.8562 KOps/s | $\color{#d91a1a}-0.06\\%$ |
| test_to_module_speed[True] | 2.3055ms | 1.3227ms | 756.0242 Ops/s | 741.5466 Ops/s | $\color{#35bf28}+1.95\\%$ |
| test_to_module_speed[False] | 1.8683ms | 1.2837ms | 778.9819 Ops/s | 748.7184 Ops/s | $\color{#35bf28}+4.04\\%$ |
| test_tc_init | 80.1300μs | 43.3489μs | 23.0686 KOps/s | 23.9356 KOps/s | $\color{#d91a1a}-3.62\\%$ |
| test_tc_init_nested | 0.4202ms | 94.8087μs | 10.5476 KOps/s | 11.7852 KOps/s | $\textbf{\color{#d91a1a}-10.50\\%}$ |
| test_tc_first_layer_tensor | 24.0250μs | 1.4193μs | 704.5896 KOps/s | 649.0673 KOps/s | $\textbf{\color{#35bf28}+8.55\\%}$ |
| test_tc_first_layer_nontensor | 49.0620μs | 4.2470μs | 235.4580 KOps/s | 222.4921 KOps/s | $\textbf{\color{#35bf28}+5.83\\%}$ |
| test_tc_second_layer_tensor | 67.7240μs | 2.6358μs | 379.3920 KOps/s | 356.5819 KOps/s | $\textbf{\color{#35bf28}+6.40\\%}$ |
| test_tc_second_layer_nontensor | 42.6830μs | 5.4546μs | 183.3317 KOps/s | 172.3188 KOps/s | $\textbf{\color{#35bf28}+6.39\\%}$ |
| test_unbind | 0.4485s | 13.4001ms | 74.6264 Ops/s | 70.9855 Ops/s | $\textbf{\color{#35bf28}+5.13\\%}$ |
| test_full_like | 8.4922ms | 7.0213ms | 142.4229 Ops/s | 129.4597 Ops/s | $\textbf{\color{#35bf28}+10.01\\%}$ |
| test_zeros_like | 12.3732ms | 6.4386ms | 155.3134 Ops/s | 137.9929 Ops/s | $\textbf{\color{#35bf28}+12.55\\%}$ |
| test_ones_like | 15.0390ms | 7.7423ms | 129.1598 Ops/s | 139.2847 Ops/s | $\textbf{\color{#d91a1a}-7.27\\%}$ |
| test_clone | 17.1076ms | 9.4334ms | 106.0065 Ops/s | 111.4172 Ops/s | $\color{#d91a1a}-4.86\\%$ |
| test_squeeze | 69.1590μs | 12.6663μs | 78.9498 KOps/s | 76.9918 KOps/s | $\color{#35bf28}+2.54\\%$ |
| test_unsqueeze | 0.3647ms | 95.3069μs | 10.4924 KOps/s | 10.5126 KOps/s | $\color{#d91a1a}-0.19\\%$ |
| test_split | 0.3522ms | 0.1998ms | 5.0054 KOps/s | 5.0042 KOps/s | $\color{#35bf28}+0.02\\%$ |
| test_permute | 0.3542ms | 0.2177ms | 4.5934 KOps/s | 4.4829 KOps/s | $\color{#35bf28}+2.46\\%$ |
| test_stack | 34.0708ms | 24.2134ms | 41.2995 Ops/s | 39.8914 Ops/s | $\color{#35bf28}+3.53\\%$ |
| test_cat | 29.6181ms | 24.3035ms | 41.1464 Ops/s | 40.8694 Ops/s | $\color{#35bf28}+0.68\\%$ |

github-actions[bot] commented 1 month ago

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 225. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}34$.

| Name | Max | Mean | Ops | Ops on Repo `HEAD` | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 0.5779ms | 17.6768μs | 56.5712 KOps/s | 60.5595 KOps/s | $\textbf{\color{#d91a1a}-6.59\\%}$ |
| test_plain_set_stack_nested | 46.8810μs | 17.6338μs | 56.7093 KOps/s | 60.5960 KOps/s | $\textbf{\color{#d91a1a}-6.41\\%}$ |
| test_plain_set_nested_inplace | 84.6420μs | 18.6488μs | 53.6227 KOps/s | 57.0114 KOps/s | $\textbf{\color{#d91a1a}-5.94\\%}$ |
| test_plain_set_stack_nested_inplace | 57.5720μs | 18.6549μs | 53.6051 KOps/s | 57.0598 KOps/s | $\textbf{\color{#d91a1a}-6.05\\%}$ |
| test_items | 18.7100μs | 4.7078μs | 212.4112 KOps/s | 211.9880 KOps/s | $\color{#35bf28}+0.20\\%$ |
| test_items_nested | 0.3962ms | 0.3658ms | 2.7335 KOps/s | 2.7028 KOps/s | $\color{#35bf28}+1.14\\%$ |
| test_items_nested_locked | 0.3892ms | 0.3657ms | 2.7343 KOps/s | 2.7291 KOps/s | $\color{#35bf28}+0.19\\%$ |
| test_items_nested_leaf | 0.1027ms | 85.0852μs | 11.7529 KOps/s | 11.9017 KOps/s | $\color{#d91a1a}-1.25\\%$ |
| test_items_stack_nested | 0.4235ms | 0.3708ms | 2.6969 KOps/s | 2.7599 KOps/s | $\color{#d91a1a}-2.28\\%$ |
| test_items_stack_nested_leaf | 0.1052ms | 85.7074μs | 11.6676 KOps/s | 11.7966 KOps/s | $\color{#d91a1a}-1.09\\%$ |
| test_items_stack_nested_locked | 0.4264ms | 0.3675ms | 2.7211 KOps/s | 2.7428 KOps/s | $\color{#d91a1a}-0.79\\%$ |
| test_keys | 22.4510μs | 4.3796μs | 228.3336 KOps/s | 229.5540 KOps/s | $\color{#d91a1a}-0.53\\%$ |
| test_keys_nested | 89.0520μs | 66.4766μs | 15.0429 KOps/s | 15.3799 KOps/s | $\color{#d91a1a}-2.19\\%$ |
| test_keys_nested_locked | 0.7936ms | 73.4926μs | 13.6068 KOps/s | 13.7917 KOps/s | $\color{#d91a1a}-1.34\\%$ |
| test_keys_nested_leaf | 76.7910μs | 58.0945μs | 17.2133 KOps/s | 17.9287 KOps/s | $\color{#d91a1a}-3.99\\%$ |
| test_keys_stack_nested | 96.3220μs | 67.9942μs | 14.7071 KOps/s | 14.9863 KOps/s | $\color{#d91a1a}-1.86\\%$ |
| test_keys_stack_nested_leaf | 85.9410μs | 59.0717μs | 16.9286 KOps/s | 17.4276 KOps/s | $\color{#d91a1a}-2.86\\%$ |
| test_keys_stack_nested_locked | 97.7120μs | 74.2258μs | 13.4724 KOps/s | 13.9011 KOps/s | $\color{#d91a1a}-3.08\\%$ |
| test_values | 9.7567μs | 1.7791μs | 562.0662 KOps/s | 566.6037 KOps/s | $\color{#d91a1a}-0.80\\%$ |
| test_values_nested | 63.4110μs | 34.0970μs | 29.3281 KOps/s | 29.4786 KOps/s | $\color{#d91a1a}-0.51\\%$ |
| test_values_nested_locked | 56.0310μs | 36.3457μs | 27.5136 KOps/s | 27.8895 KOps/s | $\color{#d91a1a}-1.35\\%$ |
| test_values_nested_leaf | 53.7010μs | 30.2134μs | 33.0979 KOps/s | 33.3046 KOps/s | $\color{#d91a1a}-0.62\\%$ |
| test_values_stack_nested | 51.4610μs | 34.4215μs | 29.0516 KOps/s | 29.2077 KOps/s | $\color{#d91a1a}-0.53\\%$ |
| test_values_stack_nested_leaf | 58.2110μs | 30.2915μs | 33.0125 KOps/s | 33.0998 KOps/s | $\color{#d91a1a}-0.26\\%$ |
| test_values_stack_nested_locked | 59.1610μs | 36.0502μs | 27.7391 KOps/s | 27.8639 KOps/s | $\color{#d91a1a}-0.45\\%$ |
| test_membership | 1.6160μs | 0.5723μs | 1.7474 MOps/s | 1.8279 MOps/s | $\color{#d91a1a}-4.41\\%$ |
| test_membership_nested | 24.2600μs | 2.0503μs | 487.7232 KOps/s | 512.0006 KOps/s | $\color{#d91a1a}-4.74\\%$ |
| test_membership_nested_leaf | 10.2905μs | 1.9913μs | 502.1884 KOps/s | 498.0599 KOps/s | $\color{#35bf28}+0.83\\%$ |
| test_membership_stacked_nested | 19.4510μs | 2.0436μs | 489.3296 KOps/s | 492.2615 KOps/s | $\color{#d91a1a}-0.60\\%$ |
| test_membership_stacked_nested_leaf | 15.9110μs | 2.0519μs | 487.3580 KOps/s | 496.1831 KOps/s | $\color{#d91a1a}-1.78\\%$ |
| test_membership_nested_last | 31.4210μs | 3.0000μs | 333.3386 KOps/s | 341.5020 KOps/s | $\color{#d91a1a}-2.39\\%$ |
| test_membership_nested_leaf_last | 18.0400μs | 3.0097μs | 332.2645 KOps/s | 346.0077 KOps/s | $\color{#d91a1a}-3.97\\%$ |
| test_membership_stacked_nested_last | 21.4900μs | 2.9897μs | 334.4871 KOps/s | 334.7363 KOps/s | $\color{#d91a1a}-0.07\\%$ |
| test_membership_stacked_nested_leaf_last | 32.3100μs | 2.9916μs | 334.2662 KOps/s | 340.7868 KOps/s | $\color{#d91a1a}-1.91\\%$ |
| test_nested_getleaf | 30.9700μs | 7.9386μs | 125.9661 KOps/s | 125.4571 KOps/s | $\color{#35bf28}+0.41\\%$ |
| test_nested_get | 41.1210μs | 7.5247μs | 132.8949 KOps/s | 133.9190 KOps/s | $\color{#d91a1a}-0.76\\%$ |
| test_stacked_getleaf | 33.8400μs | 8.0219μs | 124.6592 KOps/s | 126.0195 KOps/s | $\color{#d91a1a}-1.08\\%$ |
| test_stacked_get | 24.8800μs | 7.5494μs | 132.4615 KOps/s | 132.5427 KOps/s | $\color{#d91a1a}-0.06\\%$ |
| test_nested_getitemleaf | 29.8810μs | 8.0937μs | 123.5527 KOps/s | 123.4969 KOps/s | $\color{#35bf28}+0.05\\%$ |
| test_nested_getitem | 27.0000μs | 7.6505μs | 130.7111 KOps/s | 129.9761 KOps/s | $\color{#35bf28}+0.57\\%$ |
| test_stacked_getitemleaf | 35.7510μs | 8.1317μs | 122.9756 KOps/s | 123.4790 KOps/s | $\color{#d91a1a}-0.41\\%$ |
| test_stacked_getitem | 38.4810μs | 7.6783μs | 130.2374 KOps/s | 129.9875 KOps/s | $\color{#35bf28}+0.19\\%$ |
| test_lock_nested | 1.2741ms | 0.4806ms | 2.0807 KOps/s | 2.1102 KOps/s | $\color{#d91a1a}-1.40\\%$ |
| test_lock_stack_nested | 0.4891ms | 0.4446ms | 2.2492 KOps/s | 2.2852 KOps/s | $\color{#d91a1a}-1.57\\%$ |
| test_unlock_nested | 0.8300ms | 0.4045ms | 2.4722 KOps/s | 2.5344 KOps/s | $\color{#d91a1a}-2.46\\%$ |
| test_unlock_stack_nested | 0.3896ms | 0.3654ms | 2.7365 KOps/s | 2.7893 KOps/s | $\color{#d91a1a}-1.90\\%$ |
| test_flatten_speed | 0.1910ms | 0.1042ms | 9.5980 KOps/s | 9.7785 KOps/s | $\color{#d91a1a}-1.85\\%$ |
| test_unflatten_speed | 0.3253ms | 0.2872ms | 3.4818 KOps/s | 3.4959 KOps/s | $\color{#d91a1a}-0.40\\%$ |
| test_common_ops | 1.6119ms | 1.3626ms | 733.9098 Ops/s | 764.7570 Ops/s | $\color{#d91a1a}-4.03\\%$ |
| test_creation | 14.4810μs | 1.6632μs | 601.2379 KOps/s | 602.2001 KOps/s | $\color{#d91a1a}-0.16\\%$ |
| test_creation_empty | 44.7210μs | 18.4340μs | 54.2476 KOps/s | 61.3900 KOps/s | $\textbf{\color{#d91a1a}-11.63\\%}$ |
| test_creation_nested_1 | 40.5410μs | 20.6242μs | 48.4867 KOps/s | 54.7298 KOps/s | $\textbf{\color{#d91a1a}-11.41\\%}$ |
| test_creation_nested_2 | 45.4410μs | 23.1020μs | 43.2863 KOps/s | 48.2807 KOps/s | $\textbf{\color{#d91a1a}-10.34\\%}$ |
| test_clone | 0.1778ms | 30.2068μs | 33.1052 KOps/s | 32.9640 KOps/s | $\color{#35bf28}+0.43\\%$ |
| test_getitem[int] | 1.1128ms | 17.6843μs | 56.5474 KOps/s | 58.3814 KOps/s | $\color{#d91a1a}-3.14\\%$ |
| test_getitem[slice_int] | 0.1461ms | 30.9748μs | 32.2843 KOps/s | 33.9214 KOps/s | $\color{#d91a1a}-4.83\\%$ |
| test_getitem[range] | 0.2658ms | 0.1166ms | 8.5780 KOps/s | 8.6530 KOps/s | $\color{#d91a1a}-0.87\\%$ |
| test_getitem[tuple] | 0.1344ms | 25.9943μs | 38.4699 KOps/s | 39.6645 KOps/s | $\color{#d91a1a}-3.01\\%$ |
| test_getitem[list] | 0.2506ms | 0.1042ms | 9.5987 KOps/s | 9.4370 KOps/s | $\color{#35bf28}+1.71\\%$ |
| test_setitem_dim[int] | 74.2210μs | 56.3662μs | 17.7411 KOps/s | 18.7775 KOps/s | $\textbf{\color{#d91a1a}-5.52\\%}$ |
| test_setitem_dim[slice_int] | 0.1228ms | 86.8637μs | 11.5123 KOps/s | 12.7036 KOps/s | $\textbf{\color{#d91a1a}-9.38\\%}$ |
| test_setitem_dim[range] | 0.1756ms | 0.1510ms | 6.6229 KOps/s | 7.0202 KOps/s | $\textbf{\color{#d91a1a}-5.66\\%}$ |
| test_setitem_dim[tuple] | 0.1066ms | 79.4255μs | 12.5904 KOps/s | 14.0734 KOps/s | $\textbf{\color{#d91a1a}-10.54\\%}$ |
| test_setitem | 0.2070ms | 48.6314μs | 20.5629 KOps/s | 22.7891 KOps/s | $\textbf{\color{#d91a1a}-9.77\\%}$ |
| test_set | 0.2178ms | 46.8713μs | 21.3350 KOps/s | 23.6729 KOps/s | $\textbf{\color{#d91a1a}-9.88\\%}$ |
| test_set_shared | 0.3783ms | 54.3204μs | 18.4093 KOps/s | 18.2034 KOps/s | $\color{#35bf28}+1.13\\%$ |
| test_update | 0.2423ms | 55.3196μs | 18.0768 KOps/s | 19.8120 KOps/s | $\textbf{\color{#d91a1a}-8.76\\%}$ |
| test_update_nested | 0.2039ms | 60.9498μs | 16.4069 KOps/s | 17.3295 KOps/s | $\textbf{\color{#d91a1a}-5.32\\%}$ |
test_update__nested | 0.2332ms | 62.9293μs | 15.8908 KOps/s | 16.1632 KOps/s | $\color{#d91a1a}-1.68\\%$ | | test_set_nested | 0.2071ms | 50.4159μs | 19.8350 KOps/s | 22.3781 KOps/s | $\textbf{\color{#d91a1a}-11.36\\%}$ | | test_set_nested_new | 0.1991ms | 53.8313μs | 18.5765 KOps/s | 19.9837 KOps/s | $\textbf{\color{#d91a1a}-7.04\\%}$ | | test_select | 0.1034ms | 69.0836μs | 14.4752 KOps/s | 14.9755 KOps/s | $\color{#d91a1a}-3.34\\%$ | | test_select_nested | 71.7820μs | 51.3999μs | 19.4553 KOps/s | 19.4613 KOps/s | $\color{#d91a1a}-0.03\\%$ | | test_exclude_nested | 94.1320μs | 71.0400μs | 14.0766 KOps/s | 14.3084 KOps/s | $\color{#d91a1a}-1.62\\%$ | | test_empty[True] | 0.3562ms | 0.2866ms | 3.4888 KOps/s | 3.5121 KOps/s | $\color{#d91a1a}-0.66\\%$ | | test_empty[False] | 2.4321μs | 0.8852μs | 1.1297 MOps/s | 1.1592 MOps/s | $\color{#d91a1a}-2.54\\%$ | | test_to | 61.6010μs | 41.5538μs | 24.0652 KOps/s | 25.1756 KOps/s | $\color{#d91a1a}-4.41\\%$ | | test_to_nonblocking | 0.1466ms | 28.2942μs | 35.3429 KOps/s | 38.7720 KOps/s | $\textbf{\color{#d91a1a}-8.84\\%}$ | | test_unbind_speed | 0.3599ms | 0.3132ms | 3.1933 KOps/s | 3.2207 KOps/s | $\color{#d91a1a}-0.85\\%$ | | test_unbind_speed_stack0 | 0.4420ms | 0.3139ms | 3.1862 KOps/s | 3.2725 KOps/s | $\color{#d91a1a}-2.64\\%$ | | test_unbind_speed_stack1 | 92.6566ms | 0.7992ms | 1.2513 KOps/s | 1.2623 KOps/s | $\color{#d91a1a}-0.87\\%$ | | test_split | 93.7002ms | 2.3829ms | 419.6485 Ops/s | 427.3976 Ops/s | $\color{#d91a1a}-1.81\\%$ | | test_chunk | 94.3853ms | 2.3798ms | 420.2047 Ops/s | 423.4569 Ops/s | $\color{#d91a1a}-0.77\\%$ | | test_creation[device0] | 0.1594ms | 0.1059ms | 9.4403 KOps/s | 9.3780 KOps/s | $\color{#35bf28}+0.66\\%$ | | test_creation_from_tensor | 0.1619ms | 0.1039ms | 9.6211 KOps/s | 9.6831 KOps/s | $\color{#d91a1a}-0.64\\%$ | | test_add_one[memmap_tensor0] | 63.9820μs | 9.1656μs | 109.1035 KOps/s | 106.8955 KOps/s | $\color{#35bf28}+2.07\\%$ | | test_contiguous[memmap_tensor0] | 27.0510μs | 
2.2575μs | 442.9587 KOps/s | 445.6562 KOps/s | $\color{#d91a1a}-0.61\\%$ | | test_stack[memmap_tensor0] | 35.7310μs | 6.9370μs | 144.1547 KOps/s | 138.4374 KOps/s | $\color{#35bf28}+4.13\\%$ | | test_memmaptd_index | 1.2174ms | 0.4601ms | 2.1735 KOps/s | 2.2889 KOps/s | $\textbf{\color{#d91a1a}-5.04\\%}$ | | test_memmaptd_index_astensor | 0.8364ms | 0.5267ms | 1.8985 KOps/s | 1.9674 KOps/s | $\color{#d91a1a}-3.50\\%$ | | test_memmaptd_index_op | 1.5229ms | 1.1200ms | 892.8432 Ops/s | 942.1066 Ops/s | $\textbf{\color{#d91a1a}-5.23\\%}$ | | test_serialize_model | 94.9215ms | 90.4087ms | 11.0609 Ops/s | 10.8105 Ops/s | $\color{#35bf28}+2.32\\%$ | | test_serialize_model_pickle | 1.3462s | 1.2361s | 0.8090 Ops/s | 0.8081 Ops/s | $\color{#35bf28}+0.11\\%$ | | test_serialize_weights | 0.1841s | 96.7575ms | 10.3351 Ops/s | 11.0799 Ops/s | $\textbf{\color{#d91a1a}-6.72\\%}$ | | test_serialize_weights_returnearly | 0.2837s | 68.6986ms | 14.5563 Ops/s | 16.0928 Ops/s | $\textbf{\color{#d91a1a}-9.55\\%}$ | | test_serialize_weights_pickle | 1.3541s | 1.2370s | 0.8084 Ops/s | 0.8087 Ops/s | $\color{#d91a1a}-0.03\\%$ | | test_reshape_pytree | 83.0620μs | 39.6089μs | 25.2468 KOps/s | 26.1213 KOps/s | $\color{#d91a1a}-3.35\\%$ | | test_reshape_td | 0.1937ms | 46.9909μs | 21.2807 KOps/s | 22.9986 KOps/s | $\textbf{\color{#d91a1a}-7.47\\%}$ | | test_view_pytree | 0.2827ms | 39.3423μs | 25.4180 KOps/s | 26.4026 KOps/s | $\color{#d91a1a}-3.73\\%$ | | test_view_td | 0.2221ms | 53.5373μs | 18.6786 KOps/s | 20.2940 KOps/s | $\textbf{\color{#d91a1a}-7.96\\%}$ | | test_unbind_pytree | 0.2370ms | 38.5058μs | 25.9701 KOps/s | 26.1200 KOps/s | $\color{#d91a1a}-0.57\\%$ | | test_unbind_td | 0.4255ms | 46.9691μs | 21.2906 KOps/s | 21.5006 KOps/s | $\color{#d91a1a}-0.98\\%$ | | test_split_pytree | 0.4302ms | 51.8577μs | 19.2836 KOps/s | 19.3587 KOps/s | $\color{#d91a1a}-0.39\\%$ | | test_split_td | 0.1678ms | 60.9413μs | 16.4092 KOps/s | 13.8248 KOps/s | $\textbf{\color{#35bf28}+18.69\\%}$ | | 
test_add_pytree | 0.2675ms | 61.0677μs | 16.3753 KOps/s | 16.2072 KOps/s | $\color{#35bf28}+1.04\\%$ | | test_add_td | 0.2973ms | 99.6668μs | 10.0334 KOps/s | 9.8914 KOps/s | $\color{#35bf28}+1.44\\%$ | | test_compile_add_one_nested[tensordict-compile] | 0.4145ms | 0.2117ms | 4.7233 KOps/s | 4.6211 KOps/s | $\color{#35bf28}+2.21\\%$ | | test_compile_add_one_nested[tensordict-eager] | 0.3059ms | 0.1750ms | 5.7130 KOps/s | 5.7853 KOps/s | $\color{#d91a1a}-1.25\\%$ | | test_compile_add_one_nested[pytree-compile] | 0.2995ms | 0.1499ms | 6.6713 KOps/s | 6.6800 KOps/s | $\color{#d91a1a}-0.13\\%$ | | test_compile_add_one_nested[pytree-eager] | 0.2396ms | 0.2007ms | 4.9815 KOps/s | 5.0147 KOps/s | $\color{#d91a1a}-0.66\\%$ | | test_compile_copy_nested[tensordict-compile] | 99.0220μs | 22.6909μs | 44.0705 KOps/s | 44.7633 KOps/s | $\color{#d91a1a}-1.55\\%$ | | test_compile_copy_nested[tensordict-eager] | 70.3120μs | 49.0496μs | 20.3875 KOps/s | 20.6258 KOps/s | $\color{#d91a1a}-1.16\\%$ | | test_compile_copy_nested[pytree-compile] | 0.1002ms | 74.0626μs | 13.5021 KOps/s | 13.6681 KOps/s | $\color{#d91a1a}-1.21\\%$ | | test_compile_copy_nested[pytree-eager] | 78.4020μs | 60.3946μs | 16.5578 KOps/s | 16.9700 KOps/s | $\color{#d91a1a}-2.43\\%$ | | test_compile_add_one_flat[tensordict-compile] | 0.5771ms | 0.3446ms | 2.9017 KOps/s | 2.9994 KOps/s | $\color{#d91a1a}-3.26\\%$ | | test_compile_add_one_flat[tensordict-eager] | 0.4429ms | 0.2264ms | 4.4178 KOps/s | 4.5024 KOps/s | $\color{#d91a1a}-1.88\\%$ | | test_compile_add_one_flat[tensorclass-compile] | 0.3831ms | 0.1346ms | 7.4272 KOps/s | 7.2507 KOps/s | $\color{#35bf28}+2.43\\%$ | | test_compile_add_one_flat[tensorclass-eager] | 0.2740ms | 63.3577μs | 15.7834 KOps/s | 15.9479 KOps/s | $\color{#d91a1a}-1.03\\%$ | | test_compile_add_one_flat[pytree-compile] | 0.3901ms | 0.3346ms | 2.9889 KOps/s | 3.0008 KOps/s | $\color{#d91a1a}-0.40\\%$ | | test_compile_add_one_flat[pytree-eager] | 0.8790ms | 0.6557ms | 1.5252 KOps/s | 1.5253 
KOps/s | $-0.01\\%$ | | test_compile_add_self_flat[tensordict-eager] | 0.4716ms | 0.2765ms | 3.6172 KOps/s | 3.6446 KOps/s | $\color{#d91a1a}-0.75\\%$ | | test_compile_add_self_flat[tensordict-compile] | 0.5899ms | 0.3363ms | 2.9737 KOps/s | 2.9753 KOps/s | $\color{#d91a1a}-0.05\\%$ | | test_compile_add_self_flat[tensorclass-eager] | 0.3025ms | 76.5323μs | 13.0664 KOps/s | 13.1805 KOps/s | $\color{#d91a1a}-0.87\\%$ | | test_compile_add_self_flat[tensorclass-compile] | 0.3866ms | 0.1346ms | 7.4280 KOps/s | 7.4102 KOps/s | $\color{#35bf28}+0.24\\%$ | | test_compile_add_self_flat[pytree-eager] | 0.7863ms | 0.5666ms | 1.7648 KOps/s | 1.8009 KOps/s | $\color{#d91a1a}-2.01\\%$ | | test_compile_add_self_flat[pytree-compile] | 0.4093ms | 0.3319ms | 3.0132 KOps/s | 2.9650 KOps/s | $\color{#35bf28}+1.63\\%$ | | test_compile_copy_flat[tensordict-compile] | 0.2400ms | 19.0702μs | 52.4378 KOps/s | 49.3513 KOps/s | $\textbf{\color{#35bf28}+6.25\\%}$ | | test_compile_copy_flat[tensordict-eager] | 0.2412ms | 32.4646μs | 30.8028 KOps/s | 31.3751 KOps/s | $\color{#d91a1a}-1.82\\%$ | | test_compile_copy_flat[pytree-compile] | 0.1053ms | 76.8290μs | 13.0159 KOps/s | 12.9826 KOps/s | $\color{#35bf28}+0.26\\%$ | | test_compile_copy_flat[pytree-eager] | 0.2922ms | 60.7031μs | 16.4736 KOps/s | 16.4022 KOps/s | $\color{#35bf28}+0.44\\%$ | | test_compile_assign_and_add[tensordict-compile] | 2.6795ms | 0.9648ms | 1.0365 KOps/s | 1.0578 KOps/s | $\color{#d91a1a}-2.01\\%$ | | test_compile_assign_and_add[tensordict-eager] | 4.0509ms | 3.5433ms | 282.2223 Ops/s | 289.1024 Ops/s | $\color{#d91a1a}-2.38\\%$ | | test_compile_assign_and_add[pytree-compile] | 2.6211ms | 0.9450ms | 1.0582 KOps/s | 1.0617 KOps/s | $\color{#d91a1a}-0.33\\%$ | | test_compile_assign_and_add[pytree-eager] | 3.6143ms | 3.3960ms | 294.4650 Ops/s | 291.2900 Ops/s | $\color{#35bf28}+1.09\\%$ | | test_compile_indexing[tensor-tensordict-compile] | 0.1545ms | 0.1139ms | 8.7813 KOps/s | 8.7177 KOps/s | $\color{#35bf28}+0.73\\%$ | 
| test_compile_indexing[tensor-tensordict-eager] | 0.2381ms | 67.5531μs | 14.8032 KOps/s | 15.5813 KOps/s | $\color{#d91a1a}-4.99\\%$ | | test_compile_indexing[tensor-tensorclass-compile] | 0.1618ms | 0.1070ms | 9.3437 KOps/s | 9.3274 KOps/s | $\color{#35bf28}+0.18\\%$ | | test_compile_indexing[tensor-tensorclass-eager] | 0.2026ms | 48.7018μs | 20.5331 KOps/s | 21.3225 KOps/s | $\color{#d91a1a}-3.70\\%$ | | test_compile_indexing[tensor-pytree-compile] | 0.2587ms | 0.1115ms | 8.9721 KOps/s | 9.3828 KOps/s | $\color{#d91a1a}-4.38\\%$ | | test_compile_indexing[tensor-pytree-eager] | 85.6220μs | 50.2480μs | 19.9013 KOps/s | 21.3960 KOps/s | $\textbf{\color{#d91a1a}-6.99\\%}$ | | test_compile_indexing[slice-tensordict-compile] | 0.2261ms | 0.1442ms | 6.9337 KOps/s | 6.9188 KOps/s | $\color{#35bf28}+0.21\\%$ | | test_compile_indexing[slice-tensordict-eager] | 0.1965ms | 28.1990μs | 35.4623 KOps/s | 35.9425 KOps/s | $\color{#d91a1a}-1.34\\%$ | | test_compile_indexing[slice-tensorclass-compile] | 0.1677ms | 0.1363ms | 7.3355 KOps/s | 7.3270 KOps/s | $\color{#35bf28}+0.12\\%$ | | test_compile_indexing[slice-tensorclass-eager] | 89.1510μs | 24.4212μs | 40.9481 KOps/s | 43.4902 KOps/s | $\textbf{\color{#d91a1a}-5.85\\%}$ | | test_compile_indexing[slice-pytree-compile] | 0.3083ms | 0.1402ms | 7.1339 KOps/s | 7.3480 KOps/s | $\color{#d91a1a}-2.91\\%$ | | test_compile_indexing[slice-pytree-eager] | 64.4110μs | 24.8312μs | 40.2718 KOps/s | 43.0425 KOps/s | $\textbf{\color{#d91a1a}-6.44\\%}$ | | test_compile_indexing[int-tensordict-compile] | 0.2515ms | 0.1438ms | 6.9549 KOps/s | 6.8884 KOps/s | $\color{#35bf28}+0.97\\%$ | | test_compile_indexing[int-tensordict-eager] | 0.4566ms | 27.5517μs | 36.2954 KOps/s | 36.8052 KOps/s | $\color{#d91a1a}-1.39\\%$ | | test_compile_indexing[int-tensorclass-compile] | 0.2769ms | 0.1363ms | 7.3354 KOps/s | 7.2411 KOps/s | $\color{#35bf28}+1.30\\%$ | | test_compile_indexing[int-tensorclass-eager] | 0.1814ms | 25.0005μs | 39.9991 KOps/s | 43.6139 
KOps/s | $\textbf{\color{#d91a1a}-8.29\\%}$ | | test_compile_indexing[int-pytree-compile] | 0.2553ms | 0.1412ms | 7.0843 KOps/s | 7.3052 KOps/s | $\color{#d91a1a}-3.02\\%$ | | test_compile_indexing[int-pytree-eager] | 0.3962ms | 25.0935μs | 39.8509 KOps/s | 43.1191 KOps/s | $\textbf{\color{#d91a1a}-7.58\\%}$ | | test_mod_add[eager] | 0.1800ms | 40.3318μs | 24.7943 KOps/s | 26.2461 KOps/s | $\textbf{\color{#d91a1a}-5.53\\%}$ | | test_mod_add[compile] | 0.1329ms | 73.3362μs | 13.6358 KOps/s | 14.3507 KOps/s | $\color{#d91a1a}-4.98\\%$ | | test_mod_add[compile-overhead] | 0.2605ms | 0.1516ms | 6.5948 KOps/s | 6.4581 KOps/s | $\color{#35bf28}+2.12\\%$ | | test_mod_wrap[eager] | 0.4104ms | 0.2597ms | 3.8502 KOps/s | 3.8366 KOps/s | $\color{#35bf28}+0.35\\%$ | | test_mod_wrap[compile] | 0.4407ms | 0.2978ms | 3.3578 KOps/s | 3.3735 KOps/s | $\color{#d91a1a}-0.47\\%$ | | test_mod_wrap[compile-overhead] | 7.9440ms | 4.2875ms | 233.2369 Ops/s | 229.6982 Ops/s | $\color{#35bf28}+1.54\\%$ | | test_mod_wrap_and_backward[eager] | 1.6057ms | 1.4737ms | 678.5598 Ops/s | 673.4567 Ops/s | $\color{#35bf28}+0.76\\%$ | | test_mod_wrap_and_backward[compile] | 1.6689ms | 1.4841ms | 673.8315 Ops/s | 682.0745 Ops/s | $\color{#d91a1a}-1.21\\%$ | | test_mod_wrap_and_backward[compile-overhead] | 1.4827ms | 1.0141ms | 986.1406 Ops/s | 989.4755 Ops/s | $\color{#d91a1a}-0.34\\%$ | | test_seq_add[eager] | 0.2326ms | 0.1158ms | 8.6348 KOps/s | 8.9957 KOps/s | $\color{#d91a1a}-4.01\\%$ | | test_seq_add[compile] | 0.1322ms | 88.4712μs | 11.3031 KOps/s | 11.6262 KOps/s | $\color{#d91a1a}-2.78\\%$ | | test_seq_add[compile-overhead] | 0.2868ms | 0.1279ms | 7.8178 KOps/s | 8.0018 KOps/s | $\color{#d91a1a}-2.30\\%$ | | test_seq_wrap[eager] | 0.5147ms | 0.4347ms | 2.3007 KOps/s | 2.3500 KOps/s | $\color{#d91a1a}-2.10\\%$ | | test_seq_wrap[compile] | 0.4742ms | 0.3292ms | 3.0375 KOps/s | 3.0319 KOps/s | $\color{#35bf28}+0.18\\%$ | | test_seq_wrap[compile-overhead] | 0.3138s | 0.1496s | 6.6849 Ops/s | 
6.6567 Ops/s | $\color{#35bf28}+0.42\\%$ | | test_func_call_runtime[False-eager] | 0.9969ms | 0.8070ms | 1.2392 KOps/s | 1.3000 KOps/s | $\color{#d91a1a}-4.67\\%$ | | test_func_call_runtime[False-compile] | 0.9607ms | 0.8265ms | 1.2100 KOps/s | 1.1980 KOps/s | $\color{#35bf28}+1.00\\%$ | | test_func_call_runtime[False-compile-overhead] | 0.4097ms | 0.3720ms | 2.6884 KOps/s | 2.6920 KOps/s | $\color{#d91a1a}-0.13\\%$ | | test_func_call_runtime[True-eager] | 1.0766ms | 0.9635ms | 1.0379 KOps/s | 1.0285 KOps/s | $\color{#35bf28}+0.92\\%$ | | test_func_call_runtime[True-compile] | 1.0058ms | 0.8680ms | 1.1520 KOps/s | 1.1515 KOps/s | $\color{#35bf28}+0.04\\%$ | | test_func_call_runtime[True-compile-overhead] | 0.5550ms | 0.4146ms | 2.4121 KOps/s | 2.4313 KOps/s | $\color{#d91a1a}-0.79\\%$ | | test_func_call_cm_runtime[False-eager] | 0.9153ms | 0.8022ms | 1.2466 KOps/s | 1.2956 KOps/s | $\color{#d91a1a}-3.78\\%$ | | test_func_call_cm_runtime[False-compile] | 0.9546ms | 0.8279ms | 1.2079 KOps/s | 1.2176 KOps/s | $\color{#d91a1a}-0.80\\%$ | | test_func_call_cm_runtime[False-compile-overhead] | 0.4183ms | 0.3713ms | 2.6935 KOps/s | 2.6922 KOps/s | $\color{#35bf28}+0.05\\%$ | | test_func_call_cm_runtime[True-eager] | 1.2071ms | 1.0728ms | 932.1385 Ops/s | 922.5362 Ops/s | $\color{#35bf28}+1.04\\%$ | | test_func_call_cm_runtime[True-compile] | 1.2232ms | 1.0413ms | 960.3523 Ops/s | 951.1821 Ops/s | $\color{#35bf28}+0.96\\%$ | | test_func_call_cm_runtime[True-compile-overhead] | 1.1201ms | 1.0397ms | 961.7834 Ops/s | 954.5928 Ops/s | $\color{#35bf28}+0.75\\%$ | | test_distributed | 0.2257ms | 72.1670μs | 13.8568 KOps/s | 14.3818 KOps/s | $\color{#d91a1a}-3.65\\%$ | | test_tdmodule | 89.4820μs | 16.9856μs | 58.8735 KOps/s | 65.3118 KOps/s | $\textbf{\color{#d91a1a}-9.86\\%}$ | | test_tdmodule_dispatch | 50.7910μs | 34.0445μs | 29.3733 KOps/s | 32.1742 KOps/s | $\textbf{\color{#d91a1a}-8.71\\%}$ | | test_tdseq | 39.5210μs | 17.7789μs | 56.2464 KOps/s | 63.1888 KOps/s | 
$\textbf{\color{#d91a1a}-10.99\\%}$ | | test_tdseq_dispatch | 60.4410μs | 36.6657μs | 27.2734 KOps/s | 29.6432 KOps/s | $\textbf{\color{#d91a1a}-7.99\\%}$ | | test_instantiation_functorch | 2.1361ms | 2.0621ms | 484.9343 Ops/s | 481.6338 Ops/s | $\color{#35bf28}+0.69\\%$ | | test_instantiation_td | 2.0709ms | 1.3296ms | 752.1213 Ops/s | 753.1117 Ops/s | $\color{#d91a1a}-0.13\\%$ | | test_exec_functorch | 0.2753ms | 0.2372ms | 4.2152 KOps/s | 4.3363 KOps/s | $\color{#d91a1a}-2.79\\%$ | | test_exec_functional_call | 0.3961ms | 0.2364ms | 4.2297 KOps/s | 4.4028 KOps/s | $\color{#d91a1a}-3.93\\%$ | | test_exec_td | 0.3381ms | 0.2285ms | 4.3773 KOps/s | 4.4601 KOps/s | $\color{#d91a1a}-1.86\\%$ | | test_exec_td_decorator | 0.4004ms | 0.2818ms | 3.5485 KOps/s | 3.5455 KOps/s | $\color{#35bf28}+0.09\\%$ | | test_vmap_mlp_speed[True-True] | 1.0580ms | 0.6920ms | 1.4451 KOps/s | 1.4755 KOps/s | $\color{#d91a1a}-2.06\\%$ | | test_vmap_mlp_speed[True-False] | 0.8136ms | 0.6787ms | 1.4735 KOps/s | 1.4861 KOps/s | $\color{#d91a1a}-0.85\\%$ | | test_vmap_mlp_speed[False-True] | 0.7511ms | 0.6131ms | 1.6311 KOps/s | 1.6795 KOps/s | $\color{#d91a1a}-2.88\\%$ | | test_vmap_mlp_speed[False-False] | 0.7649ms | 0.5933ms | 1.6856 KOps/s | 1.6821 KOps/s | $\color{#35bf28}+0.21\\%$ | | test_vmap_mlp_speed_decorator[True-True] | 1.3520ms | 0.7269ms | 1.3757 KOps/s | 1.3786 KOps/s | $\color{#d91a1a}-0.21\\%$ | | test_vmap_mlp_speed_decorator[True-False] | 0.8641ms | 0.7270ms | 1.3756 KOps/s | 1.3756 KOps/s | $-0.00\\%$ | | test_vmap_mlp_speed_decorator[False-True] | 0.8343ms | 0.6528ms | 1.5318 KOps/s | 1.5671 KOps/s | $\color{#d91a1a}-2.25\\%$ | | test_vmap_mlp_speed_decorator[False-False] | 0.8113ms | 0.6586ms | 1.5183 KOps/s | 1.5782 KOps/s | $\color{#d91a1a}-3.79\\%$ | | test_vmap_transformer_speed[True-True] | 9.1859ms | 8.9206ms | 112.1002 Ops/s | 111.7117 Ops/s | $\color{#35bf28}+0.35\\%$ | | test_vmap_transformer_speed[True-False] | 9.0831ms | 8.8828ms | 112.5765 Ops/s | 112.3512 
Ops/s | $\color{#35bf28}+0.20\\%$ | | test_vmap_transformer_speed[False-True] | 9.0403ms | 8.8256ms | 113.3069 Ops/s | 113.0581 Ops/s | $\color{#35bf28}+0.22\\%$ | | test_vmap_transformer_speed[False-False] | 8.9713ms | 8.8278ms | 113.2785 Ops/s | 112.8804 Ops/s | $\color{#35bf28}+0.35\\%$ | | test_vmap_transformer_speed_decorator[True-True] | 21.6132ms | 20.8564ms | 47.9468 Ops/s | 47.7950 Ops/s | $\color{#35bf28}+0.32\\%$ | | test_vmap_transformer_speed_decorator[True-False] | 21.0483ms | 20.7827ms | 48.1169 Ops/s | 48.1303 Ops/s | $\color{#d91a1a}-0.03\\%$ | | test_vmap_transformer_speed_decorator[False-True] | 21.3546ms | 20.6360ms | 48.4589 Ops/s | 48.3758 Ops/s | $\color{#35bf28}+0.17\\%$ | | test_vmap_transformer_speed_decorator[False-False] | 21.6746ms | 20.6672ms | 48.3858 Ops/s | 48.4594 Ops/s | $\color{#d91a1a}-0.15\\%$ | | test_to_module_speed[True] | 1.6901ms | 1.1525ms | 867.6842 Ops/s | 872.8804 Ops/s | $\color{#d91a1a}-0.60\\%$ | | test_to_module_speed[False] | 1.5051ms | 1.1092ms | 901.5740 Ops/s | 890.2963 Ops/s | $\color{#35bf28}+1.27\\%$ | | test_tc_init | 82.8720μs | 40.1419μs | 24.9116 KOps/s | 25.1324 KOps/s | $\color{#d91a1a}-0.88\\%$ | | test_tc_init_nested | 0.1518ms | 78.9593μs | 12.6647 KOps/s | 12.7398 KOps/s | $\color{#d91a1a}-0.59\\%$ | | test_tc_first_layer_tensor | 3.7800μs | 0.8157μs | 1.2259 MOps/s | 1.2720 MOps/s | $\color{#d91a1a}-3.62\\%$ | | test_tc_first_layer_nontensor | 22.0400μs | 2.5571μs | 391.0664 KOps/s | 389.2335 KOps/s | $\color{#35bf28}+0.47\\%$ | | test_tc_second_layer_tensor | 35.6840μs | 1.6268μs | 614.6987 KOps/s | 626.6775 KOps/s | $\color{#d91a1a}-1.91\\%$ | | test_tc_second_layer_nontensor | 86.5510μs | 3.4146μs | 292.8579 KOps/s | 292.8123 KOps/s | $\color{#35bf28}+0.02\\%$ | | test_unbind | 0.3382s | 12.8288ms | 77.9496 Ops/s | 77.6283 Ops/s | $\color{#35bf28}+0.41\\%$ | | test_full_like | 0.7883ms | 0.5792ms | 1.7265 KOps/s | 1.7322 KOps/s | $\color{#d91a1a}-0.33\\%$ | | test_zeros_like | 0.2688ms | 
0.1976ms | 5.0601 KOps/s | 5.0570 KOps/s | $\color{#35bf28}+0.06\\%$ | | test_ones_like | 0.3772ms | 0.1975ms | 5.0639 KOps/s | 5.0606 KOps/s | $\color{#35bf28}+0.07\\%$ | | test_clone | 0.5844ms | 0.4143ms | 2.4137 KOps/s | 2.4145 KOps/s | $\color{#d91a1a}-0.03\\%$ | | test_squeeze | 30.4910μs | 11.0302μs | 90.6604 KOps/s | 88.7785 KOps/s | $\color{#35bf28}+2.12\\%$ | | test_unsqueeze | 0.2465ms | 79.9339μs | 12.5103 KOps/s | 12.3375 KOps/s | $\color{#35bf28}+1.40\\%$ | | test_split | 0.4485ms | 0.1811ms | 5.5209 KOps/s | 5.6018 KOps/s | $\color{#d91a1a}-1.44\\%$ | | test_permute | 0.3311ms | 0.1964ms | 5.0916 KOps/s | 5.2023 KOps/s | $\color{#d91a1a}-2.13\\%$ | | test_stack | 1.2497ms | 0.9187ms | 1.0885 KOps/s | 1.1123 KOps/s | $\color{#d91a1a}-2.14\\%$ | | test_cat | 1.3680ms | 1.2316ms | 811.9460 Ops/s | 812.0742 Ops/s | $\color{#d91a1a}-0.02\\%$ |
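
The Change column appears to be the relative difference between the Ops column and the Ops on Repo `HEAD` column. Here is a minimal sketch of that arithmetic, checked against three rows of the table. The function names `percent_change` and `format_change` are hypothetical and not part of the benchmark tooling, and both inputs are assumed to be expressed in the same unit.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of this run versus the HEAD baseline, in percent.

    Assumes both values use the same unit (e.g. both in KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def format_change(ops: float, ops_head: float) -> str:
    """Render the change with an explicit sign and two decimals, as in the table."""
    return f"{percent_change(ops, ops_head):+.2f}%"


# Values copied from the table rows above.
print(format_change(16.4092, 13.8248))    # test_split_td
print(format_change(54.2476, 61.3900))    # test_creation_empty
print(format_change(811.9460, 812.0742))  # test_cat
```

A negative value means this run completed fewer operations per second than the baseline. Several of the larger swings come from microsecond-scale benchmarks, so a rerun may help separate noise from a real regression.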