dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0
21 stars 8 forks source link

Function cost set to 1 #75

Closed chenyan-dfinity closed 1 year ago

chenyan-dfinity commented 1 year ago
github-actions[bot] commented 1 year ago

Note Diffing the performance result against the published result from main branch. Unchanged benchmarks are omitted.

Map

binary_size generate 1m max mem batch_get 50 batch_put 50 batch_remove 50
hashmap 133_747 ($\textcolor{green}{-0.06\%}$) 7_926_149_584 ($\textcolor{red}{13.88\%}$) 61_987_732 325_829 ($\textcolor{red}{13.34\%}$) 6_271_119_292 ($\textcolor{red}{13.69\%}$) 350_741 ($\textcolor{red}{13.52\%}$)
triemap 135_152 ($\textcolor{green}{-0.12\%}$) 13_189_507_728 ($\textcolor{red}{15.38\%}$) 74_216_052 259_251 ($\textcolor{red}{16.38\%}$) 631_012 ($\textcolor{red}{15.22\%}$) 620_421 ($\textcolor{red}{15.11\%}$)
rbtree 135_826 ($\textcolor{green}{-0.21\%}$) 6_809_377_524 ($\textcolor{red}{13.88\%}$) 57_995_940 101_646 ($\textcolor{red}{14.34\%}$) 304_897 ($\textcolor{red}{13.53\%}$) 317_953 ($\textcolor{red}{14.23\%}$)
splay 131_671 ($\textcolor{green}{-0.15\%}$) 13_700_785_810 ($\textcolor{red}{18.43\%}$) 53_995_876 654_897 ($\textcolor{red}{18.66\%}$) 689_828 ($\textcolor{red}{18.60\%}$) 958_656 ($\textcolor{red}{18.32\%}$)
btree 176_191 ($\textcolor{green}{-0.15\%}$) 9_543_512_697 ($\textcolor{red}{16.04\%}$) 31_103_892 325_152 ($\textcolor{red}{17.16\%}$) 447_533 ($\textcolor{red}{16.49\%}$) 498_944 ($\textcolor{red}{16.29\%}$)
zhenya_hashmap 141_675 ($\textcolor{green}{-0.02\%}$) 3_049_829_583 ($\textcolor{red}{15.83\%}$) 65_987_480 77_261 ($\textcolor{red}{18.25\%}$) 94_211 ($\textcolor{red}{17.54\%}$) 110_378 ($\textcolor{red}{16.48\%}$)
btreemap_rs 419_497 ($\textcolor{red}{1.46\%}$) 1_786_796_287 ($\textcolor{red}{8.31\%}$) 13_762_560 74_944 ($\textcolor{red}{12.17\%}$) 124_024 ($\textcolor{red}{10.48\%}$) 90_045 ($\textcolor{red}{10.81\%}$)
imrc_hashmap_rs 419_185 ($\textcolor{red}{1.35\%}$) 2_553_471_915 ($\textcolor{red}{7.03\%}$) 122_454_016 37_685 ($\textcolor{red}{14.73\%}$) 176_853 ($\textcolor{red}{8.69\%}$) 110_291 ($\textcolor{red}{11.98\%}$)
hashmap_rs 412_968 ($\textcolor{red}{1.69\%}$) 428_765_280 ($\textcolor{red}{9.21\%}$) 36_536_320 21_020 ($\textcolor{red}{27.41\%}$) 26_013 ($\textcolor{red}{24.68\%}$) 24_296 ($\textcolor{red}{21.64\%}$)

Priority queue

binary_size heapify 1m max mem pop_min 50 put 50
heap 127_685 ($\textcolor{green}{-0.05\%}$) 5_564_624_316 ($\textcolor{red}{18.79\%}$) 29_995_836 608_968 ($\textcolor{red}{19.06\%}$) 223_368 ($\textcolor{red}{19.79\%}$)
heap_rs 410_604 ($\textcolor{red}{1.65\%}$) 140_169_329 ($\textcolor{red}{13.86\%}$) 9_109_504 58_401 ($\textcolor{red}{9.53\%}$) 22_633 ($\textcolor{red}{24.78\%}$)

Growable array

binary_size generate 5k max mem batch_get 500 batch_put 500 batch_remove 500
buffer 135_394 ($\textcolor{green}{-0.05\%}$) 2_378_438 ($\textcolor{red}{14.20\%}$) 65_508 85_362 ($\textcolor{red}{16.80\%}$) 772_471 ($\textcolor{red}{15.03\%}$) 149_862 ($\textcolor{red}{17.46\%}$)
vector 133_869 ($\textcolor{green}{-0.02\%}$) 1_955_923 ($\textcolor{red}{13.15\%}$) 24_764 139_512 ($\textcolor{red}{15.10\%}$) 187_110 ($\textcolor{red}{14.13\%}$) 181_692 ($\textcolor{red}{12.43\%}$)
vec_rs 409_399 ($\textcolor{red}{1.67\%}$) 288_724 ($\textcolor{red}{8.58\%}$) 655_360 16_506 ($\textcolor{red}{28.71\%}$) 29_924 ($\textcolor{red}{18.50\%}$) 25_060 ($\textcolor{red}{19.24\%}$)

Statistics

SHA-2

binary_size SHA-256 SHA-512 account_id neuron_id
Motoko 170_102 ($\textcolor{green}{-0.01\%}$) 280_779_470 ($\textcolor{red}{6.29\%}$) 252_128_803 ($\textcolor{red}{7.24\%}$) 40_087 ($\textcolor{red}{14.06\%}$) 27_536 ($\textcolor{red}{18.43\%}$)
Rust 497_656 ($\textcolor{red}{1.38\%}$) 82_786_613 ($\textcolor{red}{0.33\%}$) 56_791_668 ($\textcolor{red}{0.47\%}$) 48_173 ($\textcolor{red}{13.62\%}$) 50_633 ($\textcolor{red}{21.72\%}$)

Certified map

binary_size generate 10k max mem inc witness
Motoko 162_004 ($\textcolor{green}{-0.25\%}$) 21_070_785_212 ($\textcolor{red}{13.41\%}$) 3_429_924 2_506_853 ($\textcolor{red}{13.47\%}$) 373_231 ($\textcolor{red}{13.87\%}$)
Rust 441_221 ($\textcolor{red}{1.70\%}$) 6_289_617_411 ($\textcolor{red}{1.33\%}$) 1_081_344 1_000_559 ($\textcolor{red}{1.60\%}$) 298_733 ($\textcolor{red}{3.43\%}$)

Statistics

Basic DAO

binary_size init transfer_token submit_proposal vote_proposal
Motoko 225_837 ($\textcolor{red}{0.01\%}$) 45_603 ($\textcolor{red}{21.63\%}$) 21_407 ($\textcolor{red}{31.20\%}$) 17_883 ($\textcolor{red}{41.32\%}$) 18_884 ($\textcolor{red}{33.89\%}$)
Rust 717_562 ($\textcolor{red}{1.80\%}$) 513_687 ($\textcolor{red}{8.85\%}$) 98_015 ($\textcolor{red}{13.28\%}$) 120_084 ($\textcolor{red}{14.78\%}$) 130_769 ($\textcolor{red}{12.96\%}$)

DIP721 NFT

binary_size init mint_token transfer_token
Motoko 183_957 ($\textcolor{red}{0.04\%}$) 17_062 ($\textcolor{red}{40.07\%}$) 28_552 ($\textcolor{red}{27.93\%}$) 8_523 ($\textcolor{red}{80.96\%}$)
Rust 777_492 ($\textcolor{red}{1.41\%}$) 137_171 ($\textcolor{red}{9.71\%}$) 355_361 ($\textcolor{red}{9.52\%}$) 87_938 ($\textcolor{red}{14.03\%}$)

Statistics

Heartbeat

binary_size heartbeat
Motoko 118_895 ($\textcolor{green}{-0.01\%}$) 22_897 ($\textcolor{red}{209.75\%}$)
Rust 23_571 ($\textcolor{green}{-0.54\%}$) 1_141 ($\textcolor{red}{44.25\%}$)

Timer

binary_size setTimer cancelTimer
Motoko 125_138 ($\textcolor{green}{-0.02\%}$) 50_631 ($\textcolor{red}{232.92\%}$) 4_468 ($\textcolor{red}{166.11\%}$)
Rust 442_744 ($\textcolor{red}{1.82\%}$) 66_012 ($\textcolor{red}{51.61\%}$) 10_664 ($\textcolor{red}{38.80\%}$)

Statistics

Garbage Collection

generate 800k max mem batch_get 50 batch_put 50 batch_remove 50
default 1_215_806_074 ($\textcolor{red}{20.11\%}$) 59_396_776 114 ($\textcolor{red}{128.00\%}$) 114 ($\textcolor{red}{128.00\%}$) 114 ($\textcolor{red}{128.00\%}$)
copying 1_215_805_960 ($\textcolor{red}{20.11\%}$) 59_396_776 1_215_491_834 ($\textcolor{red}{20.08\%}$) 1_215_572_444 ($\textcolor{red}{20.08\%}$) 1_215_496_957 ($\textcolor{red}{20.08\%}$)
compacting 1_834_307_601 ($\textcolor{red}{9.51\%}$) 59_396_776 1_411_814_059 ($\textcolor{red}{9.19\%}$) 1_680_860_803 ($\textcolor{red}{9.70\%}$) 1_710_256_771 ($\textcolor{red}{9.74\%}$)
generational 2_794_228_637 ($\textcolor{red}{11.01\%}$) 59_405_240 1_094_958_722 ($\textcolor{red}{12.01\%}$) 1_174_513 ($\textcolor{red}{11.56\%}$) 1_078_927 ($\textcolor{red}{11.53\%}$)
incremental 33_436_361 ($\textcolor{red}{3.45\%}$) 1_136_153_832 316_255_910 ($\textcolor{red}{8.96\%}$) 319_190_456 ($\textcolor{red}{8.96\%}$) 319_220_154 ($\textcolor{red}{8.96\%}$)

Actor class

binary size put new bucket put existing bucket get
Map 254_118 ($\textcolor{red}{0.02\%}$) 696_682 ($\textcolor{red}{9.09\%}$) 15_782 ($\textcolor{red}{254.73\%}$) 16_306 ($\textcolor{red}{232.17\%}$)

Statistics

Publisher & Subscriber

pub_binary_size sub_binary_size subscribe_caller subscribe_callee publish_caller publish_callee
Motoko 139_889 ($\textcolor{red}{0.00\%}$) 126_828 ($\textcolor{red}{0.00\%}$) 28_134 ($\textcolor{red}{92.16\%}$) 11_496 ($\textcolor{red}{36.03\%}$) 22_423 ($\textcolor{red}{112.94\%}$) 6_201 ($\textcolor{red}{69.33\%}$)
Rust 478_999 ($\textcolor{red}{1.45\%}$) 529_026 ($\textcolor{red}{1.75\%}$) 67_722 ($\textcolor{red}{31.27\%}$) 41_362 ($\textcolor{red}{19.33\%}$) 90_152 ($\textcolor{red}{21.55\%}$) 50_438 ($\textcolor{red}{21.20\%}$)

Statistics

github-actions[bot] commented 1 year ago

Note The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. The library names with _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with fixed seed to ensure that all collections contain the same elements, and the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance on larger memory footprint.
  • hashmap uses amortized data structure. When the initial capacity is reached, it has to copy the whole array, thus the cost of batch_put 50 is much higher than other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compare with buffer, put has better worst case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible result.
  • imrc_hashmap_rs uses the im-rc crate, which is the immutable version hashmap in Rust.

Map

binary_size generate 1m max mem batch_get 50 batch_put 50 batch_remove 50
hashmap 133_747 7_926_149_584 61_987_732 325_829 6_271_119_292 350_741
triemap 135_152 13_189_507_728 74_216_052 259_251 631_012 620_421
rbtree 135_826 6_809_377_524 57_995_940 101_646 304_897 317_953
splay 131_671 13_700_785_810 53_995_876 654_897 689_828 958_656
btree 176_191 9_543_512_697 31_103_892 325_152 447_533 498_944
zhenya_hashmap 141_675 3_049_829_583 65_987_480 77_261 94_211 110_378
btreemap_rs 419_497 1_786_796_287 13_762_560 74_944 124_024 90_045
imrc_hashmap_rs 419_185 2_553_471_915 122_454_016 37_685 176_853 110_291
hashmap_rs 412_968 428_765_280 36_536_320 21_020 26_013 24_296

Priority queue

binary_size heapify 1m max mem pop_min 50 put 50
heap 127_685 5_564_624_316 29_995_836 608_968 223_368 580_182
heap_rs 410_604 140_169_329 9_109_504 58_401 22_633 58_632

Growable array

binary_size generate 5k max mem batch_get 500 batch_put 500 batch_remove 500
buffer 135_394 2_378_438 65_508 85_362 772_471 149_862
vector 133_869 1_955_923 24_764 139_512 187_110 181_692
vec_rs 409_399 288_724 655_360 16_506 29_924 25_060

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

binary_size SHA-256 SHA-512 account_id neuron_id
Motoko 170_102 280_779_470 252_128_803 40_087 27_536
Rust 497_656 82_786_613 56_791_668 48_173 50_633

Certified map

binary_size generate 10k max mem inc witness
Motoko 162_004 21_070_785_212 3_429_924 2_506_853 373_231
Rust 441_221 6_289_617_411 1_081_344 1_000_559 298_733

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

binary_size init transfer_token submit_proposal vote_proposal
Motoko 225_837 45_603 21_407 17_883 18_884
Rust 717_562 513_687 98_015 120_084 130_769

DIP721 NFT

binary_size init mint_token transfer_token
Motoko 183_957 17_062 28_552 8_523
Rust 777_492 137_171 355_361 87_938

Heartbeat / Timer

Measure the cost of empty heartbeat and timer job.

Heartbeat

binary_size heartbeat
Motoko 118_895 22_897
Rust 23_571 1_141

Timer

binary_size setTimer cancelTimer
Motoko 125_138 50_631 4_468
Rust 442_744 66_012 10_664

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

generate 800k max mem batch_get 50 batch_put 50 batch_remove 50
default 1_215_806_074 59_396_776 114 114 114
copying 1_215_805_960 59_396_776 1_215_491_834 1_215_572_444 1_215_496_957
compacting 1_834_307_601 59_396_776 1_411_814_059 1_680_860_803 1_710_256_771
generational 2_794_228_637 59_405_240 1_094_958_722 1_174_513 1_078_927
incremental 33_436_361 1_136_153_832 316_255_910 319_190_456 319_220_154

Actor class

binary size put new bucket put existing bucket get
Map 254_118 696_682 15_782 16_306

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

pub_binary_size sub_binary_size subscribe_caller subscribe_callee publish_caller publish_callee
Motoko 139_889 126_828 28_134 11_496 22_423 6_201
Rust 478_999 529_026 67_722 41_362 90_152 50_438