dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0
21 stars 8 forks source link

Function cost set to 5 #76

Closed chenyan-dfinity closed 1 year ago

github-actions[bot] commented 1 year ago

Note Diffing the performance result against the published result from main branch. Unchanged benchmarks are omitted.

Map

binary_size generate 1m max mem batch_get 50 batch_put 50 batch_remove 50
hashmap 133_756 ($\textcolor{green}{-0.05\%}$) 8_456_787_008 ($\textcolor{red}{21.50\%}$) 61_987_732 346_249 ($\textcolor{red}{20.45\%}$) 6_615_757_444 ($\textcolor{red}{19.94\%}$) 373_657 ($\textcolor{red}{20.94\%}$)
triemap 135_161 ($\textcolor{green}{-0.11\%}$) 14_978_845_324 ($\textcolor{red}{31.04\%}$) 74_216_052 297_735 ($\textcolor{red}{33.65\%}$) 722_224 ($\textcolor{red}{31.88\%}$) 709_041 ($\textcolor{red}{31.55\%}$)
rbtree 135_844 ($\textcolor{green}{-0.20\%}$) 7_404_914_600 ($\textcolor{red}{23.84\%}$) 57_995_940 108_390 ($\textcolor{red}{21.92\%}$) 332_553 ($\textcolor{red}{23.82\%}$) 352_845 ($\textcolor{red}{26.77\%}$)
splay 131_680 ($\textcolor{green}{-0.14\%}$) 15_209_722_290 ($\textcolor{red}{31.48\%}$) 53_995_876 729_085 ($\textcolor{red}{32.10\%}$) 768_032 ($\textcolor{red}{32.04\%}$) 1_063_884 ($\textcolor{red}{31.31\%}$)
btree 176_210 ($\textcolor{green}{-0.14\%}$) 10_885_917_477 ($\textcolor{red}{32.36\%}$) 31_103_892 369_256 ($\textcolor{red}{33.05\%}$) 512_113 ($\textcolor{red}{33.31\%}$) 570_228 ($\textcolor{red}{32.91\%}$)
zhenya_hashmap 141_692 ($\textcolor{green}{-0.01\%}$) 3_304_768_283 ($\textcolor{red}{25.51\%}$) 65_987_480 83_813 ($\textcolor{red}{28.27\%}$) 103_431 ($\textcolor{red}{29.04\%}$) 122_290 ($\textcolor{red}{29.06\%}$)
btreemap_rs 419_571 ($\textcolor{red}{1.47\%}$) 1_800_692_107 ($\textcolor{red}{9.15\%}$) 13_762_560 75_532 ($\textcolor{red}{13.05\%}$) 125_504 ($\textcolor{red}{11.79\%}$) 91_329 ($\textcolor{red}{12.39\%}$)
imrc_hashmap_rs 419_256 ($\textcolor{red}{1.37\%}$) 2_566_397_171 ($\textcolor{red}{7.57\%}$) 122_454_016 38_485 ($\textcolor{red}{17.17\%}$) 178_181 ($\textcolor{red}{9.50\%}$) 115_627 ($\textcolor{red}{17.39\%}$)
hashmap_rs 413_039 ($\textcolor{red}{1.71\%}$) 432_766_256 ($\textcolor{red}{10.23\%}$) 36_536_320 21_628 ($\textcolor{red}{31.09\%}$) 26_813 ($\textcolor{red}{28.52\%}$) 24_896 ($\textcolor{red}{24.65\%}$)

Priority queue

binary_size heapify 1m max mem pop_min 50 put 50
heap 127_692 ($\textcolor{green}{-0.04\%}$) 6_314_372_552 ($\textcolor{red}{34.79\%}$) 29_995_836 692_956 ($\textcolor{red}{35.48\%}$) 255_044 ($\textcolor{red}{36.78\%}$)
heap_rs 410_675 ($\textcolor{red}{1.67\%}$) 140_170_301 ($\textcolor{red}{13.86\%}$) 9_109_504 59_205 ($\textcolor{red}{11.04\%}$) 23_229 ($\textcolor{red}{28.07\%}$)

Growable array

binary_size generate 5k max mem batch_get 500 batch_put 500 batch_remove 500
buffer 135_403 ($\textcolor{green}{-0.04\%}$) 2_751_546 ($\textcolor{red}{32.12\%}$) 65_508 103_498 ($\textcolor{red}{41.61\%}$) 878_959 ($\textcolor{red}{30.89\%}$) 171_998 ($\textcolor{red}{34.81\%}$)
vector 133_879 ($\textcolor{green}{-0.02\%}$) 2_384_703 ($\textcolor{red}{37.96\%}$) 24_764 165_680 ($\textcolor{red}{36.68\%}$) 227_918 ($\textcolor{red}{39.02\%}$) 218_436 ($\textcolor{red}{35.17\%}$)
vec_rs 409_470 ($\textcolor{red}{1.69\%}$) 289_504 ($\textcolor{red}{8.88\%}$) 655_360 17_114 ($\textcolor{red}{33.45\%}$) 30_524 ($\textcolor{red}{20.87\%}$) 25_660 ($\textcolor{red}{22.10\%}$)

Statistics

SHA-2

binary_size SHA-256 SHA-512 account_id neuron_id
Motoko 170_111 ($\textcolor{green}{-0.00\%}$) 307_625_902 ($\textcolor{red}{16.46\%}$) 277_653_343 ($\textcolor{red}{18.10\%}$) 43_219 ($\textcolor{red}{22.98\%}$) 29_620 ($\textcolor{red}{27.40\%}$)
Rust 497_732 ($\textcolor{red}{1.40\%}$) 82_788_057 ($\textcolor{red}{0.33\%}$) 56_793_048 ($\textcolor{red}{0.47\%}$) 49_501 ($\textcolor{red}{16.76\%}$) 52_193 ($\textcolor{red}{25.47\%}$)

Certified map

binary_size generate 10k max mem inc witness
Motoko 162_022 ($\textcolor{green}{-0.24\%}$) 23_978_863_464 ($\textcolor{red}{29.06\%}$) 3_429_924 2_853_149 ($\textcolor{red}{29.14\%}$) 425_671 ($\textcolor{red}{29.87\%}$)
Rust 441_297 ($\textcolor{red}{1.72\%}$) 6_312_527_623 ($\textcolor{red}{1.70\%}$) 1_081_344 1_004_499 ($\textcolor{red}{2.00\%}$) 301_497 ($\textcolor{red}{4.38\%}$)

Statistics

Basic DAO

binary_size init transfer_token submit_proposal vote_proposal
Motoko 225_863 ($\textcolor{red}{0.03\%}$) 47_763 ($\textcolor{red}{27.39\%}$) 22_599 ($\textcolor{red}{38.51\%}$) 18_949 ($\textcolor{red}{49.75\%}$) 20_115 ($\textcolor{red}{42.62\%}$)
Rust 717_659 ($\textcolor{red}{1.81\%}$) 532_119 ($\textcolor{red}{12.76\%}$) 101_647 ($\textcolor{red}{17.48\%}$) 124_976 ($\textcolor{red}{19.46\%}$) 135_521 ($\textcolor{red}{17.07\%}$)

DIP721 NFT

binary_size init mint_token transfer_token
Motoko 183_977 ($\textcolor{red}{0.05\%}$) 17_658 ($\textcolor{red}{44.96\%}$) 29_720 ($\textcolor{red}{33.16\%}$) 8_835 ($\textcolor{red}{87.58\%}$)
Rust 777_603 ($\textcolor{red}{1.42\%}$) 141_523 ($\textcolor{red}{13.19\%}$) 368_093 ($\textcolor{red}{13.44\%}$) 91_142 ($\textcolor{red}{18.19\%}$)

Statistics

Heartbeat

binary_size heartbeat
Motoko 118_901 ($\textcolor{green}{-0.01\%}$) 23_265 ($\textcolor{red}{214.73\%}$)
Rust 23_573 ($\textcolor{green}{-0.53\%}$) 1_165 ($\textcolor{red}{47.28\%}$)

Timer

binary_size setTimer cancelTimer
Motoko 125_145 ($\textcolor{green}{-0.02\%}$) 51_903 ($\textcolor{red}{241.29\%}$) 4_624 ($\textcolor{red}{175.40\%}$)
Rust 442_822 ($\textcolor{red}{1.83\%}$) 67_944 ($\textcolor{red}{56.05\%}$) 11_048 ($\textcolor{red}{43.80\%}$)

Statistics

Garbage Collection

generate 800k max mem batch_get 50 batch_put 50 batch_remove 50
default 1_338_231_390 ($\textcolor{red}{32.20\%}$) 59_396_776 118 ($\textcolor{red}{136.00\%}$) 118 ($\textcolor{red}{136.00\%}$) 118 ($\textcolor{red}{136.00\%}$)
copying 1_338_231_272 ($\textcolor{red}{32.20\%}$) 59_396_776 1_337_913_554 ($\textcolor{red}{32.17\%}$) 1_338_002_356 ($\textcolor{red}{32.17\%}$) 1_337_919_129 ($\textcolor{red}{32.17\%}$)
compacting 1_911_420_593 ($\textcolor{red}{14.11\%}$) 59_396_776 1_473_824_171 ($\textcolor{red}{13.99\%}$) 1_756_485_051 ($\textcolor{red}{14.63\%}$) 1_787_369_939 ($\textcolor{red}{14.69\%}$)
generational 2_918_248_625 ($\textcolor{red}{15.94\%}$) 59_405_240 1_141_865_950 ($\textcolor{red}{16.81\%}$) 1_228_537 ($\textcolor{red}{16.69\%}$) 1_127_895 ($\textcolor{red}{16.59\%}$)
incremental 33_436_705 ($\textcolor{red}{3.45\%}$) 1_136_153_832 333_734_166 ($\textcolor{red}{14.98\%}$) 336_829_512 ($\textcolor{red}{14.98\%}$) 336_860_690 ($\textcolor{red}{14.98\%}$)

Actor class

binary size put new bucket put existing bucket get
Map 254_132 ($\textcolor{red}{0.02\%}$) 699_418 ($\textcolor{red}{9.52\%}$) 16_214 ($\textcolor{red}{264.44\%}$) 16_734 ($\textcolor{red}{240.88\%}$)

Statistics

Publisher & Subscriber

pub_binary_size sub_binary_size subscribe_caller subscribe_callee publish_caller publish_callee
Motoko 139_898 ($\textcolor{red}{0.01\%}$) 126_836 ($\textcolor{red}{0.01\%}$) 28_710 ($\textcolor{red}{96.09\%}$) 11_932 ($\textcolor{red}{41.19\%}$) 22_995 ($\textcolor{red}{118.38\%}$) 6_405 ($\textcolor{red}{74.90\%}$)
Rust 479_085 ($\textcolor{red}{1.47\%}$) 529_109 ($\textcolor{red}{1.77\%}$) 69_806 ($\textcolor{red}{35.31\%}$) 42_858 ($\textcolor{red}{23.65\%}$) 93_140 ($\textcolor{red}{25.58\%}$) 52_290 ($\textcolor{red}{25.65\%}$)

Statistics

github-actions[bot] commented 1 year ago

Note The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. The library names with _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with fixed seed to ensure that all collections contain the same elements, and the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance on larger memory footprint.
  • hashmap uses amortized data structure. When the initial capacity is reached, it has to copy the whole array, thus the cost of batch_put 50 is much higher than other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compare with buffer, put has better worst case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible result.
  • imrc_hashmap_rs uses the im-rc crate, which is the immutable version hashmap in Rust.

Map

binary_size generate 1m max mem batch_get 50 batch_put 50 batch_remove 50
hashmap 133_756 8_456_787_008 61_987_732 346_249 6_615_757_444 373_657
triemap 135_161 14_978_845_324 74_216_052 297_735 722_224 709_041
rbtree 135_844 7_404_914_600 57_995_940 108_390 332_553 352_845
splay 131_680 15_209_722_290 53_995_876 729_085 768_032 1_063_884
btree 176_210 10_885_917_477 31_103_892 369_256 512_113 570_228
zhenya_hashmap 141_692 3_304_768_283 65_987_480 83_813 103_431 122_290
btreemap_rs 419_571 1_800_692_107 13_762_560 75_532 125_504 91_329
imrc_hashmap_rs 419_256 2_566_397_171 122_454_016 38_485 178_181 115_627
hashmap_rs 413_039 432_766_256 36_536_320 21_628 26_813 24_896

Priority queue

binary_size heapify 1m max mem pop_min 50 put 50
heap 127_692 6_314_372_552 29_995_836 692_956 255_044 660_190
heap_rs 410_675 140_170_301 9_109_504 59_205 23_229 59_428

Growable array

binary_size generate 5k max mem batch_get 500 batch_put 500 batch_remove 500
buffer 135_403 2_751_546 65_508 103_498 878_959 171_998
vector 133_879 2_384_703 24_764 165_680 227_918 218_436
vec_rs 409_470 289_504 655_360 17_114 30_524 25_660

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

binary_size SHA-256 SHA-512 account_id neuron_id
Motoko 170_111 307_625_902 277_653_343 43_219 29_620
Rust 497_732 82_788_057 56_793_048 49_501 52_193

Certified map

binary_size generate 10k max mem inc witness
Motoko 162_022 23_978_863_464 3_429_924 2_853_149 425_671
Rust 441_297 6_312_527_623 1_081_344 1_004_499 301_497

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

binary_size init transfer_token submit_proposal vote_proposal
Motoko 225_863 47_763 22_599 18_949 20_115
Rust 717_659 532_119 101_647 124_976 135_521

DIP721 NFT

binary_size init mint_token transfer_token
Motoko 183_977 17_658 29_720 8_835
Rust 777_603 141_523 368_093 91_142

Heartbeat / Timer

Measure the cost of empty heartbeat and timer job.

Heartbeat

binary_size heartbeat
Motoko 118_901 23_265
Rust 23_573 1_165

Timer

binary_size setTimer cancelTimer
Motoko 125_145 51_903 4_624
Rust 442_822 67_944 11_048

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

generate 800k max mem batch_get 50 batch_put 50 batch_remove 50
default 1_338_231_390 59_396_776 118 118 118
copying 1_338_231_272 59_396_776 1_337_913_554 1_338_002_356 1_337_919_129
compacting 1_911_420_593 59_396_776 1_473_824_171 1_756_485_051 1_787_369_939
generational 2_918_248_625 59_405_240 1_141_865_950 1_228_537 1_127_895
incremental 33_436_705 1_136_153_832 333_734_166 336_829_512 336_860_690

Actor class

binary size put new bucket put existing bucket get
Map 254_132 699_418 16_214 16_734

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

pub_binary_size sub_binary_size subscribe_caller subscribe_callee publish_caller publish_callee
Motoko 139_898 126_836 28_710 11_932 22_995 6_405
Rust 479_085 529_109 69_806 42_858 93_140 52_290