dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Bump dependencies #89

Closed chenyan-dfinity closed 11 months ago

chenyan-dfinity commented 11 months ago
github-actions[bot] commented 11 months ago

Note: The tables below diff the performance results against the published results from the main branch. Unchanged benchmarks are omitted.
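
The percentages in parentheses are presumably relative changes against the main-branch baseline. As a minimal sketch of that arithmetic (the helper names `parse_count` and `percent_change` and the sample values are hypothetical, not taken from the repository's tooling):

```rust
/// Parses a figure written with `_` digit separators, e.g. "138_275".
fn parse_count(s: &str) -> Option<u64> {
    s.replace('_', "").parse().ok()
}

/// Relative change of `new` versus `old`, in percent.
/// Negative means the new value is smaller (an improvement for cost metrics).
fn percent_change(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

fn main() {
    // Hypothetical baseline and new measurement, for illustration only.
    let old = parse_count("200_000").expect("valid number");
    let new = parse_count("150_000").expect("valid number");
    println!("{:.2}%", percent_change(old, new));
}
```

A baseline of zero would make the division meaningless, so a real diff tool would need to handle that case separately.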

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|:--|--:|--:|--:|--:|--:|--:|
| hashmap | 138_275 | 6_974_058_129 | 61_987_732 | 288_202 | 5_527_868_856 | 309_728 |
| triemap | 139_765 | 11_432_083_637 | 74_216_052 | 222_825 | 547_701 | 539_052 |
| rbtree | 140_562 | 5_979_229_508 | 57_995_940 | 88_905 | 268_573 | 278_352 |
| splay | 136_342 | 11_568_250_621 | 53_995_876 | 551_926 | 581_651 | 810_220 |
| btree | 181_449 | 8_224_241_444 | 31_103_892 | 277_542 | 384_171 | 429_041 |
| zhenya_hashmap | 153_793 ($\textcolor{red}{4.99\%}$) | 2_201_621_425 ($\textcolor{green}{-16.42\%}$) | 22_772_980 ($\textcolor{green}{-65.49\%}$) | 48_627 ($\textcolor{green}{-25.64\%}$) | 61_839 ($\textcolor{green}{-22.90\%}$) | 70_872 ($\textcolor{green}{-25.26\%}$) |
| btreemap_rs | 418_496 ($\textcolor{green}{-0.37\%}$) | 1_654_114_123 ($\textcolor{green}{-0.00\%}$) | 13_762_560 | 66_828 ($\textcolor{green}{-0.09\%}$) | 112_500 ($\textcolor{green}{-0.06\%}$) | 81_246 ($\textcolor{green}{-0.08\%}$) |
| imrc_hashmap_rs | 418_054 ($\textcolor{green}{-0.40\%}$) | 2_386_381_040 ($\textcolor{green}{-0.00\%}$) | 122_454_016 | 32_841 ($\textcolor{green}{-0.19\%}$) | 162_760 ($\textcolor{green}{-0.04\%}$) | 98_464 ($\textcolor{green}{-0.06\%}$) |
| hashmap_rs | 411_843 ($\textcolor{green}{-0.41\%}$) | 402_296_785 ($\textcolor{green}{-0.00\%}$) | 36_536_320 | 16_635 ($\textcolor{green}{-0.37\%}$) | 21_539 ($\textcolor{green}{-0.29\%}$) | 19_990 ($\textcolor{green}{-0.31\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 |
|:--|--:|--:|--:|--:|--:|
| heap | 132_227 | 4_684_517_324 | 29_995_836 | 511_499 | 186_465 |
| heap_rs | 409_392 ($\textcolor{green}{-0.43\%}$) | 123_102_351 ($\textcolor{green}{-0.00\%}$) | 9_109_504 | 53_320 ($\textcolor{green}{-0.12\%}$) | 18_140 ($\textcolor{green}{-0.34\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|:--|--:|--:|--:|--:|--:|--:|
| buffer | 139_908 | 2_082_623 | 65_508 | 73_092 | 671_517 | 127_592 |
| vector | 138_344 | 1_728_571 | 24_764 | 121_219 | 163_947 | 161_609 |
| vec_rs | 408_280 ($\textcolor{green}{-0.41\%}$) | 265_791 ($\textcolor{green}{-0.02\%}$) | 655_360 | 12_840 ($\textcolor{green}{-0.48\%}$) | 25_269 ($\textcolor{green}{-0.24\%}$) | 21_153 ($\textcolor{green}{-0.29\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|:--|--:|--:|--:|--:|--:|
| Motoko | 172_890 | 247_480_401 | 228_033_044 | 30_017 | 20_760 |
| Rust | 497_739 ($\textcolor{green}{-0.11\%}$) | 82_511_907 ($\textcolor{green}{-0.00\%}$) | 56_525_950 ($\textcolor{green}{-0.00\%}$) | 42_406 ($\textcolor{green}{-0.34\%}$) | 44_341 ($\textcolor{green}{-0.52\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|:--|--:|--:|--:|--:|--:|
| Motoko | 176_829 | 4_390_018_085 | 3_429_924 | 519_711 | 327_767 |
| Rust | 440_119 ($\textcolor{green}{-0.38\%}$) | 6_202_162_996 ($\textcolor{green}{-0.00\%}$) | 1_081_344 | 983_928 ($\textcolor{red}{0.00\%}$) | 288_414 ($\textcolor{green}{-0.02\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|:--|--:|--:|--:|--:|--:|
| Motoko | 230_182 | 37_638 ($\textcolor{red}{0.19\%}$) | 16_286 ($\textcolor{red}{0.21\%}$) | 12_712 ($\textcolor{red}{0.28\%}$) | 14_184 ($\textcolor{red}{0.48\%}$) |
| Rust | 713_090 ($\textcolor{green}{-0.74\%}$) | 469_329 ($\textcolor{green}{-0.65\%}$) | 86_401 ($\textcolor{green}{-0.44\%}$) | 104_729 ($\textcolor{green}{-0.51\%}$) | 115_792 ($\textcolor{green}{-0.38\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|:--|--:|--:|--:|--:|
| Motoko | 188_321 | 12_267 | 22_357 | 4_729 |
| Rust | 776_901 ($\textcolor{green}{-0.18\%}$) | 124_454 ($\textcolor{green}{-0.67\%}$) | 325_566 ($\textcolor{red}{0.17\%}$) | 80_361 ($\textcolor{red}{3.69\%}$) |

Heartbeat

| | binary_size | heartbeat |
|:--|--:|--:|
| Motoko | 123_357 | 7_399 |
| Rust | 23_625 | 469 ($\textcolor{green}{-40.25\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|:--|--:|--:|--:|
| Motoko | 129_636 | 15_227 | 1_684 |
| Rust | 442_239 ($\textcolor{green}{-0.25\%}$) | 43_295 ($\textcolor{green}{-0.28\%}$) | 7_521 ($\textcolor{red}{0.32\%}$) |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|:--|--:|--:|--:|--:|
| Map | 261_335 ($\textcolor{green}{-0.10\%}$) | 654_501 ($\textcolor{green}{-0.24\%}$) | 4_459 | 4_919 |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|:--|--:|--:|--:|--:|--:|--:|
| Motoko | 144_439 | 131_299 | 14_651 | 8_456 | 10_539 | 3_669 |
| Rust | 476_238 ($\textcolor{green}{-0.70\%}$) | 525_832 ($\textcolor{green}{-0.72\%}$) | 51_355 ($\textcolor{green}{-0.72\%}$) | 34_433 ($\textcolor{green}{-0.47\%}$) | 74_154 ($\textcolor{green}{-0.92\%}$) | 44_068 ($\textcolor{green}{-0.66\%}$) |

github-actions[bot] commented 11 months ago

Note: The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Libraries whose names end with the _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and the cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance with a larger memory footprint.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which behaves the same as std::collections::HashMap but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
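
The reproducibility points above (a fixed-seed generator and a deterministic hasher) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the benchmark's code: the hand-written FNV-1a hasher stands in for fxhash, and `XorShift64`, `DeterministicMap`, and `fill` are hypothetical names.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-rolled FNV-1a hasher (a stand-in for fxhash): it carries no
/// per-process random state, so the same key hashes the same in every run.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// std HashMap with the deterministic hasher plugged in.
type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Xorshift64 generator: the same nonzero seed always yields the same sequence.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fills a map with `n` pseudo-random keys derived from `seed`.
fn fill(seed: u64, n: usize) -> DeterministicMap<u64, u64> {
    let mut rng = XorShift64(seed);
    let mut map = DeterministicMap::default();
    for _ in 0..n {
        let k = rng.next_u64();
        map.insert(k, k);
    }
    map
}

fn main() {
    // Two runs with the same seed build identical collections.
    assert_eq!(fill(42, 1_000), fill(42, 1_000));
    println!("same seed, same contents");
}
```

With the default `RandomState` hasher the contents would still match, but the bucket layout, and therefore the measured cost, could differ between runs; removing that variation is the point of a deterministic hasher.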

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|:--|--:|--:|--:|--:|--:|--:|
| hashmap | 138_275 | 6_974_058_129 | 61_987_732 | 288_202 | 5_527_868_856 | 309_728 |
| triemap | 139_765 | 11_432_083_637 | 74_216_052 | 222_825 | 547_701 | 539_052 |
| rbtree | 140_562 | 5_979_229_508 | 57_995_940 | 88_905 | 268_573 | 278_352 |
| splay | 136_342 | 11_568_250_621 | 53_995_876 | 551_926 | 581_651 | 810_220 |
| btree | 181_449 | 8_224_241_444 | 31_103_892 | 277_542 | 384_171 | 429_041 |
| zhenya_hashmap | 153_793 | 2_201_621_425 | 22_772_980 | 48_627 | 61_839 | 70_872 |
| btreemap_rs | 418_496 | 1_654_114_123 | 13_762_560 | 66_828 | 112_500 | 81_246 |
| imrc_hashmap_rs | 418_054 | 2_386_381_040 | 122_454_016 | 32_841 | 162_760 | 98_464 |
| hashmap_rs | 411_843 | 402_296_785 | 36_536_320 | 16_635 | 21_539 | 19_990 |
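
The outsized batch_put 50 figure in the hashmap row matches the takeaway about copying the whole array once capacity is reached. A minimal sketch of that effect follows, assuming a doubling growth policy (the source does not state the actual policy, and `copies_per_push` is a hypothetical helper): most insertions copy nothing, but the one that triggers a resize copies every existing element.

```rust
/// Returns the number of element copies caused by each of `n` pushes
/// into an array that doubles its capacity whenever it is full.
fn copies_per_push(initial_capacity: usize, n: usize) -> Vec<usize> {
    let mut capacity = initial_capacity.max(1);
    let mut len = 0usize;
    let mut copies = Vec::with_capacity(n);
    for _ in 0..n {
        if len == capacity {
            // Full: allocate a larger array and copy all existing elements.
            copies.push(len);
            capacity *= 2;
        } else {
            copies.push(0);
        }
        len += 1;
    }
    copies
}

fn main() {
    let copies = copies_per_push(4, 10);
    println!("{:?}", copies); // prints [0, 0, 0, 0, 4, 0, 0, 0, 8, 0]
    let total: usize = copies.iter().sum();
    println!("total copies: {total}"); // prints total copies: 12
}
```

Averaged over many insertions the cost per push stays constant, but a short batch that happens to cross a capacity boundary on a large collection pays for the full copy at once.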

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | |
|:--|--:|--:|--:|--:|--:|--:|
| heap | 132_227 | 4_684_517_324 | 29_995_836 | 511_499 | 186_465 | 487_206 |
| heap_rs | 409_392 | 123_102_351 | 9_109_504 | 53_320 | 18_140 | 53_545 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|:--|--:|--:|--:|--:|--:|--:|
| buffer | 139_908 | 2_082_623 | 65_508 | 73_092 | 671_517 | 127_592 |
| vector | 138_344 | 1_728_571 | 24_764 | 121_219 | 163_947 | 161_609 |
| vec_rs | 408_280 | 265_791 | 655_360 | 12_840 | 25_269 | 21_153 |

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|:--|--:|--:|--:|--:|--:|
| Motoko | 172_890 | 247_480_401 | 228_033_044 | 30_017 | 20_760 |
| Rust | 497_739 | 82_511_907 | 56_525_950 | 42_406 | 44_341 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|:--|--:|--:|--:|--:|--:|
| Motoko | 176_829 | 4_390_018_085 | 3_429_924 | 519_711 | 327_767 |
| Rust | 440_119 | 6_202_162_996 | 1_081_344 | 983_928 | 288_414 |

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators, but it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|:--|--:|--:|--:|--:|--:|
| Motoko | 230_182 | 37_638 | 16_286 | 12_712 | 14_184 |
| Rust | 713_090 | 469_329 | 86_401 | 104_729 | 115_792 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|:--|--:|--:|--:|--:|
| Motoko | 188_321 | 12_267 | 22_357 | 4_729 |
| Rust | 776_901 | 124_454 | 325_566 | 80_361 |

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|:--|--:|--:|
| Motoko | 123_357 | 7_399 |
| Rust | 23_625 | 469 |

Timer

| | binary_size | setTimer | cancelTimer |
|:--|--:|--:|--:|
| Motoko | 129_636 | 15_227 | 1_684 |
| Rust | 442_239 | 43_295 | 7_521 |

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|:--|--:|--:|--:|--:|--:|
| default | 886_040_881 | 51_991_272 | 50 | 50 | 50 |
| copying | 886_040_831 | 51_991_272 | 886_021_215 | 886_090_303 | 886_023_374 |
| compacting | 1_465_250_036 | 51_991_272 | 1_131_730_142 | 1_337_769_727 | 1_364_175_167 |
| generational | 2_184_686_556 | 51_999_736 | 855_706_700 | 1_058_853 | 948_937 |
| incremental | 28_518_084 | 985_883_928 | 290_276_117 | 292_998_383 | 292_988_797 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|:--|--:|--:|--:|--:|
| Map | 261_335 | 654_501 | 4_459 | 4_919 |

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|:--|--:|--:|--:|--:|--:|--:|
| Motoko | 144_439 | 131_299 | 14_651 | 8_456 | 10_539 | 3_669 |
| Rust | 476_238 | 525_832 | 51_355 | 34_433 | 74_154 | 44_068 |