dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

bump ic-repl #96

Closed chenyan-dfinity closed 1 year ago

chenyan-dfinity commented 1 year ago
github-actions[bot] commented 1 year ago

Note: The tables below diff the performance results against the published results from the main branch. Unchanged benchmarks are omitted.
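As a rough illustration of how a cell in these diff tables could be produced, the Rust sketch below computes the percentage change of a measurement against a baseline and renders it next to the value. The function names and the baseline numbers are hypothetical; the diffing script the bot actually uses is not shown in this thread.

```rust
/// Relative change of `current` against `baseline`, in percent.
fn percent_change(baseline: u64, current: u64) -> f64 {
    (current as f64 - baseline as f64) / baseline as f64 * 100.0
}

/// Formats a count with `_` as thousands separator, e.g. 160221 -> "160_221".
fn with_underscores(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator whenever the remaining digit count is a multiple of 3.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push('_');
        }
        out.push(ch);
    }
    out
}

/// Renders one cell: the value alone if unchanged, otherwise the value plus the signed change.
fn render_cell(baseline: u64, current: u64) -> String {
    if baseline == current {
        return with_underscores(current);
    }
    format!(
        "{} ({:.2}%)",
        with_underscores(current),
        percent_change(baseline, current)
    )
}

fn main() {
    // Hypothetical baselines: the main-branch numbers are not part of this report.
    println!("{}", render_cell(1_000, 1_100));
    println!("{}", render_cell(2_000, 1_900));
    println!("{}", render_cell(500, 500));
}
```

With two decimal places, a very small nonzero change is rounded to zero in the printed percentage, which may be why some cells in the tables carry a zero diff instead of being left bare.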

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 160_221 ($\textcolor{red}{0.12\%}$) | 6_984_044_999 ($\textcolor{red}{0.00\%}$) | 61_987_852 ($\textcolor{red}{0.00\%}$) | 288_670 | 5_536_856_410 ($\textcolor{green}{-0.00\%}$) | 310_195 | 9_128_784_003 ($\textcolor{red}{0.00\%}$) |
| triemap | 163_474 ($\textcolor{red}{0.12\%}$) | 11_463_655_150 ($\textcolor{green}{-0.00\%}$) | 74_216_172 ($\textcolor{red}{0.00\%}$) | 222_926 | 549_435 | 540_205 | 13_075_158_546 ($\textcolor{red}{0.00\%}$) |
| rbtree | 158_149 ($\textcolor{red}{0.12\%}$) | 5_979_229_900 ($\textcolor{green}{-0.00\%}$) | 57_996_060 ($\textcolor{red}{0.00\%}$) | 88_905 | 268_573 | 278_352 ($\textcolor{red}{0.00\%}$) | 5_771_880_608 ($\textcolor{red}{0.00\%}$) |
| splay | 159_956 ($\textcolor{red}{0.12\%}$) | 11_568_250_103 ($\textcolor{green}{-0.00\%}$) | 53_995_996 ($\textcolor{red}{0.00\%}$) | 552_014 | 581_765 | 810_321 | 3_722_474_749 ($\textcolor{red}{0.00\%}$) |
| btree | 187_897 ($\textcolor{red}{0.10\%}$) | 8_224_242_789 ($\textcolor{red}{0.00\%}$) | 31_104_012 ($\textcolor{red}{0.00\%}$) | 277_542 | 384_171 | 429_041 | 2_517_941_583 ($\textcolor{red}{0.00\%}$) |
| zhenya_hashmap | 160_509 ($\textcolor{red}{0.12\%}$) | 2_201_622_562 ($\textcolor{red}{0.00\%}$) | 22_773_100 ($\textcolor{red}{0.00\%}$) | 48_627 | 61_839 | 70_872 | 2_695_448_620 ($\textcolor{red}{0.00\%}$) |
| btreemap_rs | 477_612 ($\textcolor{green}{-3.37\%}$) | 1_651_590_463 ($\textcolor{green}{-0.15\%}$) | 27_590_656 | 66_862 ($\textcolor{green}{-0.04\%}$) | 112_477 ($\textcolor{green}{-0.11\%}$) | 76_234 ($\textcolor{green}{-6.17\%}$) | 2_660_975_747 ($\textcolor{red}{10.82\%}$) |
| imrc_hashmap_rs | 479_773 ($\textcolor{green}{-4.08\%}$) | 2_392_906_831 ($\textcolor{green}{-0.59\%}$) | 244_973_568 | 32_763 ($\textcolor{green}{-0.60\%}$) | 163_245 ($\textcolor{green}{-0.41\%}$) | 98_394 ($\textcolor{green}{-0.20\%}$) | 5_191_575_323 ($\textcolor{green}{-0.35\%}$) |
| hashmap_rs | 467_997 ($\textcolor{green}{-4.10\%}$) | 403_296_648 ($\textcolor{red}{0.00\%}$) | 73_138_176 | 16_851 ($\textcolor{green}{-2.88\%}$) | 21_680 ($\textcolor{red}{0.15\%}$) | 20_263 ($\textcolor{green}{-1.71\%}$) | 1_144_828_025 ($\textcolor{red}{19.55\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 147_638 ($\textcolor{red}{0.13\%}$) | 4_684_519_403 ($\textcolor{red}{0.00\%}$) | 29_995_956 ($\textcolor{red}{0.00\%}$) | 511_505 | 186_471 | 487_225 ($\textcolor{red}{0.00\%}$) | 2_655_609_909 ($\textcolor{red}{0.00\%}$) |
| heap_rs | 463_840 ($\textcolor{green}{-3.72\%}$) | 121_602_221 ($\textcolor{green}{-1.22\%}$) | 18_284_544 | 51_661 ($\textcolor{green}{-3.40\%}$) | 18_245 ($\textcolor{green}{-0.10\%}$) | 51_802 ($\textcolor{green}{-3.39\%}$) | 440_739_988 ($\textcolor{red}{26.28\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 151_004 ($\textcolor{red}{0.12\%}$) | 2_082_623 | 65_644 ($\textcolor{red}{0.09\%}$) | 73_092 | 671_517 | 127_592 | 2_474_639 ($\textcolor{red}{0.26\%}$) |
| vector | 152_551 ($\textcolor{red}{0.12\%}$) | 1_588_260 | 24_580 ($\textcolor{red}{0.24\%}$) | 105_191 | 149_932 | 148_094 | 3_844_445 ($\textcolor{red}{0.17\%}$) |
| vec_rs | 459_655 ($\textcolor{green}{-4.40\%}$) | 265_683 ($\textcolor{red}{0.02\%}$) | 1_310_720 ($\textcolor{green}{-4.76\%}$) | 13_014 ($\textcolor{red}{0.22\%}$) | 25_363 ($\textcolor{red}{0.13\%}$) | 21_247 ($\textcolor{red}{0.15\%}$) | 2_743_831 ($\textcolor{green}{-3.88\%}$) |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 477_612 ($\textcolor{green}{-3.37\%}$) | 70_026_986 ($\textcolor{green}{-0.29\%}$) | 2_555_904 | 57_181 ($\textcolor{green}{-0.05\%}$) | 86_494 ($\textcolor{green}{-0.25\%}$) | 75_309 ($\textcolor{green}{-5.56\%}$) | 113_837_931 ($\textcolor{red}{13.30\%}$) |
| btreemap_stable_rs | 478_668 ($\textcolor{green}{-3.97\%}$) | 4_224_209_849 ($\textcolor{red}{14.91\%}$) | 2_621_440 | 2_528_769 ($\textcolor{red}{15.43\%}$) | 4_605_548 ($\textcolor{red}{14.75\%}$) | 7_817_380 ($\textcolor{red}{15.35\%}$) | 653_359 ($\textcolor{green}{-8.56\%}$) |
| heap_rs | 463_840 ($\textcolor{green}{-3.72\%}$) | 6_139_838 ($\textcolor{green}{-1.21\%}$) | 2_293_760 | 44_362 ($\textcolor{green}{-3.06\%}$) | 18_477 ($\textcolor{green}{-0.10\%}$) | 44_345 ($\textcolor{green}{-3.03\%}$) | 23_149_372 ($\textcolor{red}{26.03\%}$) |
| heap_stable_rs | 451_018 ($\textcolor{green}{-3.99\%}$) | 279_422_369 ($\textcolor{red}{16.24\%}$) | 458_752 | 2_346_843 ($\textcolor{red}{15.12\%}$) | 241_158 ($\textcolor{red}{15.36\%}$) | 2_329_183 ($\textcolor{red}{15.11\%}$) | 653_433 ($\textcolor{green}{-8.54\%}$) |
| vec_rs | 459_655 ($\textcolor{green}{-4.40\%}$) | 2_866_886 ($\textcolor{red}{0.00\%}$) | 2_228_224 ($\textcolor{green}{-2.86\%}$) | 13_014 ($\textcolor{red}{0.22\%}$) | 14_113 ($\textcolor{red}{0.23\%}$) | 13_710 ($\textcolor{red}{0.23\%}$) | 21_249_908 ($\textcolor{red}{28.20\%}$) |
| vec_stable_rs | 446_031 ($\textcolor{green}{-4.16\%}$) | 65_186_210 ($\textcolor{red}{17.27\%}$) | 458_752 | 58_992 ($\textcolor{red}{12.05\%}$) | 77_387 ($\textcolor{red}{14.23\%}$) | 79_383 ($\textcolor{red}{13.99\%}$) | 653_447 ($\textcolor{green}{-8.54\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 173_220 ($\textcolor{red}{0.11\%}$) | 247_480_401 | 228_033_044 | 30_017 | 20_760 |
| Rust | 477_934 ($\textcolor{green}{-4.07\%}$) | 82_511_961 ($\textcolor{red}{0.00\%}$) | 56_525_962 ($\textcolor{green}{-0.00\%}$) | 42_432 ($\textcolor{green}{-0.11\%}$) | 44_430 ($\textcolor{green}{-0.02\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 206_484 ($\textcolor{red}{0.09\%}$) | 4_390_019_361 ($\textcolor{red}{0.00\%}$) | 3_430_044 ($\textcolor{red}{0.00\%}$) | 519_711 | 327_767 | 225_153_243 ($\textcolor{red}{0.00\%}$) |
| Rust | 500_582 ($\textcolor{green}{-4.10\%}$) | 6_199_155_471 ($\textcolor{green}{-0.05\%}$) | 2_228_224 | 983_538 ($\textcolor{green}{-0.05\%}$) | 288_338 ($\textcolor{green}{-0.07\%}$) | 5_816_678_045 ($\textcolor{red}{0.09\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 236_862 ($\textcolor{red}{0.08\%}$) | 497_594 ($\textcolor{red}{1.18\%}$) | 16_248 ($\textcolor{green}{-0.26\%}$) | 12_674 ($\textcolor{red}{0.02\%}$) | 14_115 ($\textcolor{green}{-0.15\%}$) | 128_956 ($\textcolor{red}{5.32\%}$) |
| Rust | 779_543 ($\textcolor{green}{-3.35\%}$) | 548_099 ($\textcolor{red}{1.26\%}$) | 86_604 ($\textcolor{red}{0.64\%}$) | 105_959 ($\textcolor{green}{-1.24\%}$) | 117_903 ($\textcolor{red}{0.72\%}$) | 1_624_526 ($\textcolor{green}{-3.68\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 195_127 ($\textcolor{red}{0.10\%}$) | 472_267 ($\textcolor{red}{1.25\%}$) | 22_357 | 4_729 | 71_602 ($\textcolor{red}{9.13\%}$) |
| Rust | 799_705 ($\textcolor{green}{-2.58\%}$) | 217_270 ($\textcolor{red}{3.42\%}$) | 325_723 ($\textcolor{red}{0.42\%}$) | 78_145 ($\textcolor{green}{-3.55\%}$) | 1_797_952 ($\textcolor{green}{-3.36\%}$) |

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 123_696 ($\textcolor{red}{0.15\%}$) | 7_399 ($\textcolor{red}{96.89\%}$) |
| Rust | 23_839 ($\textcolor{red}{0.05\%}$) | 785 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 129_966 ($\textcolor{red}{0.14\%}$) | 15_227 | 1_684 |
| Rust | 422_512 ($\textcolor{green}{-4.29\%}$) | 43_471 ($\textcolor{red}{0.01\%}$) | 7_569 ($\textcolor{green}{-0.33\%}$) |

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 886_042_039 ($\textcolor{red}{0.00\%}$) | 51_991_392 ($\textcolor{red}{0.00\%}$) | 50 | 50 | 50 |
| copying | 886_041_989 ($\textcolor{red}{0.00\%}$) | 51_991_392 ($\textcolor{red}{0.00\%}$) | 886_022_376 ($\textcolor{red}{0.00\%}$) | 886_091_464 ($\textcolor{red}{0.00\%}$) | 886_024_532 ($\textcolor{red}{0.00\%}$) |
| compacting | 1_465_238_570 ($\textcolor{green}{-0.00\%}$) | 51_991_392 ($\textcolor{red}{0.00\%}$) | 1_131_731_091 ($\textcolor{green}{-0.00\%}$) | 1_337_770_735 ($\textcolor{red}{0.00\%}$) | 1_364_176_230 ($\textcolor{red}{0.00\%}$) |
| generational | 2_184_682_993 ($\textcolor{green}{-0.00\%}$) | 51_999_856 ($\textcolor{red}{0.00\%}$) | 855_707_553 | 1_057_808 ($\textcolor{red}{0.00\%}$) | 947_924 ($\textcolor{red}{0.01\%}$) |
| incremental | 28_518_613 | 985_885_652 ($\textcolor{red}{0.00\%}$) | 290_276_212 | 292_998_697 | 292_988_797 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 261_665 ($\textcolor{red}{0.07\%}$) | 654_501 | 4_459 | 4_919 |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 144_769 ($\textcolor{red}{0.13\%}$) | 131_630 ($\textcolor{red}{0.14\%}$) | 14_651 | 8_456 | 10_539 | 3_669 |
| Rust | 457_215 ($\textcolor{green}{-4.23\%}$) | 510_788 ($\textcolor{green}{-3.10\%}$) | 51_624 ($\textcolor{red}{0.25\%}$) | 34_412 ($\textcolor{green}{-0.21\%}$) | 74_396 ($\textcolor{red}{0.24\%}$) | 44_011 ($\textcolor{green}{-0.27\%}$) |

github-actions[bot] commented 1 year ago

Note: The flamegraph links only work after this PR is merged. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko. The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, so the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and the cycle limit, we cannot profile computations with very large collections.
  • The upgrade column uses Candid for serializing stable data. In Rust, you may get a lower cycle cost by using a different serialization format. Another source of slowdown in Rust is that ic-stable-structures tends to be slower than the region memory in Motoko.
  • Different libraries persist data across upgrades in different ways, which fall into three main categories:
    • Use stable variable directly in Motoko: zhenya_hashmap, btree, vector
    • Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vec_rs
    • Use pre/post-upgrade hooks to convert data into an array: hashmap, splay, triemap, buffer, imrc_hashmap_rs
  • The stable benchmarks are much more expensive than their non-stable counterparts, because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which behaves like std::collections::HashMap but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
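The reproducibility points above (a fixed-seed generator and a deterministic hasher) can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the benchmark's code: `Fnv1a` is a hand-written FNV-1a hasher standing in for the one provided by fxhash, and `XorShift64`, `build`, and `iteration_order` are hypothetical names.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Stand-in deterministic hasher (FNV-1a, 64-bit). Not the Fx algorithm.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap whose hashing does not depend on per-process random keys.
type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Tiny xorshift generator; the seed must be nonzero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fills a map with `n` keys drawn from a generator seeded with `seed`.
fn build(seed: u64, n: usize) -> DeterministicMap<u64, u64> {
    let mut rng = XorShift64(seed);
    let mut map: DeterministicMap<u64, u64> = HashMap::default();
    for _ in 0..n {
        let key = rng.next_u64();
        map.insert(key, key);
    }
    map
}

/// Keys in the map's iteration order.
fn iteration_order(seed: u64, n: usize) -> Vec<u64> {
    build(seed, n).keys().copied().collect()
}

fn main() {
    let same = iteration_order(42, 1_000) == iteration_order(42, 1_000);
    println!("same contents and order across runs: {same}");
}
```

The default `std::collections::HashMap` hasher is randomly keyed, so the same insertions can hash differently from one map to the next; fixing both the seed and the hasher removes that source of variation from the measurement.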

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 160_221 | 6_984_044_999 | 61_987_852 | 288_670 | 5_536_856_410 | 310_195 | 9_128_784_003 |
| triemap | 163_474 | 11_463_655_150 | 74_216_172 | 222_926 | 549_435 | 540_205 | 13_075_158_546 |
| rbtree | 158_149 | 5_979_229_900 | 57_996_060 | 88_905 | 268_573 | 278_352 | 5_771_880_608 |
| splay | 159_956 | 11_568_250_103 | 53_995_996 | 552_014 | 581_765 | 810_321 | 3_722_474_749 |
| btree | 187_897 | 8_224_242_789 | 31_104_012 | 277_542 | 384_171 | 429_041 | 2_517_941_583 |
| zhenya_hashmap | 160_509 | 2_201_622_562 | 22_773_100 | 48_627 | 61_839 | 70_872 | 2_695_448_620 |
| btreemap_rs | 477_612 | 1_651_590_463 | 27_590_656 | 66_862 | 112_477 | 76_234 | 2_660_975_747 |
| imrc_hashmap_rs | 479_773 | 2_392_906_831 | 244_973_568 | 32_763 | 163_245 | 98_394 | 5_191_575_323 |
| hashmap_rs | 467_997 | 403_296_648 | 73_138_176 | 16_851 | 21_680 | 20_263 | 1_144_828_025 |
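In the hashmap row, batch_put 50 costs 5_536_856_410 instructions, the same order of magnitude as generate 1m. The note on amortized growth explains this: one of the 50 insertions crosses the capacity and pays for copying everything. The Rust sketch below models only the copy count of a doubling table; `DoublingTable` and the capacity of 1024 are hypothetical and do not reflect the Motoko library's internals.

```rust
/// Models a table that doubles its capacity when full and
/// counts how many existing elements each insertion has to copy.
struct DoublingTable {
    len: usize,
    capacity: usize,
}

impl DoublingTable {
    fn with_capacity(capacity: usize) -> Self {
        Self { len: 0, capacity }
    }

    /// Inserts one element; returns the number of elements copied by this call.
    fn put(&mut self) -> usize {
        let copied = if self.len == self.capacity {
            // Full: allocate a table twice as large and move every element over.
            self.capacity = (self.capacity * 2).max(1);
            self.len
        } else {
            0
        };
        self.len += 1;
        copied
    }
}

fn main() {
    let mut table = DoublingTable::with_capacity(1024);

    // Bulk load that stays just below the capacity: no copying at all.
    let load_copies: usize = (0..1000).map(|_| table.put()).sum();

    // A small batch that happens to cross the capacity boundary.
    let batch: Vec<usize> = (0..50).map(|_| table.put()).collect();
    let batch_copies: usize = batch.iter().sum();
    let expensive_puts = batch.iter().filter(|&&c| c > 0).count();

    println!("copies during load:  {load_copies}");
    println!("copies during batch: {batch_copies} (from {expensive_puts} put)");
}
```

Averaged over many insertions the cost per put stays constant, but a benchmark that measures a short batch sees the whole resize if the boundary falls inside that batch.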

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 147_638 | 4_684_519_403 | 29_995_956 | 511_505 | 186_471 | 487_225 | 2_655_609_909 |
| heap_rs | 463_840 | 121_602_221 | 18_284_544 | 51_661 | 18_245 | 51_802 | 440_739_988 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 151_004 | 2_082_623 | 65_644 | 73_092 | 671_517 | 127_592 | 2_474_639 |
| vector | 152_551 | 1_588_260 | 24_580 | 105_191 | 149_932 | 148_094 | 3_844_445 |
| vec_rs | 459_655 | 265_683 | 1_310_720 | 13_014 | 25_363 | 21_247 | 2_743_831 |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 477_612 | 70_026_986 | 2_555_904 | 57_181 | 86_494 | 75_309 | 113_837_931 |
| btreemap_stable_rs | 478_668 | 4_224_209_849 | 2_621_440 | 2_528_769 | 4_605_548 | 7_817_380 | 653_359 |
| heap_rs | 463_840 | 6_139_838 | 2_293_760 | 44_362 | 18_477 | 44_345 | 23_149_372 |
| heap_stable_rs | 451_018 | 279_422_369 | 458_752 | 2_346_843 | 241_158 | 2_329_183 | 653_433 |
| vec_rs | 459_655 | 2_866_886 | 2_228_224 | 13_014 | 14_113 | 13_710 | 21_249_908 |
| vec_stable_rs | 446_031 | 65_186_210 | 458_752 | 58_992 | 77_387 | 79_383 | 653_447 |
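The trade-off stated in the notes (stable structures are expensive to operate but cheap to upgrade) can be read off this table with a little arithmetic. The Rust sketch below takes the generate 50k and upgrade columns from the rows above and prints how many times more expensive the stable variant is to fill, and how many times cheaper it is to upgrade. The `Pair` struct and helper names are introduced here for illustration only.

```rust
/// A heap-based library and its stable-memory counterpart,
/// with instruction counts copied from the "Stable structures" table.
struct Pair {
    name: &'static str,
    heap_generate: u64,
    heap_upgrade: u64,
    stable_generate: u64,
    stable_upgrade: u64,
}

const PAIRS: [Pair; 3] = [
    Pair {
        name: "btreemap",
        heap_generate: 70_026_986,
        heap_upgrade: 113_837_931,
        stable_generate: 4_224_209_849,
        stable_upgrade: 653_359,
    },
    Pair {
        name: "heap",
        heap_generate: 6_139_838,
        heap_upgrade: 23_149_372,
        stable_generate: 279_422_369,
        stable_upgrade: 653_433,
    },
    Pair {
        name: "vec",
        heap_generate: 2_866_886,
        heap_upgrade: 21_249_908,
        stable_generate: 65_186_210,
        stable_upgrade: 653_447,
    },
];

/// How many times larger `a` is than `b`.
fn ratio(a: u64, b: u64) -> f64 {
    a as f64 / b as f64
}

fn main() {
    for p in &PAIRS {
        println!(
            "{:<9} generate: stable is {:>6.1}x more expensive; upgrade: stable is {:>6.1}x cheaper",
            p.name,
            ratio(p.stable_generate, p.heap_generate),
            ratio(p.heap_upgrade, p.stable_upgrade),
        );
    }
}
```

For all three pairs both ratios are in the tens or hundreds, so whether the stable variant pays off depends on how often a canister is upgraded relative to how much it writes.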

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 173_220 | 247_480_401 | 228_033_044 | 30_017 | 20_760 |
| Rust | 477_934 | 82_511_961 | 56_525_962 | 42_432 | 44_430 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 206_484 | 4_390_019_361 | 3_430_044 | 519_711 | 327_767 | 225_153_243 |
| Rust | 500_582 | 6_199_155_471 | 2_228_224 | 983_538 | 288_338 | 5_816_678_045 |

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators, but it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 236_862 | 497_594 | 16_248 | 12_674 | 14_115 | 128_956 |
| Rust | 779_543 | 548_099 | 86_604 | 105_959 | 117_903 | 1_624_526 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 195_127 | 472_267 | 22_357 | 4_729 | 71_602 |
| Rust | 799_705 | 217_270 | 325_723 | 78_145 | 1_797_952 |

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 123_696 | 7_399 |
| Rust | 23_839 | 785 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 129_966 | 15_227 | 1_684 |
| Rust | 422_512 | 43_471 | 7_569 |

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 886_042_039 | 51_991_392 | 50 | 50 | 50 |
| copying | 886_041_989 | 51_991_392 | 886_022_376 | 886_091_464 | 886_024_532 |
| compacting | 1_465_238_570 | 51_991_392 | 1_131_731_091 | 1_337_770_735 | 1_364_176_230 |
| generational | 2_184_682_993 | 51_999_856 | 855_707_553 | 1_057_808 | 947_924 |
| incremental | 28_518_613 | 985_885_652 | 290_276_212 | 292_998_697 | 292_988_797 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 261_665 | 654_501 | 4_459 | 4_919 |

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 144_769 | 131_630 | 14_651 | 8_456 | 10_539 | 3_669 |
| Rust | 457_215 | 510_788 | 51_624 | 34_412 | 74_396 | 44_011 |

Environment

  • dfx 0.15.1
  • Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
  • rustc 1.73.0 (cc66ad468 2023-10-03)
  • ic-repl 0.5.1
  • ic-wasm 0.6.0