dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Use PocketIC for CI #115

Open chenyan-dfinity opened 5 months ago

github-actions[bot] commented 5 months ago

Note Diffing the performance result against the published result from main branch. Unchanged benchmarks are omitted.

Warning Skip table 0 ## Map from _out/collections/README.md, due to table shape mismatches from main branch.
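
The colored percentages in the tables below can be read as the relative change of each cell against the main-branch value. The following is a minimal sketch of that annotation, not the bot's actual code: the function name `format_diff` is hypothetical, and the rule "red for an increase, green for a decrease, no annotation when unchanged" is inferred from the tables themselves. It assumes the main-branch value is non-zero.

```python
def format_diff(new: int, old: int) -> str:
    """Render a benchmark cell, appending a colored percentage when the value changed."""
    value = f"{new:_}"  # underscore digit grouping, e.g. 166_988
    if new == old:
        return value  # unchanged cells carry no annotation
    pct = (new - old) / old * 100  # relative change against the baseline (old must be non-zero)
    color = "red" if new > old else "green"  # assumed: red = increase, green = decrease
    return f"{value} ($\\textcolor{{{color}}}{{{pct:.2f}\\%}}$)"


print(format_diff(1_980, 2_000))
```

A small change that rounds to `0.00` still gets an annotation under this rule, which matches cells such as the Rust SHA-256 entry.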

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50.1 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 166_988 ($\textcolor{red}{0.05\%}$) | 5_554_617_018 | 24_000_360 | 621_690 | 227_224 | 592_588 | 3_189_831_485 |
| heap_rs | 576_607 ($\textcolor{red}{6.65\%}$) | 144_262_092 ($\textcolor{red}{3.29\%}$) | 18_284_544 | 60_983 ($\textcolor{red}{9.13\%}$) | 21_544 ($\textcolor{red}{1.28\%}$) | 60_888 ($\textcolor{red}{9.19\%}$) | 642_924_555 ($\textcolor{green}{-0.92\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 173_988 ($\textcolor{red}{0.05\%}$) | 2_601_059 | 65_644 | 95_506 | 803_474 | 173_506 | 3_091_310 |
| vector | 172_017 ($\textcolor{red}{0.05\%}$) | 1_952_689 | 24_580 | 126_130 | 186_485 | 176_123 | 4_675_192 |
| vec_rs | 574_751 ($\textcolor{red}{6.73\%}$) | 287_152 ($\textcolor{red}{0.13\%}$) | 1_376_256 | 15_975 ($\textcolor{red}{1.66\%}$) | 29_064 ($\textcolor{red}{0.94\%}$) | 21_824 ($\textcolor{red}{1.26\%}$) | 3_717_375 ($\textcolor{green}{-2.32\%}$) |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 592_800 ($\textcolor{red}{6.63\%}$) | 76_934_678 ($\textcolor{red}{1.02\%}$) | 2_555_904 | 63_128 ($\textcolor{red}{0.44\%}$) | 95_879 ($\textcolor{red}{0.80\%}$) | 83_525 ($\textcolor{green}{-0.62\%}$) | 138_195_760 ($\textcolor{green}{-1.00\%}$) |
| btreemap_stable_rs | 599_653 ($\textcolor{red}{7.45\%}$) | 4_737_161_366 ($\textcolor{red}{3.79\%}$) | 2_031_616 | 2_865_151 ($\textcolor{red}{5.74\%}$) | 5_225_233 ($\textcolor{red}{3.85\%}$) | 8_806_609 ($\textcolor{red}{2.63\%}$) | 729_745 ($\textcolor{red}{0.05\%}$) |
| heap_rs | 576_607 ($\textcolor{red}{6.65\%}$) | 7_279_802 ($\textcolor{red}{3.27\%}$) | 2_293_760 | 52_442 ($\textcolor{red}{8.38\%}$) | 21_792 ($\textcolor{red}{1.27\%}$) | 52_157 ($\textcolor{red}{8.40\%}$) | 33_332_862 ($\textcolor{green}{-0.88\%}$) |
| heap_stable_rs | 560_468 ($\textcolor{red}{7.54\%}$) | 281_770_028 ($\textcolor{red}{1.41\%}$) | 458_752 | 2_447_410 ($\textcolor{red}{1.74\%}$) | 246_056 ($\textcolor{red}{1.33\%}$) | 2_428_931 ($\textcolor{red}{1.74\%}$) | 729_696 ($\textcolor{red}{0.05\%}$) |
| vec_rs | 574_751 ($\textcolor{red}{6.73\%}$) | 3_077_562 ($\textcolor{red}{0.01\%}$) | 2_293_760 | 15_975 ($\textcolor{red}{1.66\%}$) | 16_914 ($\textcolor{red}{1.63\%}$) | 16_212 ($\textcolor{red}{1.70\%}$) | 30_403_059 ($\textcolor{green}{-2.87\%}$) |
| vec_stable_rs | 553_418 ($\textcolor{red}{6.15\%}$) | 60_342_671 ($\textcolor{green}{-4.74\%}$) | 458_752 | 66_073 ($\textcolor{red}{1.85\%}$) | 76_095 ($\textcolor{green}{-3.44\%}$) | 83_515 ($\textcolor{green}{-0.78\%}$) | 729_687 ($\textcolor{red}{0.05\%}$) |


SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 193_771 ($\textcolor{red}{0.04\%}$) | 267_743_355 | 247_834_501 | 33_636 | 24_532 |
| Rust | 575_720 ($\textcolor{red}{6.24\%}$) | 82_782_549 ($\textcolor{red}{0.00\%}$) | 56_788_101 ($\textcolor{red}{0.00\%}$) | 42_415 ($\textcolor{red}{3.34\%}$) | 40_810 ($\textcolor{red}{2.16\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 243_726 ($\textcolor{red}{0.03\%}$) | 4_666_119_661 | 3_430_044 | 553_629 | 407_936 | 274_434_719 |
| Rust | 619_546 ($\textcolor{red}{6.73\%}$) | 6_409_499_600 ($\textcolor{red}{0.00\%}$) | 2_228_224 | 1_020_428 ($\textcolor{red}{0.01\%}$) | 299_393 ($\textcolor{red}{0.40\%}$) | 6_025_596_995 ($\textcolor{green}{-0.01\%}$) |


Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 274_023 ($\textcolor{red}{0.03\%}$) | 510_825 | 22_316 | 18_599 ($\textcolor{red}{0.01\%}$) | 19_635 ($\textcolor{green}{-0.15\%}$) | 157_946 ($\textcolor{green}{-0.01\%}$) |
| Rust | 888_398 ($\textcolor{red}{4.82\%}$) | 515_016 ($\textcolor{red}{1.68\%}$) | 91_701 ($\textcolor{red}{3.11\%}$) | 117_771 ($\textcolor{red}{0.35\%}$) | 112_893 ($\textcolor{red}{0.49\%}$) | 1_490_782 ($\textcolor{red}{0.50\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 220_488 ($\textcolor{red}{0.04\%}$) | 481_158 | 30_204 | 8_764 | 89_833 |
| Rust | 920_591 ($\textcolor{red}{4.62\%}$) | 203_585 ($\textcolor{red}{0.04\%}$) | 302_804 ($\textcolor{green}{-0.08\%}$) | 68_166 ($\textcolor{green}{-4.31\%}$) | 1_617_697 ($\textcolor{green}{-0.75\%}$) |


Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 137_268 ($\textcolor{red}{0.06\%}$) | 19_507 |
| Rust | 26_044 ($\textcolor{red}{10.09\%}$) | 1_191 ($\textcolor{red}{148.12\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 145_704 ($\textcolor{red}{0.06\%}$) | 51_778 | 4_626 |
| Rust | 535_525 ($\textcolor{red}{6.52\%}$) | 64_193 ($\textcolor{red}{1.28\%}$) | 11_797 ($\textcolor{red}{1.04\%}$) |


Garbage Collection

Note Same as main branch, skipping.

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 299_302 ($\textcolor{red}{0.03\%}$) | 813_521 | 16_115 | 16_660 |


Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 161_375 ($\textcolor{red}{0.05\%}$) | 145_826 ($\textcolor{red}{0.06\%}$) | 28_593 | 11_963 | 22_854 | 6_446 |
| Rust | 570_196 ($\textcolor{red}{6.39\%}$) | 606_522 ($\textcolor{red}{5.70\%}$) | 58_150 ($\textcolor{red}{1.15\%}$) | 38_553 ($\textcolor{red}{2.00\%}$) | 72_692 ($\textcolor{red}{1.88\%}$) | 42_681 ($\textcolor{red}{0.55\%}$) |


github-actions[bot] commented 5 months ago

Note The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko. The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with fixed seed to ensure that all collections contain the same elements, and the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.
  • The upgrade column uses Candid for serializing stable data. In Rust, you may get a lower cycle cost by using a different serialization format. Another source of slowdown in Rust is that ic-stable-structures tends to be slower than the region memory in Motoko.
  • Different libraries persist data during upgrades in different ways. There are three main categories:
    • Use stable variable directly in Motoko: zhenya_hashmap, btree, vector
    • Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vec_rs
    • Use pre/post-upgrade hooks to convert data into an array: hashmap, triemap, buffer, imrc_hashmap_rs
  • The stable benchmarks are much more expensive than their non-stable counterparts because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
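
The note about hashmap's amortized structure can be illustrated with a generic doubling array. This is only a sketch of the general technique, not Motoko's hashmap implementation: the function name `copies_per_put`, the initial capacity of 4, and the growth factor of 2 are all assumptions chosen for illustration.

```python
def copies_per_put(n_puts: int, initial_capacity: int = 4) -> list[int]:
    """Return the number of element copies triggered by each put into a doubling array."""
    capacity, size, costs = initial_capacity, 0, []
    for _ in range(n_puts):
        if size == capacity:
            costs.append(size)  # array is full: every existing element is copied over
            capacity *= 2       # assumed growth policy: double the capacity
        else:
            costs.append(0)     # room left: no copying needed
        size += 1
    return costs


costs = copies_per_put(1_000)
print("worst single put:", max(costs))
print("average per put:", sum(costs) / len(costs))
```

Most puts cost nothing extra, while the few that trigger a resize pay for copying the whole array. A batch that happens to contain such a put looks far more expensive than the average suggests.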

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 190_014 | 8_184_618_025 | 56_000_256 | 342_784 | 6_462_528_122 | 368_420 | 10_728_193_099 |
| triemap | 195_557 | 13_661_315_924 | 68_228_576 | 252_649 | 657_794 | 648_084 | 15_499_470_884 |
| rbtree | 185_992 | 7_009_043_570 | 52_000_464 | 116_348 | 318_320 | 330_226 | 6_870_900_152 |
| btree | 230_406 | 10_223_929_607 | 25_108_416 | 357_912 | 485_794 | 539_490 | 2_861_974_825 |
| zhenya_hashmap | 188_979 | 2_360_638_679 | 16_777_504 | 58_204 | 66_552 | 79_675 | 3_018_208_083 |
| btreemap_rs | 592_800 | 1_808_073_617 | 27_590_656 | 73_570 | 124_009 | 84_481 | 3_176_207_795 |
| imrc_hashmap_rs | 593_795 | 2_577_499_644 | 244_908_032 | 35_385 | 194_881 | 92_505 | 6_253_420_470 |
| hashmap_rs | 581_570 | 435_576_576 | 73_138_176 | 20_263 | 25_195 | 23_506 | 1_521_942_969 |
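
The rows above are comparable because, as stated in the section introduction, every collection is filled and queried from the same fixed-seed random number generator. A minimal sketch of that idea follows. The helper name `make_workload`, the seed value 42, and the 64-bit key width are assumptions for illustration; the actual benchmarks use their own generator in Motoko and Rust.

```python
import random


def make_workload(seed: int, n: int) -> list[int]:
    """Generate n pseudo-random 64-bit keys from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.getrandbits(64) for _ in range(n)]


# Two runs with the same seed see identical keys, so their costs are comparable.
keys_for_a = make_workload(seed=42, n=1_000)
keys_for_b = make_workload(seed=42, n=1_000)
assert keys_for_a == keys_for_b
```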

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 166_988 | 5_554_617_018 | 24_000_360 | 621_690 | 227_224 | 592_588 | 3_189_831_485 |
| heap_rs | 576_607 | 144_262_092 | 18_284_544 | 60_983 | 21_544 | 60_888 | 642_924_555 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 173_988 | 2_601_059 | 65_644 | 95_506 | 803_474 | 173_506 | 3_091_310 |
| vector | 172_017 | 1_952_689 | 24_580 | 126_130 | 186_485 | 176_123 | 4_675_192 |
| vec_rs | 574_751 | 287_152 | 1_376_256 | 15_975 | 29_064 | 21_824 | 3_717_375 |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 592_800 | 76_934_678 | 2_555_904 | 63_128 | 95_879 | 83_525 | 138_195_760 |
| btreemap_stable_rs | 599_653 | 4_737_161_366 | 2_031_616 | 2_865_151 | 5_225_233 | 8_806_609 | 729_745 |
| heap_rs | 576_607 | 7_279_802 | 2_293_760 | 52_442 | 21_792 | 52_157 | 33_332_862 |
| heap_stable_rs | 560_468 | 281_770_028 | 458_752 | 2_447_410 | 246_056 | 2_428_931 | 729_696 |
| vec_rs | 574_751 | 3_077_562 | 2_293_760 | 15_975 | 16_914 | 16_212 | 30_403_059 |
| vec_stable_rs | 553_418 | 60_342_671 | 458_752 | 66_073 | 76_095 | 83_515 | 729_687 |
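
The trade-off noted earlier (stable structures cost more per operation but upgrade quickly) can be checked with a little arithmetic on the table above. This sketch copies three cost figures per library from the btreemap_rs and btreemap_stable_rs rows; the helper name `ratio` is illustrative only.

```python
# Cost figures copied from the "Stable structures" table.
btreemap_rs = {
    "generate 50k": 76_934_678,
    "batch_get 50": 63_128,
    "upgrade": 138_195_760,
}
btreemap_stable_rs = {
    "generate 50k": 4_737_161_366,
    "batch_get 50": 2_865_151,
    "upgrade": 729_745,
}


def ratio(stable: dict[str, int], heap: dict[str, int]) -> dict[str, float]:
    """Return stable / heap for every column present in `heap`."""
    return {column: stable[column] / heap[column] for column in heap}


ratios = ratio(btreemap_stable_rs, btreemap_rs)
for column, r in ratios.items():
    print(f"{column}: stable / heap = {r:.3g}")
```

A ratio above 1 means the stable variant is more expensive for that column. For these rows, the ratio is well above 1 for generate and batch_get and well below 1 for upgrade.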

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 193_771 | 267_743_355 | 247_834_501 | 33_636 | 24_532 |
| Rust | 575_720 | 82_782_549 | 56_788_101 | 42_415 | 40_810 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 243_726 | 4_666_119_661 | 3_430_044 | 553_629 | 407_936 | 274_434_719 |
| Rust | 619_546 | 6_409_499_600 | 2_228_224 | 1_020_428 | 299_393 | 6_025_596_995 |

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 274_023 | 510_825 | 22_316 | 18_599 | 19_635 | 157_946 |
| Rust | 888_398 | 515_016 | 91_701 | 117_771 | 112_893 | 1_490_782 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 220_488 | 481_158 | 30_204 | 8_764 | 89_833 |
| Rust | 920_591 | 203_585 | 302_804 | 68_166 | 1_617_697 |

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 137_268 | 19_507 |
| Rust | 26_044 | 1_191 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 145_704 | 51_778 | 4_626 |
| Rust | 535_525 | 64_193 | 11_797 |

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_068_192_695 | 47_793_792 | 119 | 119 | 119 |
| copying | 1_068_192_577 | 47_793_792 | 1_067_924_316 | 1_068_004_203 | 1_067_925_853 |
| compacting | 1_545_586_176 | 47_793_792 | 1_192_139_528 | 1_415_425_189 | 1_439_317_325 |
| generational | 2_304_140_531 | 47_802_256 | 882_208_645 | 1_211_144 | 1_103_549 |
| incremental | 29_503_170 | 976_097_188 | 471_911_803 | 497_465_467 | 1_221_308_722 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 299_302 | 813_521 | 16_115 | 16_660 |

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 161_375 | 145_826 | 28_593 | 11_963 | 22_854 | 6_446 |
| Rust | 570_196 | 606_522 | 58_150 | 38_553 | 72_692 | 42_681 |

Environment

  • dfx 0.20.2-beta.0
  • Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)
  • rustc 1.78.0 (9b00956e5 2024-04-29)
  • ic-repl 0.7.4
  • ic-wasm 0.7.1