dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Bump dependencies #116

Closed chenyan-dfinity closed 1 month ago

chenyan-dfinity commented 2 months ago

The stale certificate error seems to happen only on macOS, not on Ubuntu.

github-actions[bot] commented 2 months ago

Note: Diffing the performance results against the published results from the main branch. Unchanged benchmarks are omitted.

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 193_766 ($\textcolor{red}{2.02\%}$) | 8_193_819_060 ($\textcolor{red}{0.11\%}$) | 56_000_256 | 342_788 ($\textcolor{red}{0.00\%}$) | 6_469_781_020 ($\textcolor{red}{0.11\%}$) | 368_431 ($\textcolor{red}{0.00\%}$) | 10_766_389_355 ($\textcolor{red}{0.36\%}$) |
| triemap | 199_308 ($\textcolor{red}{1.96\%}$) | 13_670_187_584 ($\textcolor{red}{0.06\%}$) | 68_228_576 | 252_704 ($\textcolor{red}{0.02\%}$) | 657_887 ($\textcolor{red}{0.01\%}$) | 648_184 ($\textcolor{red}{0.02\%}$) | 15_537_338_013 ($\textcolor{red}{0.24\%}$) |
| rbtree | 189_827 ($\textcolor{red}{2.11\%}$) | 6_993_676_959 ($\textcolor{green}{-0.22\%}$) | 52_000_464 | 116_417 ($\textcolor{red}{0.06\%}$) | 317_299 ($\textcolor{green}{-0.32\%}$) | 330_296 ($\textcolor{red}{0.02\%}$) | 6_988_883_198 ($\textcolor{red}{1.72\%}$) |
| splay | 194_386 ($\textcolor{red}{2.06\%}$) | 13_052_525_320 ($\textcolor{green}{-0.80\%}$) | 48_000_400 | 625_852 ($\textcolor{green}{-0.87\%}$) | 657_023 ($\textcolor{green}{-0.90\%}$) | 920_272 ($\textcolor{green}{-0.85\%}$) | 4_321_903_638 ($\textcolor{red}{0.30\%}$) |
| btree | 234_067 ($\textcolor{red}{1.63\%}$) | 10_220_059_976 ($\textcolor{green}{-0.04\%}$) | 25_108_416 | 357_581 ($\textcolor{green}{-0.09\%}$) | 485_463 ($\textcolor{green}{-0.07\%}$) | 539_509 ($\textcolor{red}{0.00\%}$) | 2_906_462_018 ($\textcolor{red}{1.55\%}$) |
| zhenya_hashmap | 192_598 ($\textcolor{red}{1.96\%}$) | 2_361_649_032 ($\textcolor{red}{0.04\%}$) | 16_777_504 | 58_299 ($\textcolor{red}{0.16\%}$) | 66_594 ($\textcolor{red}{0.06\%}$) | 79_776 ($\textcolor{red}{0.13\%}$) | 3_084_235_946 ($\textcolor{red}{2.19\%}$) |
| btreemap_rs | 611_851 ($\textcolor{red}{10.06\%}$) | 1_809_789_841 ($\textcolor{red}{0.96\%}$) | 27_590_656 | 74_098 ($\textcolor{red}{1.10\%}$) | 124_626 ($\textcolor{red}{1.16\%}$) | 85_214 ($\textcolor{red}{0.20\%}$) | 3_208_130_200 ($\textcolor{red}{0.12\%}$) |
| imrc_hashmap_rs | 613_202 ($\textcolor{red}{10.26\%}$) | 2_634_915_707 ($\textcolor{red}{1.97\%}$) | 244_908_032 | 35_894 ($\textcolor{red}{0.98\%}$) | 198_252 ($\textcolor{red}{1.71\%}$) | 96_520 ($\textcolor{red}{3.61\%}$) | 6_383_840_797 ($\textcolor{red}{1.78\%}$) |
| hashmap_rs | 601_477 ($\textcolor{red}{10.45\%}$) | 438_103_157 ($\textcolor{green}{-0.26\%}$) | 73_138_176 | 20_788 ($\textcolor{red}{3.90\%}$) | 25_678 ($\textcolor{red}{3.00\%}$) | 23_645 ($\textcolor{red}{1.73\%}$) | 1_545_701_419 ($\textcolor{green}{-1.27\%}$) |
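
The percentages in parentheses read as relative changes against the published baseline. As a rough sketch of how such a cell could be produced, assuming the usual formula `(new - old) / old * 100` (the bot's actual implementation is not shown in this thread, and the baseline value used below is hypothetical):

```rust
/// Relative change of `new` against `old`, in percent.
/// Assumption: the report uses (new - old) / old * 100.
fn relative_change_percent(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) * 100.0 / old as f64
}

/// Render an integer with `_` as thousands separator, matching the
/// number style used in the tables (e.g. 193_766).
fn format_with_underscores(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator whenever a multiple of three digits remains.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push('_');
        }
        out.push(ch);
    }
    out
}

/// Format one table cell as "value (change%)".
fn format_cell(old: u64, new: u64) -> String {
    format!(
        "{} ({:.2}%)",
        format_with_underscores(new),
        relative_change_percent(old, new)
    )
}

fn main() {
    // Hypothetical baseline of 200 against a new measurement of 203.
    println!("{}", format_cell(200, 203));
}
```

A positive value (shown in red) means the measurement grew relative to the baseline; a negative value (shown in green) means it shrank.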

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50.1 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 170_697 ($\textcolor{red}{2.27\%}$) | 5_557_564_409 ($\textcolor{red}{0.05\%}$) | 24_000_360 | 621_758 ($\textcolor{red}{0.01\%}$) | 227_293 ($\textcolor{red}{0.03\%}$) | 592_698 ($\textcolor{red}{0.02\%}$) | 3_240_817_053 ($\textcolor{red}{1.60\%}$) |
| heap_rs | 596_953 ($\textcolor{red}{10.41\%}$) | 143_262_451 ($\textcolor{red}{2.57\%}$) | 18_284_544 | 58_563 ($\textcolor{red}{4.80\%}$) | 21_622 ($\textcolor{red}{1.65\%}$) | 58_466 ($\textcolor{red}{4.84\%}$) | 647_923_463 ($\textcolor{green}{-0.15\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 177_754 ($\textcolor{red}{2.21\%}$) | 2_601_290 ($\textcolor{red}{0.01\%}$) | 65_652 ($\textcolor{red}{0.01\%}$) | 95_575 ($\textcolor{red}{0.07\%}$) | 803_545 ($\textcolor{red}{0.01\%}$) | 173_575 ($\textcolor{red}{0.04\%}$) | 3_146_134 ($\textcolor{red}{1.77\%}$) |
| vector | 175_556 ($\textcolor{red}{2.11\%}$) | 1_952_750 ($\textcolor{red}{0.00\%}$) | 24_588 ($\textcolor{red}{0.03\%}$) | 126_199 ($\textcolor{red}{0.05\%}$) | 186_554 ($\textcolor{red}{0.04\%}$) | 176_192 ($\textcolor{red}{0.04\%}$) | 4_779_726 ($\textcolor{red}{2.24\%}$) |
| vec_rs | 588_969 ($\textcolor{red}{9.37\%}$) | 287_516 ($\textcolor{red}{0.26\%}$) | 1_376_256 | 16_494 ($\textcolor{red}{4.96\%}$) | 30_089 ($\textcolor{red}{4.50\%}$) | 22_346 ($\textcolor{red}{3.68\%}$) | 3_806_788 ($\textcolor{red}{0.03\%}$) |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 611_851 ($\textcolor{red}{10.06\%}$) | 77_021_026 ($\textcolor{red}{1.13\%}$) | 2_555_904 | 63_656 ($\textcolor{red}{1.28\%}$) | 96_504 ($\textcolor{red}{1.46\%}$) | 84_265 ($\textcolor{red}{0.26\%}$) | 139_792_280 ($\textcolor{red}{0.14\%}$) |
| btreemap_stable_rs | 616_876 ($\textcolor{red}{10.53\%}$) | 4_773_834_814 ($\textcolor{red}{4.60\%}$) | 2_031_616 | 2_893_685 ($\textcolor{red}{6.79\%}$) | 5_266_123 ($\textcolor{red}{4.66\%}$) | 8_870_300 ($\textcolor{red}{3.37\%}$) | 729_405 ($\textcolor{red}{0.01\%}$) |
| heap_rs | 596_953 ($\textcolor{red}{10.41\%}$) | 7_230_201 ($\textcolor{red}{2.56\%}$) | 2_293_760 | 50_652 ($\textcolor{red}{4.68\%}$) | 21_870 ($\textcolor{red}{1.63\%}$) | 50_383 ($\textcolor{red}{4.72\%}$) | 33_581_842 ($\textcolor{green}{-0.14\%}$) |
| heap_stable_rs | 576_040 ($\textcolor{red}{10.53\%}$) | 283_742_492 ($\textcolor{red}{2.12\%}$) | 458_752 | 2_526_262 ($\textcolor{red}{5.02\%}$) | 246_537 ($\textcolor{red}{1.53\%}$) | 2_506_863 ($\textcolor{red}{5.01\%}$) | 729_375 ($\textcolor{red}{0.00\%}$) |
| vec_rs | 588_969 ($\textcolor{red}{9.37\%}$) | 3_077_883 ($\textcolor{red}{0.02\%}$) | 2_293_760 | 16_494 ($\textcolor{red}{4.96\%}$) | 17_489 ($\textcolor{red}{5.08\%}$) | 16_734 ($\textcolor{red}{4.97\%}$) | 31_302_411 ($\textcolor{red}{0.00\%}$) |
| vec_stable_rs | 572_835 ($\textcolor{red}{9.88\%}$) | 63_993_021 ($\textcolor{red}{1.03\%}$) | 458_752 | 66_549 ($\textcolor{red}{2.58\%}$) | 80_266 ($\textcolor{red}{1.85\%}$) | 85_639 ($\textcolor{red}{1.75\%}$) | 729_377 ($\textcolor{red}{0.00\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 198_894 ($\textcolor{red}{2.69\%}$) | 282_867_517 ($\textcolor{red}{5.65\%}$) | 262_958_028 ($\textcolor{red}{6.10\%}$) | 34_369 ($\textcolor{red}{2.18\%}$) | 25_335 ($\textcolor{red}{3.27\%}$) |
| Rust | 596_836 ($\textcolor{red}{10.14\%}$) | 82_782_948 ($\textcolor{red}{0.00\%}$) | 56_788_520 ($\textcolor{red}{0.00\%}$) | 42_522 ($\textcolor{red}{3.60\%}$) | 41_228 ($\textcolor{red}{3.20\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 247_695 ($\textcolor{red}{1.66\%}$) | 365_606_356 ($\textcolor{green}{-92.16\%}$) | 342_396 ($\textcolor{green}{-90.02\%}$) | 397_640 ($\textcolor{green}{-28.18\%}$) | 267_761 ($\textcolor{green}{-34.36\%}$) | 22_396_338 ($\textcolor{green}{-91.84\%}$) |
| Rust | 640_537 ($\textcolor{red}{10.35\%}$) | 489_666_578 ($\textcolor{green}{-92.36\%}$) | 1_310_720 ($\textcolor{green}{-41.18\%}$) | 660_965 ($\textcolor{green}{-35.22\%}$) | 220_622 ($\textcolor{green}{-26.02\%}$) | 450_827_450 ($\textcolor{green}{-92.52\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 278_516 ($\textcolor{red}{1.67\%}$) | 513_188 ($\textcolor{red}{0.46\%}$) | 23_336 ($\textcolor{red}{4.57\%}$) | 19_241 ($\textcolor{red}{3.46\%}$) | 20_459 ($\textcolor{red}{4.04\%}$) | 161_567 ($\textcolor{red}{2.28\%}$) |
| Rust | 902_362 ($\textcolor{red}{6.47\%}$) | 516_247 ($\textcolor{red}{1.93\%}$) | 92_673 ($\textcolor{red}{4.20\%}$) | 118_753 ($\textcolor{red}{1.19\%}$) | 113_669 ($\textcolor{red}{1.18\%}$) | 1_499_634 ($\textcolor{red}{1.10\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 224_643 ($\textcolor{red}{1.92\%}$) | 482_025 ($\textcolor{red}{0.18\%}$) | 31_104 ($\textcolor{red}{2.98\%}$) | 8_880 ($\textcolor{red}{1.32\%}$) | 91_835 ($\textcolor{red}{2.23\%}$) |
| Rust | 931_779 ($\textcolor{red}{5.89\%}$) | 205_310 ($\textcolor{red}{0.88\%}$) | 309_520 ($\textcolor{red}{2.13\%}$) | 73_609 ($\textcolor{red}{3.34\%}$) | 1_635_142 ($\textcolor{red}{0.32\%}$) |

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 141_883 ($\textcolor{red}{3.43\%}$) | 27_494 ($\textcolor{red}{40.94\%}$) |
| Rust | 26_684 ($\textcolor{red}{12.80\%}$) | 1_201 ($\textcolor{red}{150.21\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 149_709 ($\textcolor{red}{2.81\%}$) | 56_158 ($\textcolor{red}{8.46\%}$) | 4_695 ($\textcolor{red}{1.49\%}$) |
| Rust | 554_248 ($\textcolor{red}{10.24\%}$) | 64_790 ($\textcolor{red}{2.23\%}$) | 12_216 ($\textcolor{red}{4.62\%}$) |

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_074_136_336 ($\textcolor{red}{0.56\%}$) | 47_793_792 | 119 | 119 | 119 |
| copying | 1_074_136_218 ($\textcolor{red}{0.56\%}$) | 47_793_792 | 1_073_873_789 ($\textcolor{red}{0.56\%}$) | 1_073_954_095 ($\textcolor{red}{0.56\%}$) | 1_073_875_311 ($\textcolor{red}{0.56\%}$) |
| compacting | 1_554_238_605 ($\textcolor{red}{0.56\%}$) | 47_793_792 | 1_200_791_965 ($\textcolor{red}{0.73\%}$) | 1_424_078_246 ($\textcolor{red}{0.61\%}$) | 1_447_969_756 ($\textcolor{red}{0.60\%}$) |
| generational | 2_326_734_591 ($\textcolor{red}{0.98\%}$) | 47_802_256 | 899_105_682 ($\textcolor{red}{1.92\%}$) | 1_214_812 ($\textcolor{red}{0.30\%}$) | 1_107_099 ($\textcolor{red}{0.32\%}$) |
| incremental | 29_505_471 ($\textcolor{red}{0.01\%}$) | 976_097_724 ($\textcolor{red}{0.00\%}$) | 469_026_873 ($\textcolor{green}{-0.61\%}$) | 496_491_319 ($\textcolor{green}{-0.20\%}$) | 1_282_778_770 ($\textcolor{red}{5.03\%}$) |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 420_662 ($\textcolor{red}{40.59\%}$) | 757_684 ($\textcolor{green}{-6.86\%}$) | 16_349 ($\textcolor{red}{1.45\%}$) | 16_917 ($\textcolor{red}{1.54\%}$) |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 165_434 ($\textcolor{red}{2.57\%}$) | 149_754 ($\textcolor{red}{2.75\%}$) | 32_863 ($\textcolor{red}{14.93\%}$) | 12_200 ($\textcolor{red}{1.98\%}$) | 27_064 ($\textcolor{red}{18.42\%}$) | 6_622 ($\textcolor{red}{2.73\%}$) |
| Rust | 593_655 ($\textcolor{red}{10.77\%}$) | 629_046 ($\textcolor{red}{9.62\%}$) | 59_348 ($\textcolor{red}{3.24\%}$) | 39_106 ($\textcolor{red}{3.46\%}$) | 74_039 ($\textcolor{red}{3.77\%}$) | 43_504 ($\textcolor{red}{2.49\%}$) |

github-actions[bot] commented 2 months ago

Note: The flamegraph links only work after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko. The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.
  • The upgrade column uses Candid for serializing stable data. In Rust, you may get a lower cycle cost by using a different serialization format. Another source of slowdown in Rust is that ic-stable-structures tends to be slower than the region memory in Motoko.
  • Different libraries persist data across upgrades in different ways. There are three main categories:
    • Use stable variable directly in Motoko: zhenya_hashmap, btree, vector
    • Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vec_rs
    • Use pre/post-upgrade hooks to convert data into an array: hashmap, splay, triemap, buffer, imrc_hashmap_rs
  • The stable benchmarks are much more expensive than their non-stable counterparts because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, whose map is std::collections::HashMap with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap for Rust.
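
The reproducibility points above (a fixed-seed generator and a deterministic hasher) can be illustrated with a small standard-library-only sketch. It is not the benchmark's code: SplitMix64 stands in for whatever generator the benchmarks use, `BuildHasherDefault<DefaultHasher>` stands in for the fxhash hasher, and the names `SplitMix64`, `DeterministicMap`, `build_map` and `iteration_order` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// SplitMix64, used here only as a stand-in fixed-seed generator.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// A std HashMap whose hasher has no per-instance random keys,
/// so hashing is the same for every map and every run.
type DeterministicMap = HashMap<u64, u64, BuildHasherDefault<DefaultHasher>>;

/// Fill a map with `n` pseudo-random keys derived from `seed`.
fn build_map(seed: u64, n: usize) -> DeterministicMap {
    let mut rng = SplitMix64::new(seed);
    let mut map = DeterministicMap::default();
    for _ in 0..n {
        let key = rng.next_u64();
        map.insert(key, key);
    }
    map
}

/// Keys in the order the map iterates over them.
fn iteration_order(map: &DeterministicMap) -> Vec<u64> {
    map.keys().copied().collect()
}

fn main() {
    let first = build_map(42, 1_000);
    let second = build_map(42, 1_000);
    // Same seed and same hasher: same contents and same traversal order.
    assert_eq!(iteration_order(&first), iteration_order(&second));
    println!("both maps hold {} keys in the same order", first.len());
}
```

With the default `RandomState` hasher, each map gets its own random keys, so traversal order (and therefore the measured work) may differ between runs; pinning both the seed and the hasher removes that source of noise.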

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 193_766 | 8_193_819_060 | 56_000_256 | 342_788 | 6_469_781_020 | 368_431 | 10_766_389_355 |
| triemap | 199_308 | 13_670_187_584 | 68_228_576 | 252_704 | 657_887 | 648_184 | 15_537_338_013 |
| rbtree | 189_827 | 6_993_676_959 | 52_000_464 | 116_417 | 317_299 | 330_296 | 6_988_883_198 |
| splay | 194_386 | 13_052_525_320 | 48_000_400 | 625_852 | 657_023 | 920_272 | 4_321_903_638 |
| btree | 234_067 | 10_220_059_976 | 25_108_416 | 357_581 | 485_463 | 539_509 | 2_906_462_018 |
| zhenya_hashmap | 192_598 | 2_361_649_032 | 16_777_504 | 58_299 | 66_594 | 79_776 | 3_084_235_946 |
| btreemap_rs | 611_851 | 1_809_789_841 | 27_590_656 | 74_098 | 124_626 | 85_214 | 3_208_130_200 |
| imrc_hashmap_rs | 613_202 | 2_634_915_707 | 244_908_032 | 35_894 | 198_252 | 96_520 | 6_383_840_797 |
| hashmap_rs | 601_477 | 438_103_157 | 73_138_176 | 20_788 | 25_678 | 23_645 | 1_545_701_419 |
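
The batch_put 50 outlier for hashmap is consistent with the amortized-growth behaviour described in the notes: most insertions are cheap, but the one that hits the capacity limit copies every existing element. The following is a minimal simulation of that cost pattern, assuming a plain capacity-doubling array; it is not the Motoko hashmap itself, and `GrowthStats` and `simulate_doubling` are hypothetical names.

```rust
/// Copy counts observed while appending to a capacity-doubling array.
#[derive(Debug, PartialEq)]
struct GrowthStats {
    /// Element copies summed over all insertions.
    total_copies: u64,
    /// Element copies caused by the single most expensive insertion.
    worst_single_put: u64,
}

/// Simulate `n` insertions into an array that doubles its capacity
/// (copying all existing elements) whenever it is full.
fn simulate_doubling(n: u64, initial_capacity: u64) -> GrowthStats {
    let mut capacity = initial_capacity.max(1);
    let mut len = 0u64;
    let mut total_copies = 0u64;
    let mut worst_single_put = 0u64;

    for _ in 0..n {
        let mut cost = 0u64;
        if len == capacity {
            // Full: allocate twice the space and copy everything over.
            cost = len;
            capacity *= 2;
        }
        len += 1;
        total_copies += cost;
        worst_single_put = worst_single_put.max(cost);
    }

    GrowthStats { total_copies, worst_single_put }
}

fn main() {
    let n = 1_025;
    let stats = simulate_doubling(n, 1);
    // Averaged over all insertions the copy cost stays below 2 per put,
    // but the worst single put copies about n elements.
    println!(
        "n = {}, total copies = {}, worst single put = {}",
        n, stats.total_copies, stats.worst_single_put
    );
}
```

A batch of 50 puts that happens to include such a resize pays for the whole copy, which is why a single column can be orders of magnitude above its neighbours while the amortized cost per insertion stays constant.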

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 170_697 | 5_557_564_409 | 24_000_360 | 621_758 | 227_293 | 592_698 | 3_240_817_053 |
| heap_rs | 596_953 | 143_262_451 | 18_284_544 | 58_563 | 21_622 | 58_466 | 647_923_463 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 177_754 | 2_601_290 | 65_652 | 95_575 | 803_545 | 173_575 | 3_146_134 |
| vector | 175_556 | 1_952_750 | 24_588 | 126_199 | 186_554 | 176_192 | 4_779_726 |
| vec_rs | 588_969 | 287_516 | 1_376_256 | 16_494 | 30_089 | 22_346 | 3_806_788 |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 611_851 | 77_021_026 | 2_555_904 | 63_656 | 96_504 | 84_265 | 139_792_280 |
| btreemap_stable_rs | 616_876 | 4_773_834_814 | 2_031_616 | 2_893_685 | 5_266_123 | 8_870_300 | 729_405 |
| heap_rs | 596_953 | 7_230_201 | 2_293_760 | 50_652 | 21_870 | 50_383 | 33_581_842 |
| heap_stable_rs | 576_040 | 283_742_492 | 458_752 | 2_526_262 | 246_537 | 2_506_863 | 729_375 |
| vec_rs | 588_969 | 3_077_883 | 2_293_760 | 16_494 | 17_489 | 16_734 | 31_302_411 |
| vec_stable_rs | 572_835 | 63_993_021 | 458_752 | 66_549 | 80_266 | 85_639 | 729_377 |
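
The trade-off described in the notes (stable structures pay more per operation but upgrade cheaply) can be turned into a rough break-even estimate from the table. The sketch below uses the upgrade and batch_put 50 columns only and assumes a workload consisting solely of such batches between upgrades, which is a simplification; `break_even_batches` is a hypothetical helper, not part of the benchmark.

```rust
/// Number of batches between two upgrades at which the stable variant's
/// extra per-batch cost meets or exceeds what it saves on the upgrade.
/// Returns None if the stable variant is not cheaper to upgrade or is
/// not more expensive per batch (no break-even point exists).
fn break_even_batches(
    heap_upgrade: u64,
    stable_upgrade: u64,
    heap_batch: u64,
    stable_batch: u64,
) -> Option<u64> {
    let upgrade_saving = heap_upgrade.checked_sub(stable_upgrade)?;
    let extra_per_batch = stable_batch.checked_sub(heap_batch)?;
    if extra_per_batch == 0 {
        return None;
    }
    // Ceiling division: the first whole batch count that uses up the saving.
    Some((upgrade_saving + extra_per_batch - 1) / extra_per_batch)
}

fn main() {
    // btreemap_rs vs btreemap_stable_rs, values copied from the table.
    let btreemap = break_even_batches(139_792_280, 729_405, 96_504, 5_266_123);
    // heap_rs vs heap_stable_rs, using the put 50 column.
    let heap = break_even_batches(33_581_842, 729_375, 21_870, 246_537);
    println!("btreemap break-even: {:?} batches", btreemap); // Some(27)
    println!("heap break-even: {:?} batches", heap); // Some(147)
}
```

Under these assumptions, btreemap_stable_rs is ahead only if fewer than about 27 batch_put 50 calls happen between upgrades at the 50k-element size measured here. The estimate ignores reads, removals, the initial generate cost, and how both columns scale with collection size, so it is an order-of-magnitude guide rather than a prediction.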

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 198_894 | 282_867_517 | 262_958_028 | 34_369 | 25_335 |
| Rust | 596_836 | 82_782_948 | 56_788_520 | 42_522 | 41_228 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 247_695 | 365_606_356 | 342_396 | 397_640 | 267_761 | 22_396_338 |
| Rust | 640_537 | 489_666_578 | 1_310_720 | 660_965 | 220_622 | 450_827_450 |

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 278_516 | 513_188 | 23_336 | 19_241 | 20_459 | 161_567 |
| Rust | 902_362 | 516_247 | 92_673 | 118_753 | 113_669 | 1_499_634 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 224_643 | 482_025 | 31_104 | 8_880 | 91_835 |
| Rust | 931_779 | 205_310 | 309_520 | 73_609 | 1_635_142 |

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 141_883 | 27_494 |
| Rust | 26_684 | 1_201 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 149_709 | 56_158 | 4_695 |
| Rust | 554_248 | 64_790 | 12_216 |

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_074_136_336 | 47_793_792 | 119 | 119 | 119 |
| copying | 1_074_136_218 | 47_793_792 | 1_073_873_789 | 1_073_954_095 | 1_073_875_311 |
| compacting | 1_554_238_605 | 47_793_792 | 1_200_791_965 | 1_424_078_246 | 1_447_969_756 |
| generational | 2_326_734_591 | 47_802_256 | 899_105_682 | 1_214_812 | 1_107_099 |
| incremental | 29_505_471 | 976_097_724 | 469_026_873 | 496_491_319 | 1_282_778_770 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 420_662 | 757_684 | 16_349 | 16_917 |

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 165_434 | 149_754 | 32_863 | 12_200 | 27_064 | 6_622 |
| Rust | 593_655 | 629_046 | 59_348 | 39_106 | 74_039 | 43_504 |

Environment

  • dfx 0.24.0
  • Motoko compiler 0.13.0 (source dq4zmqc9-34xf70ip-6lrc3v7p-z1m6aq95)
  • rustc 1.81.0 (eeb90cda1 2024-09-04)
  • ic-repl 0.7.6
  • ic-wasm 0.9.0