dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Bump dependencies #104

Closed by chenyan-dfinity 9 months ago

github-actions[bot] commented 9 months ago

Note: This comment diffs the performance results against the published results from the main branch. Unchanged benchmarks are omitted.
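As a rough illustration of how one annotated cell in the tables below could be produced, here is a minimal Python sketch. The function name `format_diff` and the sample inputs are hypothetical, and the actual CI script may work differently. It assumes that red marks an increase, green marks a decrease, and unchanged values carry no annotation.

```python
def format_diff(old: int, new: int) -> str:
    """Render a measurement with its percent change versus the baseline."""
    value = f"{new:_}"  # underscore digit grouping, e.g. 189_608
    if new == old:
        return value  # unchanged values carry no annotation
    pct = (new - old) / old * 100
    color = "red" if pct > 0 else "green"
    # A tiny decrease formats as "-0.00" and a tiny increase as "0.00",
    # so the direction of the change survives rounding to two decimals.
    return f"{value} ($\\textcolor{{{color}}}{{{pct:.2f}\\%}}$)"


# Hypothetical baseline and new value, not taken from the benchmark data.
print(format_diff(1_000, 1_010))
```

Under these assumptions, a cell showing a red `0.00\%` means the value grew by less than 0.005%, and a green `-0.00\%` means it shrank by less than that.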

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 189_608 ($\textcolor{red}{0.37\%}$) | 8_370_836_596 ($\textcolor{green}{-0.00\%}$) | 61_987_852 | 344_911 | 6_593_214_552 ($\textcolor{green}{-0.00\%}$) | 371_223 | 11_026_879_813 ($\textcolor{green}{-0.00\%}$) |
| triemap | 195_507 ($\textcolor{red}{0.35\%}$) | 13_855_038_807 ($\textcolor{green}{-0.00\%}$) | 74_216_172 | 254_589 | 661_468 | 650_820 ($\textcolor{red}{0.01\%}$) | 15_817_661_582 ($\textcolor{green}{-0.00\%}$) |
| rbtree | 186_562 ($\textcolor{red}{0.40\%}$) | 7_127_318_517 ($\textcolor{red}{0.00\%}$) | 57_996_060 | 114_300 | 318_352 ($\textcolor{green}{-0.01\%}$) | 328_277 ($\textcolor{red}{0.01\%}$) | 7_169_325_607 ($\textcolor{red}{0.00\%}$) |
| splay | 190_525 ($\textcolor{red}{0.36\%}$) | 13_247_297_529 ($\textcolor{green}{-0.00\%}$) | 53_995_996 | 628_661 ($\textcolor{red}{0.01\%}$) | 661_579 ($\textcolor{green}{-0.01\%}$) | 921_933 | 4_567_871_575 ($\textcolor{red}{0.00\%}$) |
| btree | 229_870 ($\textcolor{red}{0.30\%}$) | 10_266_014_811 ($\textcolor{red}{0.00\%}$) | 31_104_012 | 353_622 | 482_125 | 533_935 | 3_134_168_915 ($\textcolor{red}{0.00\%}$) |
| zhenya_hashmap | 189_285 ($\textcolor{red}{0.36\%}$) | 2_570_554_540 ($\textcolor{red}{0.00\%}$) | 22_773_100 | 60_196 | 70_137 | 82_453 | 3_305_550_277 ($\textcolor{green}{-0.00\%}$) |
| btreemap_rs | 537_393 ($\textcolor{red}{13.95\%}$) | 1_793_333_047 ($\textcolor{green}{-0.21\%}$) | 27_590_656 ($\textcolor{red}{0.24\%}$) | 75_328 ($\textcolor{green}{-0.32\%}$) | 125_166 ($\textcolor{green}{-0.22\%}$) | 86_260 ($\textcolor{green}{-0.07\%}$) | 2_937_041_107 ($\textcolor{green}{-0.14\%}$) |
| imrc_hashmap_rs | 542_882 ($\textcolor{red}{13.78\%}$) | 2_584_501_850 ($\textcolor{green}{-0.01\%}$) | 244_973_568 | 37_762 ($\textcolor{green}{-1.70\%}$) | 178_926 ($\textcolor{green}{-0.19\%}$) | 115_385 ($\textcolor{green}{-0.16\%}$) | 5_796_587_958 ($\textcolor{red}{0.00\%}$) |
| hashmap_rs | 529_458 ($\textcolor{red}{13.86\%}$) | 439_248_112 ($\textcolor{red}{1.26\%}$) | 73_138_176 ($\textcolor{red}{0.09\%}$) | 21_501 ($\textcolor{green}{-1.62\%}$) | 26_711 ($\textcolor{green}{-0.91\%}$) | 25_024 ($\textcolor{green}{-0.79\%}$) | 1_298_646_667 ($\textcolor{red}{0.40\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 167_497 ($\textcolor{red}{0.41\%}$) | 5_697_880_501 ($\textcolor{red}{0.00\%}$) | 29_995_956 | 621_338 | 228_674 | 592_198 | 3_309_807_456 ($\textcolor{green}{-0.00\%}$) |
| heap_rs | 525_853 ($\textcolor{red}{14.49\%}$) | 139_669_830 ($\textcolor{red}{0.72\%}$) | 18_284_544 ($\textcolor{red}{0.36\%}$) | 57_419 ($\textcolor{green}{-0.25\%}$) | 23_051 ($\textcolor{green}{-1.26\%}$) | 57_545 ($\textcolor{green}{-0.26\%}$) | 510_960_192 ($\textcolor{green}{-0.20\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 173_906 ($\textcolor{red}{0.40\%}$) | 2_570_947 | 65_644 | 95_490 | 800_452 | 170_490 | 3_061_307 |
| vector | 172_249 ($\textcolor{red}{0.40\%}$) | 1_920_997 | 24_580 | 126_114 | 183_385 | 175_981 | 4_695_522 |
| vec_rs | 520_881 ($\textcolor{red}{14.88\%}$) | 289_040 ($\textcolor{green}{-0.02\%}$) | 1_376_256 ($\textcolor{red}{5.00\%}$) | 17_251 ($\textcolor{green}{-0.27\%}$) | 30_571 ($\textcolor{green}{-0.17\%}$) | 23_331 ($\textcolor{green}{-9.42\%}$) | 3_161_017 ($\textcolor{green}{-0.34\%}$) |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 537_393 ($\textcolor{red}{13.95\%}$) | 76_200_333 ($\textcolor{green}{-0.25\%}$) | 2_555_904 ($\textcolor{red}{2.63\%}$) | 64_886 ($\textcolor{green}{-0.37\%}$) | 97_044 ($\textcolor{green}{-0.29\%}$) | 85_272 ($\textcolor{green}{-0.07\%}$) | 126_265_270 ($\textcolor{green}{-0.16\%}$) |
| btreemap_stable_rs | 543_609 ($\textcolor{red}{12.43\%}$) | 4_561_985_735 ($\textcolor{green}{-13.20\%}$) | 2_031_616 | 2_707_064 ($\textcolor{green}{-14.53\%}$) | 5_026_642 ($\textcolor{green}{-13.10\%}$) | 8_594_683 ($\textcolor{green}{-13.08\%}$) | 729_311 ($\textcolor{green}{-0.25\%}$) |
| heap_rs | 525_853 ($\textcolor{red}{14.49\%}$) | 7_051_730 ($\textcolor{red}{0.71\%}$) | 2_293_760 ($\textcolor{red}{2.94\%}$) | 49_928 ($\textcolor{green}{-0.29\%}$) | 23_299 ($\textcolor{green}{-1.25\%}$) | 49_894 ($\textcolor{green}{-0.30\%}$) | 26_768_703 ($\textcolor{green}{-0.19\%}$) |
| heap_stable_rs | 506_559 ($\textcolor{red}{13.95\%}$) | 271_553_517 ($\textcolor{green}{-15.04\%}$) | 458_752 | 2_294_851 ($\textcolor{green}{-14.20\%}$) | 238_596 ($\textcolor{green}{-14.29\%}$) | 2_277_771 ($\textcolor{green}{-14.19\%}$) | 729_317 ($\textcolor{green}{-0.27\%}$) |
| vec_rs | 520_881 ($\textcolor{red}{14.88\%}$) | 3_079_382 ($\textcolor{green}{-0.00\%}$) | 2_293_760 ($\textcolor{red}{2.94\%}$) | 17_251 ($\textcolor{green}{-0.27\%}$) | 18_421 ($\textcolor{green}{-0.28\%}$) | 17_719 ($\textcolor{green}{-1.56\%}$) | 24_671_551 ($\textcolor{green}{-0.41\%}$) |
| vec_stable_rs | 503_829 ($\textcolor{red}{14.19\%}$) | 63_394_912 ($\textcolor{green}{-15.64\%}$) | 458_752 | 62_491 ($\textcolor{green}{-12.81\%}$) | 79_685 ($\textcolor{green}{-12.93\%}$) | 81_633 ($\textcolor{green}{-13.99\%}$) | 729_320 ($\textcolor{green}{-0.27\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 196_492 ($\textcolor{red}{0.35\%}$) | 273_037_090 ($\textcolor{red}{0.00\%}$) | 259_835_230 ($\textcolor{red}{0.00\%}$) | 34_373 | 24_897 |
| Rust | 537_397 ($\textcolor{red}{12.89\%}$) | 82_787_911 ($\textcolor{green}{-0.00\%}$) | 56_792_991 ($\textcolor{green}{-0.00\%}$) | 47_914 ($\textcolor{green}{-1.48\%}$) | 50_388 ($\textcolor{green}{-1.51\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 245_183 ($\textcolor{red}{0.28\%}$) | 4_763_349_709 ($\textcolor{red}{0.00\%}$) | 3_430_044 | 565_117 | 402_073 | 274_276_383 ($\textcolor{red}{0.00\%}$) |
| Rust | 565_792 ($\textcolor{red}{13.57\%}$) | 6_409_147_805 ($\textcolor{red}{1.58\%}$) | 2_228_224 | 1_019_959 ($\textcolor{red}{1.57\%}$) | 303_897 ($\textcolor{red}{0.85\%}$) | 6_019_483_730 ($\textcolor{red}{1.57\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 275_458 ($\textcolor{red}{0.25\%}$) | 510_765 ($\textcolor{red}{0.03\%}$) | 22_312 ($\textcolor{green}{-0.07\%}$) | 18_606 ($\textcolor{red}{0.29\%}$) | 19_662 ($\textcolor{red}{0.48\%}$) | 157_656 ($\textcolor{red}{0.03\%}$) |
| Rust | 849_921 ($\textcolor{red}{9.21\%}$) | 599_916 ($\textcolor{green}{-1.86\%}$) | 99_156 ($\textcolor{green}{-2.19\%}$) | 123_702 ($\textcolor{green}{-1.16\%}$) | 136_655 ($\textcolor{green}{-0.57\%}$) | 1_799_828 ($\textcolor{green}{-0.92\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 222_607 ($\textcolor{red}{0.31\%}$) | 481_158 | 29_810 | 8_776 | 89_459 |
| Rust | 869_104 ($\textcolor{red}{9.92\%}$) | 236_542 ($\textcolor{green}{-1.15\%}$) | 368_044 ($\textcolor{green}{-0.15\%}$) | 91_941 ($\textcolor{red}{3.29\%}$) | 1_999_207 ($\textcolor{green}{-0.14\%}$) |

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 137_899 ($\textcolor{red}{0.51\%}$) | 19_511 ($\textcolor{green}{-15.84\%}$) |
| Rust | 23_637 ($\textcolor{green}{-0.99\%}$) | 480 ($\textcolor{green}{-59.04\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 146_290 ($\textcolor{red}{0.49\%}$) | 51_655 ($\textcolor{red}{0.37\%}$) | 4_610 |
| Rust | 487_585 ($\textcolor{red}{15.90\%}$) | 68_173 ($\textcolor{red}{0.03\%}$) | 11_184 ($\textcolor{red}{0.21\%}$) |

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_171_395_786 | 51_991_392 | 119 | 119 | 119 |
| copying | 1_171_395_668 | 51_991_392 | 1_171_104_136 | 1_171_195_696 | 1_171_106_949 |
| compacting | 1_672_114_659 | 51_991_392 | 1_290_052_154 | 1_533_417_828 | 1_564_512_242 |
| generational | 2_529_073_372 | 51_999_856 | 999_515_201 | 1_232_756 ($\textcolor{red}{0.00\%}$) | 1_103_869 ($\textcolor{red}{0.00\%}$) |
| incremental | 29_503_170 ($\textcolor{red}{0.00\%}$) | 985_890_644 ($\textcolor{red}{0.00\%}$) | 477_971_911 ($\textcolor{red}{43.21\%}$) | 493_638_503 ($\textcolor{red}{46.53\%}$) | 1_092_284_316 ($\textcolor{red}{224.24\%}$) |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 300_186 ($\textcolor{red}{0.38\%}$) | 816_032 ($\textcolor{red}{0.31\%}$) | 16_099 | 16_644 |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 161_935 ($\textcolor{red}{0.42\%}$) | 146_225 ($\textcolor{red}{0.47\%}$) | 28_593 | 11_963 | 22_864 | 6_430 |
| Rust | 519_866 ($\textcolor{red}{13.85\%}$) | 570_028 ($\textcolor{red}{12.26\%}$) | 68_903 ($\textcolor{green}{-0.41\%}$) | 42_634 ($\textcolor{green}{-0.54\%}$) | 92_131 ($\textcolor{green}{-0.69\%}$) | 51_818 ($\textcolor{green}{-0.57\%}$) |

github-actions[bot] commented 9 months ago

Note: The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in Motoko and Rust. Libraries whose names end in _rs are written in Rust; the rest are written in Motoko. The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.
  • The upgrade column uses Candid for serializing stable data. In Rust, a different serialization format may lower the cycle cost. Rust is also slowed down by ic-stable-structures, which tends to be slower than region memory in Motoko.
  • Different libraries persist data across upgrades in different ways, which fall into three main categories:
    • Use stable variables directly in Motoko: zhenya_hashmap, btree, vector
    • Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vector_rs
    • Use pre/post-upgrade hooks to convert data into an array: hashmap, splay, triemap, buffer, imrc_hashmap_rs
  • The stable benchmarks are much more expensive than their non-stable counterparts because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so batch_put 50 costs much more than it does for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which behaves like std::collections::HashMap but uses a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap for Rust.
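
The note on hashmap's amortized cost can be illustrated with a toy model. The sketch below is a generic doubling array written in Python, not the Motoko hashmap implementation; the class name `DoublingArray` and the capacity of 1024 are assumptions chosen for illustration. It counts how many elements each put has to copy.

```python
class DoublingArray:
    """Toy growable array that doubles its capacity when full."""

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity
        self.items = []
        self.copies = 0  # total elements copied by all resizes so far

    def put(self, item) -> int:
        """Append item and return how many elements this call copied."""
        copied = 0
        if len(self.items) == self.capacity:
            # Full: a real implementation allocates twice the space and
            # copies every existing element; here we only count that work.
            copied = len(self.items)
            self.capacity *= 2
            self.copies += copied
        self.items.append(item)
        return copied


arr = DoublingArray(capacity=1024)
costs = [arr.put(i) for i in range(1025)]
print(max(costs[:1024]), costs[1024])  # cheap puts, then one expensive put
```

In this model the first 1024 puts copy nothing, while the put that crosses the capacity copies all 1024 existing elements. A small batch that happens to trigger such a resize pays for the whole copy, which is one plausible reading of why batch_put 50 is so costly for hashmap.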

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| hashmap | 189_608 | 8_370_836_596 | 61_987_852 | 344_911 | 6_593_214_552 | 371_223 | 11_026_879_813 |
| triemap | 195_507 | 13_855_038_807 | 74_216_172 | 254_589 | 661_468 | 650_820 | 15_817_661_582 |
| rbtree | 186_562 | 7_127_318_517 | 57_996_060 | 114_300 | 318_352 | 328_277 | 7_169_325_607 |
| splay | 190_525 | 13_247_297_529 | 53_995_996 | 628_661 | 661_579 | 921_933 | 4_567_871_575 |
| btree | 229_870 | 10_266_014_811 | 31_104_012 | 353_622 | 482_125 | 533_935 | 3_134_168_915 |
| zhenya_hashmap | 189_285 | 2_570_554_540 | 22_773_100 | 60_196 | 70_137 | 82_453 | 3_305_550_277 |
| btreemap_rs | 537_393 | 1_793_333_047 | 27_590_656 | 75_328 | 125_166 | 86_260 | 2_937_041_107 |
| imrc_hashmap_rs | 542_882 | 2_584_501_850 | 244_973_568 | 37_762 | 178_926 | 115_385 | 5_796_587_958 |
| hashmap_rs | 529_458 | 439_248_112 | 73_138_176 | 21_501 | 26_711 | 25_024 | 1_298_646_667 |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade |
|---|---|---|---|---|---|---|---|
| heap | 167_497 | 5_697_880_501 | 29_995_956 | 621_338 | 228_674 | 592_198 | 3_309_807_456 |
| heap_rs | 525_853 | 139_669_830 | 18_284_544 | 57_419 | 23_051 | 57_545 | 510_960_192 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade |
|---|---|---|---|---|---|---|---|
| buffer | 173_906 | 2_570_947 | 65_644 | 95_490 | 800_452 | 170_490 | 3_061_307 |
| vector | 172_249 | 1_920_997 | 24_580 | 126_114 | 183_385 | 175_981 | 4_695_522 |
| vec_rs | 520_881 | 289_040 | 1_376_256 | 17_251 | 30_571 | 23_331 | 3_161_017 |

Stable structures

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|---|---|---|---|---|---|---|---|
| btreemap_rs | 537_393 | 76_200_333 | 2_555_904 | 64_886 | 97_044 | 85_272 | 126_265_270 |
| btreemap_stable_rs | 543_609 | 4_561_985_735 | 2_031_616 | 2_707_064 | 5_026_642 | 8_594_683 | 729_311 |
| heap_rs | 525_853 | 7_051_730 | 2_293_760 | 49_928 | 23_299 | 49_894 | 26_768_703 |
| heap_stable_rs | 506_559 | 271_553_517 | 458_752 | 2_294_851 | 238_596 | 2_277_771 | 729_317 |
| vec_rs | 520_881 | 3_079_382 | 2_293_760 | 17_251 | 18_421 | 17_719 | 24_671_551 |
| vec_stable_rs | 503_829 | 63_394_912 | 458_752 | 62_491 | 79_685 | 81_633 | 729_320 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 196_492 | 273_037_090 | 259_835_230 | 34_373 | 24_897 |
| Rust | 537_397 | 82_787_911 | 56_792_991 | 47_914 | 50_388 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 245_183 | 4_763_349_709 | 3_430_044 | 565_117 | 402_073 | 274_276_383 |
| Rust | 565_792 | 6_409_147_805 | 2_228_224 | 1_019_959 | 303_897 | 6_019_483_730 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal | upgrade |
|---|---|---|---|---|---|---|
| Motoko | 275_458 | 510_765 | 22_312 | 18_606 | 19_662 | 157_656 |
| Rust | 849_921 | 599_916 | 99_156 | 123_702 | 136_655 | 1_799_828 |
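
To make the size of the cost difference concrete, the following Python sketch computes the Rust-to-Motoko ratio for each Basic DAO column. The numbers are copied from the table above; the helper name `parse` and the choice to round to two decimals are assumptions made for illustration only.

```python
def parse(cell: str) -> int:
    """Parse an underscore-grouped count such as '22_312'."""
    return int(cell)  # int() accepts single underscores between digits


columns = ["binary_size", "init", "transfer_token",
           "submit_proposal", "vote_proposal", "upgrade"]
motoko = "275_458 510_765 22_312 18_606 19_662 157_656".split()
rust = "849_921 599_916 99_156 123_702 136_655 1_799_828".split()

# Ratio above 1 means the Rust canister is larger or costs more.
ratios = {
    name: round(parse(r) / parse(m), 2)
    for name, m, r in zip(columns, motoko, rust)
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

With these inputs, transfer_token comes out at roughly 4.4 times the Motoko cost and upgrade at roughly 11.4 times, while init is close to parity.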

DIP721 NFT

| | binary_size | init | mint_token | transfer_token | upgrade |
|---|---|---|---|---|---|
| Motoko | 222_607 | 481_158 | 29_810 | 8_776 | 89_459 |
| Rust | 869_104 | 236_542 | 368_044 | 91_941 | 1_999_207 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 137_899 | 19_511 |
| Rust | 23_637 | 480 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 146_290 | 51_655 | 4_610 |
| Rust | 487_585 | 68_173 | 11_184 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 700k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_171_395_786 | 51_991_392 | 119 | 119 | 119 |
| copying | 1_171_395_668 | 51_991_392 | 1_171_104_136 | 1_171_195_696 | 1_171_106_949 |
| compacting | 1_672_114_659 | 51_991_392 | 1_290_052_154 | 1_533_417_828 | 1_564_512_242 |
| generational | 2_529_073_372 | 51_999_856 | 999_515_201 | 1_232_756 | 1_103_869 |
| incremental | 29_503_170 | 985_890_644 | 477_971_911 | 493_638_503 | 1_092_284_316 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 300_186 | 816_032 | 16_099 | 16_644 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 161_935 | 146_225 | 28_593 | 11_963 | 22_864 | 6_430 |
| Rust | 519_866 | 570_028 | 68_903 | 42_634 | 92_131 | 51_818 |

Environment

  • dfx 0.16.1
  • Motoko compiler 0.10.4 (source js20w7g2-ysgfrqd0-1cmy11nb-3wdy9y1k)
  • rustc 1.75.0 (82e1608df 2023-12-21)
  • ic-repl 0.6.2
  • ic-wasm 0.7.0