dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Remove wasm-opt #79

Closed chenyan-dfinity closed 11 months ago

github-actions[bot] commented 11 months ago

Note: Diffing the performance results against the published results from the main branch. Unchanged benchmarks are omitted.
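As a rough illustration of how a diff cell of this kind can be derived, the sketch below computes the signed relative change of a new measurement against a baseline and prints it next to the underscore-separated value. The function names and the sample numbers are hypothetical and are not taken from the bot's implementation or from the tables in this thread.

```rust
/// Relative change of `new` against `old`, in percent.
/// Positive means the measurement grew. `old` is assumed to be non-zero.
fn percent_change(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Format an integer with `_` as the thousands separator, e.g. 1_234_567.
fn with_underscores(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator before every group of three trailing digits.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push('_');
        }
        out.push(ch);
    }
    out
}

/// Render one table cell: the new value followed by the signed change.
fn diff_cell(old: u64, new: u64) -> String {
    format!("{} ({:+.2}%)", with_underscores(new), percent_change(old, new))
}

fn main() {
    // Hypothetical baseline and new values, for illustration only.
    println!("{}", diff_cell(100_000, 115_000));
    println!("{}", diff_cell(200_000, 198_000));
}
```

In the tables of this comment the sign is conveyed by color instead: red marks an increase over main, green a decrease.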

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|---|
| hashmap | 159_194 ($\textcolor{red}{15.13\%}$) | 7_588_838_102 ($\textcolor{red}{8.82\%}$) | 61_987_732 | 318_951 ($\textcolor{red}{10.67\%}$) | 6_008_157_311 ($\textcolor{red}{8.69\%}$) | 341_403 ($\textcolor{red}{10.23\%}$) |
| triemap | 162_066 ($\textcolor{red}{15.96\%}$) | 12_638_993_478 ($\textcolor{red}{10.56\%}$) | 74_216_052 | 250_401 ($\textcolor{red}{12.38\%}$) | 611_116 ($\textcolor{red}{11.58\%}$) | 600_960 ($\textcolor{red}{11.48\%}$) |
| rbtree | 162_408 ($\textcolor{red}{15.54\%}$) | 6_161_146_247 ($\textcolor{red}{3.04\%}$) | 57_995_940 | 116_202 ($\textcolor{red}{30.70\%}$) | 277_258 ($\textcolor{red}{3.23\%}$) | 302_170 ($\textcolor{red}{8.56\%}$) |
| splay | 157_707 ($\textcolor{red}{15.67\%}$) | 12_285_354_629 ($\textcolor{red}{6.20\%}$) | 53_995_876 | 589_148 ($\textcolor{red}{6.74\%}$) | 619_660 ($\textcolor{red}{6.53\%}$) | 870_643 ($\textcolor{red}{7.46\%}$) |
| btree | 214_489 ($\textcolor{red}{18.21\%}$) | 9_139_742_724 ($\textcolor{red}{11.13\%}$) | 31_103_892 | 314_208 ($\textcolor{red}{13.21\%}$) | 429_672 ($\textcolor{red}{11.84\%}$) | 480_823 ($\textcolor{red}{12.07\%}$) |
| zhenya_hashmap | 168_445 ($\textcolor{red}{14.99\%}$) | 2_822_042_495 ($\textcolor{red}{7.13\%}$) | 65_987_480 | 73_056 ($\textcolor{red}{11.71\%}$) | 89_710 ($\textcolor{red}{11.85\%}$) | 108_471 ($\textcolor{red}{14.39\%}$) |
| btreemap_rs | 446_200 ($\textcolor{red}{6.22\%}$) | 1_637_381_286 ($\textcolor{green}{-1.01\%}$) | 13_762_560 | 65_726 ($\textcolor{green}{-1.74\%}$) | 112_100 ($\textcolor{green}{-0.41\%}$) | 81_944 ($\textcolor{red}{0.78\%}$) |
| imrc_hashmap_rs | 446_100 ($\textcolor{red}{6.28\%}$) | 2_388_802_442 ($\textcolor{red}{0.10\%}$) | 122_454_016 | 33_200 ($\textcolor{red}{0.90\%}$) | 163_343 ($\textcolor{red}{0.32\%}$) | 97_441 ($\textcolor{green}{-1.10\%}$) |
| hashmap_rs | 439_281 ($\textcolor{red}{6.23\%}$) | 414_097_976 ($\textcolor{red}{2.93\%}$) | 36_536_320 | 17_125 ($\textcolor{red}{2.56\%}$) | 22_229 ($\textcolor{red}{2.91\%}$) | 20_323 ($\textcolor{red}{1.35\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 |
|---|---|---|---|---|---|
| heap | 152_465 ($\textcolor{red}{15.31\%}$) | 5_200_975_509 ($\textcolor{red}{11.02\%}$) | 29_995_836 | 574_107 ($\textcolor{red}{12.24\%}$) | 208_978 ($\textcolor{red}{12.07\%}$) |
| heap_rs | 437_212 ($\textcolor{red}{6.33\%}$) | 125_346_655 ($\textcolor{red}{1.82\%}$) | 9_109_504 | 53_855 ($\textcolor{red}{0.89\%}$) | 18_525 ($\textcolor{red}{1.77\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---|---|---|---|---|---|
| buffer | 161_556 ($\textcolor{red}{15.47\%}$) | 2_296_568 ($\textcolor{red}{10.27\%}$) | 65_508 | 84_247 ($\textcolor{red}{15.26\%}$) | 740_528 ($\textcolor{red}{10.28\%}$) | 142_247 ($\textcolor{red}{11.49\%}$) |
| vector | 160_836 ($\textcolor{red}{16.26\%}$) | 1_890_122 ($\textcolor{red}{9.35\%}$) | 24_764 | 137_383 ($\textcolor{red}{13.33\%}$) | 179_360 ($\textcolor{red}{9.40\%}$) | 181_415 ($\textcolor{red}{12.26\%}$) |
| vec_rs | 435_769 ($\textcolor{red}{6.30\%}$) | 266_276 ($\textcolor{red}{0.16\%}$) | 655_360 | 13_221 ($\textcolor{red}{2.47\%}$) | 25_648 ($\textcolor{red}{1.25\%}$) | 20_782 ($\textcolor{green}{-2.04\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 196_200 ($\textcolor{red}{12.37\%}$) | 285_638_950 ($\textcolor{red}{8.13\%}$) | 255_658_597 ($\textcolor{red}{8.74\%}$) | 34_744 ($\textcolor{green}{-1.16\%}$) | 23_689 ($\textcolor{red}{1.84\%}$) |
| Rust | 528_163 ($\textcolor{red}{6.00\%}$) | 82_512_868 ($\textcolor{red}{0.00\%}$) | 56_526_741 ($\textcolor{red}{0.00\%}$) | 43_318 ($\textcolor{red}{1.80\%}$) | 45_477 ($\textcolor{red}{2.03\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---|---|---|---|---|
| Motoko | 205_806 ($\textcolor{red}{15.44\%}$) | 5_050_097_115 ($\textcolor{red}{0.73\%}$) | 3_429_924 | 598_393 ($\textcolor{red}{0.71\%}$) | 361_240 ($\textcolor{red}{10.21\%}$) |
| Rust | 469_882 ($\textcolor{red}{6.36\%}$) | 6_232_657_596 ($\textcolor{red}{0.49\%}$) | 1_081_344 | 988_880 ($\textcolor{red}{0.51\%}$) | 290_927 ($\textcolor{red}{0.85\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---|---|---|---|---|
| Motoko | 277_584 ($\textcolor{red}{20.56\%}$) | 39_684 ($\textcolor{red}{5.57\%}$) | 17_787 ($\textcolor{red}{8.86\%}$) | 13_881 ($\textcolor{red}{9.54\%}$) | 15_641 ($\textcolor{red}{10.82\%}$) |
| Rust | 762_892 ($\textcolor{red}{6.20\%}$) | 484_210 ($\textcolor{red}{2.50\%}$) | 89_052 ($\textcolor{red}{2.61\%}$) | 107_757 ($\textcolor{red}{2.37\%}$) | 118_705 ($\textcolor{red}{2.13\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---|---|---|---|
| Motoko | 229_941 ($\textcolor{red}{22.10\%}$) | 13_217 ($\textcolor{red}{7.74\%}$) | 23_778 ($\textcolor{red}{6.36\%}$) | 5_267 ($\textcolor{red}{11.38\%}$) |
| Rust | 828_098 ($\textcolor{red}{6.40\%}$) | 128_336 ($\textcolor{red}{2.43\%}$) | 332_435 ($\textcolor{red}{2.28\%}$) | 79_221 ($\textcolor{red}{2.22\%}$) |

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 142_110 ($\textcolor{red}{15.20\%}$) | 7_777 ($\textcolor{red}{106.95\%}$) |
| Rust | 25_646 ($\textcolor{red}{8.55\%}$) | 789 ($\textcolor{red}{68.23\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 149_306 ($\textcolor{red}{15.17\%}$) | 16_471 ($\textcolor{red}{8.17\%}$) | 1_847 ($\textcolor{red}{9.68\%}$) |
| Rust | 470_589 ($\textcolor{red}{6.14\%}$) | 44_509 ($\textcolor{red}{2.52\%}$) | 7_713 ($\textcolor{red}{2.88\%}$) |

Garbage Collection

Note: Same as main branch, skipping.

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 297_686 ($\textcolor{red}{13.79\%}$) | 719_907 ($\textcolor{red}{9.73\%}$) | 4_854 ($\textcolor{red}{8.86\%}$) | 5_288 ($\textcolor{red}{7.50\%}$) |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 166_785 ($\textcolor{red}{15.47\%}$) | 151_855 ($\textcolor{red}{15.66\%}$) | 15_317 ($\textcolor{red}{4.55\%}$) | 8_758 ($\textcolor{red}{3.57\%}$) | 11_042 ($\textcolor{red}{4.77\%}$) | 3_872 ($\textcolor{red}{5.53\%}$) |
| Rust | 511_787 ($\textcolor{red}{6.71\%}$) | 565_325 ($\textcolor{red}{6.73\%}$) | 52_965 ($\textcolor{red}{2.39\%}$) | 35_499 ($\textcolor{red}{2.62\%}$) | 76_563 ($\textcolor{red}{2.30\%}$) | 45_397 ($\textcolor{red}{2.34\%}$) |

github-actions[bot] commented 11 months ago

Note: The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance on a larger memory footprint.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
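The reproducibility points above (a fixed-seed generator and a deterministic hasher) can be sketched with the standard library alone. This is a minimal illustration, not the benchmark's code: the xorshift generator, the `SEED` value, and the `DetHashMap` alias are all hypothetical, and `BuildHasherDefault<DefaultHasher>` stands in for the fxhash hasher named in the notes. The std hasher is only stable within a given Rust release.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::BuildHasherDefault;

/// Tiny xorshift64 generator: the same seed always yields the same sequence.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hash map whose hasher starts from fixed keys instead of per-process random ones.
type DetHashMap = HashMap<u64, u64, BuildHasherDefault<DefaultHasher>>;

/// Hypothetical fixed seed; xorshift requires it to be non-zero.
const SEED: u64 = 42;

/// Generate the `n` keys that every collection under test receives.
fn workload(n: usize) -> Vec<u64> {
    let mut rng = XorShift64(SEED);
    (0..n).map(|_| rng.next_u64()).collect()
}

fn main() {
    let keys = workload(1_000);
    let mut hash = DetHashMap::default();
    let mut tree: BTreeMap<u64, u64> = BTreeMap::new();
    for (i, k) in keys.iter().enumerate() {
        hash.insert(*k, i as u64);
        tree.insert(*k, i as u64);
    }
    // Both collections hold exactly the same elements.
    assert_eq!(hash.len(), tree.len());
    // A second run with the same seed reproduces the workload.
    assert_eq!(keys, workload(1_000));
}
```

Because every collection is fed from `workload`, differences between rows in a table built this way come from the data structure and not from the input.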

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|---|
| hashmap | 159_194 | 7_588_838_102 | 61_987_732 | 318_951 | 6_008_157_311 | 341_403 |
| triemap | 162_066 | 12_638_993_478 | 74_216_052 | 250_401 | 611_116 | 600_960 |
| rbtree | 162_408 | 6_161_146_247 | 57_995_940 | 116_202 | 277_258 | 302_170 |
| splay | 157_707 | 12_285_354_629 | 53_995_876 | 589_148 | 619_660 | 870_643 |
| btree | 214_489 | 9_139_742_724 | 31_103_892 | 314_208 | 429_672 | 480_823 |
| zhenya_hashmap | 168_445 | 2_822_042_495 | 65_987_480 | 73_056 | 89_710 | 108_471 |
| btreemap_rs | 446_200 | 1_637_381_286 | 13_762_560 | 65_726 | 112_100 | 81_944 |
| imrc_hashmap_rs | 446_100 | 2_388_802_442 | 122_454_016 | 33_200 | 163_343 | 97_441 |
| hashmap_rs | 439_281 | 414_097_976 | 36_536_320 | 17_125 | 22_229 | 20_323 |
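The hashmap row stands out in the batch_put 50 column (6_008_157_311, against hundreds of thousands for most other rows), which matches the note that it copies the whole array once the initial capacity is reached. The toy container below sketches that effect under stated assumptions: `GrowArray`, its doubling growth factor, and the copy counter are hypothetical and are not the Motoko hashmap implementation.

```rust
/// Toy array-backed container that doubles its capacity when full
/// and reports how many elements each `put` had to copy.
struct GrowArray {
    items: Vec<u64>,
    capacity: usize,
}

impl GrowArray {
    fn with_capacity(capacity: usize) -> Self {
        GrowArray { items: Vec::with_capacity(capacity), capacity }
    }

    /// Append `value`; return the number of existing elements copied.
    fn put(&mut self, value: u64) -> usize {
        let mut copied = 0;
        if self.items.len() == self.capacity {
            // Full: allocate a larger array and copy everything over.
            let new_capacity = (self.capacity * 2).max(1);
            let mut bigger = Vec::with_capacity(new_capacity);
            bigger.extend_from_slice(&self.items);
            copied = self.items.len();
            self.items = bigger;
            self.capacity = new_capacity;
        }
        self.items.push(value);
        copied
    }
}

/// Fill to capacity, then return the copies made by each of `batch` further puts.
fn batch_put_copies(initial: usize, batch: usize) -> Vec<usize> {
    let mut arr = GrowArray::with_capacity(initial);
    for i in 0..initial {
        arr.put(i as u64);
    }
    (0..batch).map(|i| arr.put(i as u64)).collect()
}

fn main() {
    let costs = batch_put_copies(1_000, 50);
    println!("first put copied {} elements", costs[0]);
    println!("remaining puts copied {} elements", costs[1..].iter().sum::<usize>());
}
```

Averaged over many insertions the copying cost stays constant per element, but a batch that happens to cross the capacity boundary pays for the whole copy at once.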

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | |
|---|---|---|---|---|---|---|
| heap | 152_465 | 5_200_975_509 | 29_995_836 | 574_107 | 208_978 | 546_783 |
| heap_rs | 437_212 | 125_346_655 | 9_109_504 | 53_855 | 18_525 | 54_078 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---|---|---|---|---|---|
| buffer | 161_556 | 2_296_568 | 65_508 | 84_247 | 740_528 | 142_247 |
| vector | 160_836 | 1_890_122 | 24_764 | 137_383 | 179_360 | 181_415 |
| vec_rs | 435_769 | 266_276 | 655_360 | 13_221 | 25_648 | 20_782 |

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---|---|---|---|---|
| Motoko | 196_200 | 285_638_950 | 255_658_597 | 34_744 | 23_689 |
| Rust | 528_163 | 82_512_868 | 56_526_741 | 43_318 | 45_477 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---|---|---|---|---|
| Motoko | 205_806 | 5_050_097_115 | 3_429_924 | 598_393 | 361_240 |
| Rust | 469_882 | 6_232_657_596 | 1_081_344 | 988_880 | 290_927 |

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---|---|---|---|---|
| Motoko | 277_584 | 39_684 | 17_787 | 13_881 | 15_641 |
| Rust | 762_892 | 484_210 | 89_052 | 107_757 | 118_705 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---|---|---|---|
| Motoko | 229_941 | 13_217 | 23_778 | 5_267 |
| Rust | 828_098 | 128_336 | 332_435 | 79_221 |

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---|---|
| Motoko | 142_110 | 7_777 |
| Rust | 25_646 | 789 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---|---|---|
| Motoko | 149_306 | 16_471 | 1_847 |
| Rust | 470_589 | 44_509 | 7_713 |

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 800k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---|---|---|---|---|
| default | 1_012_258_537 | 59_396_776 | 50 | 50 | 50 |
| copying | 1_012_258_487 | 59_396_776 | 1_012_236_046 | 1_012_303_056 | 1_012_240_283 |
| compacting | 1_675_009_925 | 59_396_776 | 1_292_955_500 | 1_532_273_641 | 1_558_502_986 |
| generational | 2_498_146_508 | 59_405_240 | 977_578_983 | 1_044_991 | 960_405 |
| incremental | 32_320_754 | 1_136_155_048 | 290_257_785 | 292_951_006 | 292_977_552 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---|---|---|---|
| Map | 297_686 | 719_907 | 4_854 | 5_288 |

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---|---|---|---|---|---|
| Motoko | 166_785 | 151_855 | 15_317 | 8_758 | 11_042 | 3_872 |
| Rust | 511_787 | 565_325 | 52_965 | 35_499 | 76_563 | 45_397 |