dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0
Compare cost for the new metering #74

Closed chenyan-dfinity closed 1 year ago

github-actions[bot] commented 1 year ago

Note: Diffing the performance results against the published results from the main branch. Unchanged benchmarks are omitted.
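
The percentages in the tables below can be read as relative changes against the main-branch value. A minimal sketch, assuming the bot computes `(new - baseline) / baseline`; the function names are hypothetical and not taken from the repository:

```rust
/// Relative change of `new` against `baseline`, in percent.
/// Assumption: this mirrors how the diff percentages are derived.
fn percent_change(baseline: u64, new: u64) -> f64 {
    (new as f64 - baseline as f64) / baseline as f64 * 100.0
}

/// Approximate baseline implied by a reported value and its percentage change.
fn implied_baseline(new: u64, percent: f64) -> f64 {
    new as f64 / (1.0 + percent / 100.0)
}

fn main() {
    // Illustrative numbers, not benchmark data.
    println!("{:.2}%", percent_change(1_000, 1_310));

    // hashmap `generate 1m` in the Map table: 9_120_083_788 at +31.03%.
    let baseline = implied_baseline(9_120_083_788, 31.03);
    println!("implied main-branch value: about {baseline:.0}");
}
```

Because the reported percentages are rounded to two decimals, the implied baseline is only an estimate.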

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 133_772 ($\textcolor{green}{-0.04\%}$) | 9_120_083_788 ($\textcolor{red}{31.03\%}$) | 61_987_732 | 371_774 ($\textcolor{red}{29.33\%}$) | 7_046_555_134 ($\textcolor{red}{27.75\%}$) | 402_302 ($\textcolor{red}{30.21\%}$) |
| triemap | 135_185 ($\textcolor{green}{-0.10\%}$) | 17_215_517_319 ($\textcolor{red}{50.60\%}$) | 74_216_052 | 345_840 ($\textcolor{red}{55.25\%}$) | 836_239 ($\textcolor{red}{52.70\%}$) | 819_816 ($\textcolor{red}{52.10\%}$) |
| rbtree | 135_861 ($\textcolor{green}{-0.19\%}$) | 8_149_335_945 ($\textcolor{red}{36.29\%}$) | 57_995_940 | 116_820 ($\textcolor{red}{31.41\%}$) | 367_123 ($\textcolor{red}{36.70\%}$) | 396_460 ($\textcolor{red}{42.44\%}$) |
| splay | 131_698 ($\textcolor{green}{-0.13\%}$) | 17_095_892_890 ($\textcolor{red}{47.78\%}$) | 53_995_876 | 821_820 ($\textcolor{red}{48.90\%}$) | 865_787 ($\textcolor{red}{48.85\%}$) | 1_195_419 ($\textcolor{red}{47.54\%}$) |
| btree | 176_246 ($\textcolor{green}{-0.12\%}$) | 12_563_923_452 ($\textcolor{red}{52.77\%}$) | 31_103_892 | 424_386 ($\textcolor{red}{52.91\%}$) | 592_838 ($\textcolor{red}{54.32\%}$) | 659_333 ($\textcolor{red}{53.68\%}$) |
| zhenya_hashmap | 141_702 ($\textcolor{green}{-0.00\%}$) | 3_623_441_658 ($\textcolor{red}{37.61\%}$) | 65_987_480 | 92_003 ($\textcolor{red}{40.81\%}$) | 114_956 ($\textcolor{red}{43.42\%}$) | 137_180 ($\textcolor{red}{44.77\%}$) |
| btreemap_rs | 419_620 ($\textcolor{red}{1.49\%}$) | 1_818_061_882 ($\textcolor{red}{10.20\%}$) | 13_762_560 | 76_267 ($\textcolor{red}{14.15\%}$) | 127_354 ($\textcolor{red}{13.44\%}$) | 92_934 ($\textcolor{red}{14.36\%}$) |
| imrc_hashmap_rs | 419_303 ($\textcolor{red}{1.38\%}$) | 2_582_553_741 ($\textcolor{red}{8.25\%}$) | 122_454_016 | 39_485 ($\textcolor{red}{20.21\%}$) | 179_841 ($\textcolor{red}{10.53\%}$) | 122_297 ($\textcolor{red}{24.17\%}$) |
| hashmap_rs | 413_086 ($\textcolor{red}{1.72\%}$) | 437_767_476 ($\textcolor{red}{11.51\%}$) | 36_536_320 | 22_388 ($\textcolor{red}{35.70\%}$) | 27_813 ($\textcolor{red}{33.31\%}$) | 25_646 ($\textcolor{red}{28.40\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 |
|---|---:|---:|---:|---:|---:|
| heap | 127_702 ($\textcolor{green}{-0.04\%}$) | 7_251_557_847 ($\textcolor{red}{54.80\%}$) | 29_995_836 | 797_941 ($\textcolor{red}{56.00\%}$) | 294_639 ($\textcolor{red}{58.02\%}$) |
| heap_rs | 410_722 ($\textcolor{red}{1.68\%}$) | 140_171_516 ($\textcolor{red}{13.87\%}$) | 9_109_504 | 60_210 ($\textcolor{red}{12.92\%}$) | 23_974 ($\textcolor{red}{32.18\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---:|---:|---:|---:|---:|---:|
| buffer | 135_417 ($\textcolor{green}{-0.03\%}$) | 3_217_931 ($\textcolor{red}{54.51\%}$) | 65_508 | 126_168 ($\textcolor{red}{72.63\%}$) | 1_012_069 ($\textcolor{red}{50.71\%}$) | 199_668 ($\textcolor{red}{56.50\%}$) |
| vector | 133_898 ($\textcolor{green}{-0.00\%}$) | 2_920_678 ($\textcolor{red}{68.97\%}$) | 24_764 | 198_390 ($\textcolor{red}{63.67\%}$) | 278_928 ($\textcolor{red}{70.14\%}$) | 264_366 ($\textcolor{red}{63.59\%}$) |
| vec_rs | 409_517 ($\textcolor{red}{1.70\%}$) | 290_479 ($\textcolor{red}{9.24\%}$) | 655_360 | 17_874 ($\textcolor{red}{39.38\%}$) | 31_274 ($\textcolor{red}{23.84\%}$) | 26_410 ($\textcolor{red}{25.67\%}$) |

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---:|---:|---:|---:|---:|
| Motoko | 170_122 ($\textcolor{red}{0.01\%}$) | 341_183_942 ($\textcolor{red}{29.16\%}$) | 309_559_018 ($\textcolor{red}{31.67\%}$) | 47_134 ($\textcolor{red}{34.12\%}$) | 32_225 ($\textcolor{red}{38.60\%}$) |
| Rust | 497_788 ($\textcolor{red}{1.41\%}$) | 82_789_862 ($\textcolor{red}{0.34\%}$) | 56_794_773 ($\textcolor{red}{0.48\%}$) | 51_161 ($\textcolor{red}{20.67\%}$) | 54_143 ($\textcolor{red}{30.16\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---:|---:|---:|---:|---:|
| Motoko | 162_071 ($\textcolor{green}{-0.21\%}$) | 27_613_961_279 ($\textcolor{red}{48.62\%}$) | 3_429_924 | 3_286_019 ($\textcolor{red}{48.74\%}$) | 491_221 ($\textcolor{red}{49.87\%}$) |
| Rust | 441_351 ($\textcolor{red}{1.73\%}$) | 6_341_165_388 ($\textcolor{red}{2.16\%}$) | 1_081_344 | 1_009_424 ($\textcolor{red}{2.50\%}$) | 304_952 ($\textcolor{red}{5.58\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 225_908 ($\textcolor{red}{0.05\%}$) | 50_463 ($\textcolor{red}{34.59\%}$) | 24_244 ($\textcolor{red}{48.59\%}$) | 20_204 ($\textcolor{red}{59.66\%}$) | 21_570 ($\textcolor{red}{52.94\%}$) |
| Rust | 717_720 ($\textcolor{red}{1.82\%}$) | 555_159 ($\textcolor{red}{17.64\%}$) | 106_187 ($\textcolor{red}{22.72\%}$) | 131_091 ($\textcolor{red}{25.31\%}$) | 141_461 ($\textcolor{red}{22.20\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 183_999 ($\textcolor{red}{0.06\%}$) | 18_403 ($\textcolor{red}{51.08\%}$) | 31_180 ($\textcolor{red}{39.70\%}$) | 9_225 ($\textcolor{red}{95.86\%}$) |
| Rust | 777_672 ($\textcolor{red}{1.43\%}$) | 146_963 ($\textcolor{red}{17.54\%}$) | 384_008 ($\textcolor{red}{18.34\%}$) | 95_147 ($\textcolor{red}{23.38\%}$) |

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 118_910 ($\textcolor{red}{0.00\%}$) | 23_725 ($\textcolor{red}{220.96\%}$) |
| Rust | 23_573 ($\textcolor{green}{-0.53\%}$) | 1_195 ($\textcolor{red}{51.07\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 125_155 ($\textcolor{green}{-0.01\%}$) | 53_493 ($\textcolor{red}{251.74\%}$) | 4_819 ($\textcolor{red}{187.02\%}$) |
| Rust | 442_874 ($\textcolor{red}{1.85\%}$) | 70_359 ($\textcolor{red}{61.60\%}$) | 11_528 ($\textcolor{red}{50.05\%}$) |

Garbage Collection

| | generate 800k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 1_491_263_035 ($\textcolor{red}{47.32\%}$) | 59_396_776 | 123 ($\textcolor{red}{146.00\%}$) | 123 ($\textcolor{red}{146.00\%}$) | 123 ($\textcolor{red}{146.00\%}$) |
| copying | 1_491_262_912 ($\textcolor{red}{47.32\%}$) | 59_396_776 | 1_490_940_704 ($\textcolor{red}{47.29\%}$) | 1_491_039_746 ($\textcolor{red}{47.29\%}$) | 1_490_946_844 ($\textcolor{red}{47.29\%}$) |
| compacting | 2_007_811_833 ($\textcolor{red}{19.87\%}$) | 59_396_776 | 1_551_336_811 ($\textcolor{red}{19.98\%}$) | 1_851_015_361 ($\textcolor{red}{20.80\%}$) | 1_883_761_399 ($\textcolor{red}{20.87\%}$) |
| generational | 3_073_273_610 ($\textcolor{red}{22.10\%}$) | 59_405_240 | 1_200_499_985 ($\textcolor{red}{22.80\%}$) | 1_296_067 ($\textcolor{red}{23.11\%}$) | 1_189_105 ($\textcolor{red}{22.92\%}$) |
| incremental | 33_437_135 ($\textcolor{red}{3.45\%}$) | 1_136_153_832 | 355_581_986 ($\textcolor{red}{22.51\%}$) | 358_878_332 ($\textcolor{red}{22.50\%}$) | 358_911_360 ($\textcolor{red}{22.50\%}$) |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---:|---:|---:|---:|
| Map | 254_149 ($\textcolor{red}{0.03\%}$) | 702_838 ($\textcolor{red}{10.06\%}$) | 16_754 ($\textcolor{red}{276.58\%}$) | 17_269 ($\textcolor{red}{251.78\%}$) |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 139_908 ($\textcolor{red}{0.02\%}$) | 126_847 ($\textcolor{red}{0.02\%}$) | 29_430 ($\textcolor{red}{101.01\%}$) | 12_477 ($\textcolor{red}{47.64\%}$) | 23_710 ($\textcolor{red}{125.17\%}$) | 6_660 ($\textcolor{red}{81.87\%}$) |
| Rust | 479_139 ($\textcolor{red}{1.48\%}$) | 529_168 ($\textcolor{red}{1.78\%}$) | 72_411 ($\textcolor{red}{40.36\%}$) | 44_728 ($\textcolor{red}{29.04\%}$) | 96_875 ($\textcolor{red}{30.61\%}$) | 54_605 ($\textcolor{red}{31.21\%}$) |

github-actions[bot] commented 1 year ago

Note: The flamegraph links only work after the PR is merged. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, so the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and the cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance with a larger memory footprint.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
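
The note on hashmap's amortized cost can be illustrated with a toy growable array. This is a sketch only: the doubling policy and the `DoublingArray` type are assumptions made for illustration, not the Motoko hashmap's actual implementation. It shows why a single insertion that crosses the capacity boundary pays for copying every existing element.

```rust
/// Toy growable array that doubles its capacity when full
/// and reports how many elements each insertion had to copy.
struct DoublingArray {
    data: Vec<u64>,
    capacity: usize,
    /// Total number of element copies caused by growth so far.
    copies: usize,
}

impl DoublingArray {
    fn with_capacity(capacity: usize) -> Self {
        Self { data: Vec::with_capacity(capacity), capacity, copies: 0 }
    }

    /// Appends `value` and returns the number of elements copied by this call.
    fn push(&mut self, value: u64) -> usize {
        let mut copied = 0;
        if self.data.len() == self.capacity {
            // Full: allocate a buffer twice as large and copy everything over.
            let new_capacity = (self.capacity * 2).max(1);
            let mut grown = Vec::with_capacity(new_capacity);
            grown.extend_from_slice(&self.data);
            copied = self.data.len();
            self.data = grown;
            self.capacity = new_capacity;
        }
        self.data.push(value);
        self.copies += copied;
        copied
    }
}

fn main() {
    let mut arr = DoublingArray::with_capacity(4);
    let costs: Vec<usize> = (0..9).map(|i| arr.push(i)).collect();
    // Most insertions copy nothing; the ones that trigger growth copy the whole array.
    println!("per-insert copies: {costs:?}, total: {}", arr.copies);
}
```

Averaged over many insertions the copying cost stays constant per element, but a short batch that happens to trigger growth on a large collection absorbs the full copy, which matches the pattern the note describes for batch_put 50.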

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 133_772 | 9_120_083_788 | 61_987_732 | 371_774 | 7_046_555_134 | 402_302 |
| triemap | 135_185 | 17_215_517_319 | 74_216_052 | 345_840 | 836_239 | 819_816 |
| rbtree | 135_861 | 8_149_335_945 | 57_995_940 | 116_820 | 367_123 | 396_460 |
| splay | 131_698 | 17_095_892_890 | 53_995_876 | 821_820 | 865_787 | 1_195_419 |
| btree | 176_246 | 12_563_923_452 | 31_103_892 | 424_386 | 592_838 | 659_333 |
| zhenya_hashmap | 141_702 | 3_623_441_658 | 65_987_480 | 92_003 | 114_956 | 137_180 |
| btreemap_rs | 419_620 | 1_818_061_882 | 13_762_560 | 76_267 | 127_354 | 92_934 |
| imrc_hashmap_rs | 419_303 | 2_582_553_741 | 122_454_016 | 39_485 | 179_841 | 122_297 |
| hashmap_rs | 413_086 | 437_767_476 | 36_536_320 | 22_388 | 27_813 | 25_646 |
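
The note on hashmap_rs mentions swapping in a deterministic hasher so results are reproducible. A minimal sketch of that idea using only the standard library: the `Fnv1a` hasher below is a hypothetical stand-in (64-bit FNV-1a), not the algorithm used by the fxhash crate, and `DeterministicMap` is a name invented for this example. The default `std::collections::HashMap` uses a randomly seeded `RandomState`, so the same key can hash differently from run to run.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical fixed-seed hasher (64-bit FNV-1a); stands in for fxhash here.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis: every hasher starts from the same state.
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A std HashMap whose hasher has no random seed.
type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hashes `key` with a freshly built deterministic hasher.
fn hash_key<K: Hash>(key: &K) -> u64 {
    let mut hasher = BuildHasherDefault::<Fnv1a>::default().build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let mut map: DeterministicMap<u64, u64> = DeterministicMap::default();
    for i in 0..50 {
        map.insert(i, i * i);
    }
    // The same key always produces the same hash, across hashers and across runs.
    println!("stable hash: {}", hash_key(&42u64) == hash_key(&42u64));
    println!("lookup: {:?}", map.get(&7));
}
```

With a fixed hash function, the same sequence of operations touches the same buckets on every run, which is what makes instruction counts comparable between benchmark runs.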

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 | |
|---|---:|---:|---:|---:|---:|---:|
| heap | 127_702 | 7_251_557_847 | 29_995_836 | 797_941 | 294_639 | 760_200 |
| heap_rs | 410_722 | 140_171_516 | 9_109_504 | 60_210 | 23_974 | 60_423 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---:|---:|---:|---:|---:|---:|
| buffer | 135_417 | 3_217_931 | 65_508 | 126_168 | 1_012_069 | 199_668 |
| vector | 133_898 | 2_920_678 | 24_764 | 198_390 | 278_928 | 264_366 |
| vec_rs | 409_517 | 290_479 | 655_360 | 17_874 | 31_274 | 26_410 |

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---:|---:|---:|---:|---:|
| Motoko | 170_122 | 341_183_942 | 309_559_018 | 47_134 | 32_225 |
| Rust | 497_788 | 82_789_862 | 56_794_773 | 51_161 | 54_143 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---:|---:|---:|---:|---:|
| Motoko | 162_071 | 27_613_961_279 | 3_429_924 | 3_286_019 | 491_221 |
| Rust | 441_351 | 6_341_165_388 | 1_081_344 | 1_009_424 | 304_952 |

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators, but it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 225_908 | 50_463 | 24_244 | 20_204 | 21_570 |
| Rust | 717_720 | 555_159 | 106_187 | 131_091 | 141_461 |
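
To make the size of the Motoko/Rust gap concrete, the per-method ratios can be computed directly from the Basic DAO figures. This sketch only quantifies the gap; it does not by itself show how much of it comes from Candid serialization. The `ratio` helper is a name chosen for this example.

```rust
/// Cost of the Rust canister relative to the Motoko canister for one method.
fn ratio(rust: u64, motoko: u64) -> f64 {
    rust as f64 / motoko as f64
}

fn main() {
    // (method, Motoko cost, Rust cost), copied from the Basic DAO table.
    let rows = [
        ("init", 50_463u64, 555_159u64),
        ("transfer_token", 24_244, 106_187),
        ("submit_proposal", 20_204, 131_091),
        ("vote_proposal", 21_570, 141_461),
    ];
    for (method, motoko, rust) in rows {
        println!("{method}: Rust/Motoko = {:.1}x", ratio(rust, motoko));
    }
}
```

For these small endpoints the Rust canister costs several times more than the Motoko one, with init showing the largest ratio.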

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 183_999 | 18_403 | 31_180 | 9_225 |
| Rust | 777_672 | 146_963 | 384_008 | 95_147 |

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 118_910 | 23_725 |
| Rust | 23_573 | 1_195 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 125_155 | 53_493 | 4_819 |
| Rust | 442_874 | 70_359 | 11_528 |

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection

| | generate 800k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 1_491_263_035 | 59_396_776 | 123 | 123 | 123 |
| copying | 1_491_262_912 | 59_396_776 | 1_490_940_704 | 1_491_039_746 | 1_490_946_844 |
| compacting | 2_007_811_833 | 59_396_776 | 1_551_336_811 | 1_851_015_361 | 1_883_761_399 |
| generational | 3_073_273_610 | 59_405_240 | 1_200_499_985 | 1_296_067 | 1_189_105 |
| incremental | 33_437_135 | 1_136_153_832 | 355_581_986 | 358_878_332 | 358_911_360 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---:|---:|---:|---:|
| Map | 254_149 | 702_838 | 16_754 | 17_269 |

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 139_908 | 126_847 | 29_430 | 12_477 | 23_710 | 6_660 |
| Rust | 479_139 | 529_168 | 72_411 | 44_728 | 96_875 | 54_605 |