dfinity / canister-profiling

Collection of canister performance benchmarks
Apache License 2.0

Profile wasm-opt level 2 #36

kentosugama closed 1 year ago

github-actions[bot] commented 1 year ago

Warning: The flamegraph links only work after you merge.

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 121_920 | 11_354 |
| Rust | 27_253 | 1_080 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 135_637 | 32_210 | 1_748 |
| Rust | 424_735 | 53_825 | 10_153 |

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with a fixed seed so that all collections contain the same elements and the queries are exactly the same. Below we explain the measurements of each column in the table:

💎 Takeaways

Note

  • The Candid interface of the benchmark is minimal, so the serialization cost is negligible in this measurement.
  • Because of the instrumentation overhead and the cycle limit, we cannot profile computations on large collections. Once deterministic time slicing is ready, we hope to measure performance with a larger memory footprint.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
  • hashmap_rs uses the fxhash crate, which provides the same map as std::collections::HashMap but with a deterministic hasher. This ensures reproducible results.
  • btree comes from Byron Becker's stable BTreeMap library.
  • zhenya_hashmap comes from Zhenya Usenko's stable HashMap library.
  • The MoVM table measures the performance of an experimental implementation of a Motoko interpreter. External developers can ignore this table for now.
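
The hashmap note above can be illustrated with a small cost model. This is a minimal sketch, assuming a table that doubles its capacity and copies every existing entry once it is full. The doubling factor, the function name `resize_copies`, and the capacities used are illustrative assumptions, not details taken from the Motoko hashmap implementation.

```python
def resize_copies(capacity: int, size: int, batch: int) -> int:
    """Count entries copied by resizes while inserting `batch` new entries.

    Assumes (hypothetically) that the table doubles its capacity whenever
    it is full and copies every existing entry into the new array.
    """
    copies = 0
    for _ in range(batch):
        if size == capacity:
            copies += size      # the whole array is copied on a resize
            capacity *= 2
        size += 1
    return copies


# A batch of 50 puts that hits a full table pays for copying all entries...
full = resize_copies(capacity=50_000, size=50_000, batch=50)
# ...while the same batch on a table with spare room copies nothing.
roomy = resize_copies(capacity=100_000, size=50_000, batch=50)
```

Under this model a single resize dominates the whole batch, which is consistent with the Map table reporting 613_910_550 for hashmap's batch_put 50 while the other Motoko maps stay between 1_454_771 and 1_976_076.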

Map

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 162_667 | 2_112_036_256 | 9_102_052 | 1_124_437 | 613_910_550 | 1_065_165 |
| triemap | 166_893 | 2_036_258_951 | 9_716_008 | 781_608 | 1_868_829 | 1_042_928 |
| rbtree | 165_811 | 1_895_737_241 | 10_102_184 | 709_153 | 1_705_707 | 946_478 |
| splay | 163_998 | 2_105_962_355 | 9_302_108 | 1_155_004 | 1_976_076 | 1_155_170 |
| btree | 194_059 | 1_934_621_105 | 8_157_968 | 883_615 | 1_773_826 | 956_678 |
| zhenya_hashmap | 157_997 | 1_648_201_768 | 9_301_800 | 649_931 | 1_454_771 | 654_626 |
| btreemap_rs | 422_180 | 122_486_096 | 1_638_400 | 60_360 | 138_077 | 61_505 |
| hashmap_rs | 411_353 | 52_292_656 | 1_835_008 | 20_648 | 62_156 | 21_930 |

Priority queue

| | binary_size | heapify 50k | mem | pop_min 50 | put 50 | |
|---|---:|---:|---:|---:|---:|---:|
| heap | 150_084 | 695_512_639 | 1_400_024 | 377_593 | 727_475 | 379_121 |
| heap_rs | 390_095 | 4_978_816 | 819_200 | 52_862 | 21_741 | 53_054 |

MoVM

| | binary_size | generate 10k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 162_667 | 422_411_986 | 1_820_844 | 1_122_416 | 123_656_843 | 1_062_614 |
| hashmap_rs | 411_353 | 10_772_672 | 950_272 | 19_979 | 61_484 | 20_872 |
| imrc_hashmap_rs | 420_546 | 19_678_328 | 1_572_864 | 31_170 | 118_660 | 37_893 |
| movm_rs | 1_704_006 | 1_097_944_086 | 2_654_208 | 2_716_088 | 6_930_281 | 5_411_168 |
| movm_dynamic_rs | 1_868_114 | 551_513_510 | 2_129_920 | 2_170_192 | 2_980_021 | 2_146_546 |

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 142_143 | 129_663 | 18_768 | 8_564 | 14_734 | 3_720 |
| Rust | 459_879 | 513_055 | 61_293 | 41_686 | 86_682 | 48_938 |

Sample Dapps

Measure the performance of some typical dapps:

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve performance on the Rust side by using parser combinators, but it is a challenge to maintain the ergonomics that serde provides.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 232_614 | 41_548 | 18_372 | 12_873 | 15_115 |
| Rust | 734_143 | 526_210 | 99_189 | 122_212 | 134_838 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 185_377 | 12_292 | 23_080 | 4_818 |
| Rust | 775_133 | 142_951 | 370_840 | 92_651 |
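
To put the Sample Dapps notes in numbers, the sketch below computes how many times larger each Rust figure is than the matching Motoko figure. The values are copied from the Basic DAO and DIP721 NFT tables; the helper name `rust_to_motoko` and the dictionary layout are illustrative choices, not part of the benchmark tooling.

```python
# Cost figures copied from the Basic DAO and DIP721 NFT tables: (Motoko, Rust).
BASIC_DAO = {
    "init": (41_548, 526_210),
    "transfer_token": (18_372, 99_189),
    "submit_proposal": (12_873, 122_212),
    "vote_proposal": (15_115, 134_838),
}
DIP721_NFT = {
    "init": (12_292, 142_951),
    "mint_token": (23_080, 370_840),
    "transfer_token": (4_818, 92_651),
}


def rust_to_motoko(table: dict) -> dict:
    """Return, per method, how many times larger the Rust figure is."""
    return {method: rust / motoko for method, (motoko, rust) in table.items()}


dao_ratios = rust_to_motoko(BASIC_DAO)
nft_ratios = rust_to_motoko(DIP721_NFT)
for name, ratios in (("Basic DAO", dao_ratios), ("DIP721 NFT", nft_ratios)):
    for method, ratio in ratios.items():
        print(f"{name:10} {method:16} {ratio:5.1f}x")
```

Every ratio is above 1, and the largest gap is on DIP721 NFT transfer_token, where the Rust figure is roughly 19 times the Motoko one.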

Motoko Garbage Collection

Measure Motoko garbage collection cost using the Triemap benchmark. The max mem column reports rts_max_live_size after the generate call. The cycle cost numbers reported here are the garbage collection cost only. Some flamegraphs are truncated due to the 2M log size limit.

| | generate 80k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 247_115_881 | 15_539_984 | 50 | 50 | 50 |
| copying | 247_115_831 | 15_539_984 | 247_110_319 | 247_262_382 | 247_262_501 |
| compacting | 409_365_425 | 15_539_984 | 308_339_012 | 348_775_445 | 352_663_118 |
| generational | 624_423_107 | 15_540_260 | 57_009 | 1_390_483 | 1_060_163 |
github-actions[bot] commented 1 year ago

Note: Diffing the performance results against the published results from the main branch.
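
The percentages in this report can be read as a relative change against the main-branch value. The sketch below assumes the usual formula, (current - baseline) / baseline; the bot's exact computation is not shown in the source, and the baseline of 587 used in the check is a hypothetical value chosen only because it reproduces the reported 83.99%, not a figure from the published results.

```python
def percent_change(baseline: int, current: int) -> float:
    """Relative change of `current` against `baseline`, in percent."""
    return (current - baseline) / baseline * 100


def implied_baseline(current: int, percent: float) -> float:
    """Invert percent_change: the baseline that would yield `percent`."""
    return current / (1 + percent / 100)


# Rust heartbeat: the report shows 1_080 with a change of 83.99%.
baseline_estimate = implied_baseline(1_080, 83.99)   # roughly 587
# Motoko heartbeat binary_size: 121_920 with a change of -17.13%.
size_estimate = implied_baseline(121_920, -17.13)
```

Read this way, a large percentage on a small number, such as the Rust heartbeat, corresponds to a difference of only a few hundred units.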

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 121_920 ($\textcolor{green}{-17.13\%}$) | 11_354 ($\textcolor{green}{-4.89\%}$) |
| Rust | 27_253 ($\textcolor{green}{-23.55\%}$) | 1_080 ($\textcolor{red}{83.99\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 135_637 ($\textcolor{green}{-16.76\%}$) | 32_210 ($\textcolor{green}{-7.01\%}$) | 1_748 ($\textcolor{green}{-9.10\%}$) |
| Rust | 424_735 ($\textcolor{green}{-19.20\%}$) | 53_825 ($\textcolor{green}{-3.55\%}$) | 10_153 ($\textcolor{green}{-3.68\%}$) |

Map

| | binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 162_667 ($\textcolor{green}{-16.84\%}$) | 2_112_036_256 ($\textcolor{green}{-11.52\%}$) | 9_102_052 | 1_124_437 ($\textcolor{green}{-13.06\%}$) | 613_910_550 ($\textcolor{green}{-10.92\%}$) | 1_065_165 ($\textcolor{green}{-13.06\%}$) |
| triemap | 166_893 ($\textcolor{green}{-17.13\%}$) | 2_036_258_951 ($\textcolor{green}{-11.06\%}$) | 9_716_008 | 781_608 ($\textcolor{green}{-12.48\%}$) | 1_868_829 ($\textcolor{green}{-11.65\%}$) | 1_042_928 ($\textcolor{green}{-12.47\%}$) |
| rbtree | 165_811 ($\textcolor{green}{-16.92\%}$) | 1_895_737_241 ($\textcolor{green}{-10.48\%}$) | 10_102_184 | 709_153 ($\textcolor{green}{-13.90\%}$) | 1_705_707 ($\textcolor{green}{-11.04\%}$) | 946_478 ($\textcolor{green}{-12.52\%}$) |
| splay | 163_998 ($\textcolor{green}{-17.02\%}$) | 2_105_962_355 ($\textcolor{green}{-10.76\%}$) | 9_302_108 | 1_155_004 ($\textcolor{green}{-11.53\%}$) | 1_976_076 ($\textcolor{green}{-11.21\%}$) | 1_155_170 ($\textcolor{green}{-11.58\%}$) |
| btree | 194_059 ($\textcolor{green}{-17.52\%}$) | 1_934_621_105 ($\textcolor{green}{-10.84\%}$) | 8_157_968 | 883_615 ($\textcolor{green}{-12.24\%}$) | 1_773_826 ($\textcolor{green}{-11.31\%}$) | 956_678 ($\textcolor{green}{-12.19\%}$) |
| zhenya_hashmap | 157_997 ($\textcolor{green}{-16.46\%}$) | 1_648_201_768 ($\textcolor{green}{-11.16\%}$) | 9_301_800 | 649_931 ($\textcolor{green}{-12.91\%}$) | 1_454_771 ($\textcolor{green}{-11.92\%}$) | 654_626 ($\textcolor{green}{-13.02\%}$) |
| btreemap_rs | 422_180 ($\textcolor{green}{-18.20\%}$) | 122_486_096 ($\textcolor{green}{-1.06\%}$) | 1_638_400 | 60_360 ($\textcolor{red}{1.07\%}$) | 138_077 ($\textcolor{green}{-1.56\%}$) | 61_505 ($\textcolor{green}{-0.94\%}$) |
| hashmap_rs | 411_353 ($\textcolor{green}{-18.41\%}$) | 52_292_656 ($\textcolor{green}{-1.77\%}$) | 1_835_008 | 20_648 ($\textcolor{green}{-3.34\%}$) | 62_156 ($\textcolor{green}{-2.57\%}$) | 21_930 ($\textcolor{green}{-3.72\%}$) |

Priority queue

| | binary_size | heapify 50k | mem | pop_min 50 | put 50 |
|---|---:|---:|---:|---:|---:|
| heap | 150_084 ($\textcolor{green}{-16.80\%}$) | 695_512_639 ($\textcolor{green}{-12.67\%}$) | 1_400_024 | 377_593 ($\textcolor{green}{-10.27\%}$) | 727_475 ($\textcolor{green}{-12.82\%}$) |
| heap_rs | 390_095 ($\textcolor{green}{-17.90\%}$) | 4_978_816 ($\textcolor{green}{-1.25\%}$) | 819_200 | 52_862 ($\textcolor{green}{-1.31\%}$) | 21_741 ($\textcolor{green}{-2.42\%}$) |

MoVM

| | binary_size | generate 10k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 162_667 ($\textcolor{green}{-16.84\%}$) | 422_411_986 ($\textcolor{green}{-11.53\%}$) | 1_820_844 | 1_122_416 ($\textcolor{green}{-13.09\%}$) | 123_656_843 ($\textcolor{green}{-10.96\%}$) | 1_062_614 ($\textcolor{green}{-13.08\%}$) |
| hashmap_rs | 411_353 ($\textcolor{green}{-18.41\%}$) | 10_772_672 ($\textcolor{green}{-1.75\%}$) | 950_272 | 19_979 ($\textcolor{green}{-3.37\%}$) | 61_484 ($\textcolor{green}{-2.56\%}$) | 20_872 ($\textcolor{green}{-3.67\%}$) |
| imrc_hashmap_rs | 420_546 ($\textcolor{green}{-18.59\%}$) | 19_678_328 ($\textcolor{green}{-0.92\%}$) | 1_572_864 | 31_170 ($\textcolor{green}{-2.04\%}$) | 118_660 ($\textcolor{green}{-1.29\%}$) | 37_893 ($\textcolor{green}{-0.07\%}$) |
| movm_rs | 1_704_006 ($\textcolor{green}{-16.27\%}$) | 1_097_944_086 ($\textcolor{green}{-0.08\%}$) | 2_654_208 | 2_716_088 ($\textcolor{green}{-1.02\%}$) | 6_930_281 ($\textcolor{green}{-0.19\%}$) | 5_411_168 ($\textcolor{green}{-0.10\%}$) |
| movm_dynamic_rs | 1_868_114 ($\textcolor{green}{-17.04\%}$) | 551_513_510 ($\textcolor{green}{-0.73\%}$) | 2_129_920 | 2_170_192 ($\textcolor{green}{-0.76\%}$) | 2_980_021 ($\textcolor{green}{-1.00\%}$) | 2_146_546 ($\textcolor{green}{-0.90\%}$) |

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 142_143 ($\textcolor{green}{-17.24\%}$) | 129_663 ($\textcolor{green}{-17.36\%}$) | 18_768 ($\textcolor{green}{-4.45\%}$) | 8_564 ($\textcolor{green}{-6.35\%}$) | 14_734 ($\textcolor{green}{-5.16\%}$) | 3_720 ($\textcolor{green}{-7.02\%}$) |
| Rust | 459_879 ($\textcolor{green}{-18.46\%}$) | 513_055 ($\textcolor{green}{-26.31\%}$) | 61_293 ($\textcolor{green}{-3.33\%}$) | 41_686 ($\textcolor{green}{-3.48\%}$) | 86_682 ($\textcolor{green}{-3.02\%}$) | 48_938 ($\textcolor{green}{-3.18\%}$) |

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 232_614 ($\textcolor{green}{-20.08\%}$) | 41_548 ($\textcolor{green}{-6.94\%}$) | 18_372 ($\textcolor{green}{-8.50\%}$) | 12_873 ($\textcolor{green}{-9.09\%}$) | 15_115 ($\textcolor{green}{-10.15\%}$) |
| Rust | 734_143 ($\textcolor{green}{-22.27\%}$) | 526_210 ($\textcolor{green}{-2.91\%}$) | 99_189 ($\textcolor{green}{-3.20\%}$) | 122_212 ($\textcolor{green}{-2.91\%}$) | 134_838 ($\textcolor{green}{-2.86\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 185_377 ($\textcolor{green}{-21.20\%}$) | 12_292 ($\textcolor{green}{-8.12\%}$) | 23_080 ($\textcolor{green}{-6.48\%}$) | 4_818 ($\textcolor{green}{-10.06\%}$) |
| Rust | 775_133 ($\textcolor{green}{-22.34\%}$) | 142_951 ($\textcolor{green}{-2.85\%}$) | 370_840 ($\textcolor{green}{-2.72\%}$) | 92_651 ($\textcolor{green}{-2.95\%}$) |

Motoko Garbage Collection

| | generate 80k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 247_115_881 | 15_539_984 | 50 | 50 | 50 |
| copying | 247_115_831 | 15_539_984 | 247_110_319 | 247_262_382 | 247_262_501 |
| compacting | 409_365_425 | 15_539_984 | 308_339_012 | 348_775_445 | 352_663_118 |
| generational | 624_423_107 | 15_540_260 | 57_009 | 1_390_483 | 1_060_163 |