Note: Diffing the performance results against the published results from the main branch. Unchanged benchmarks are omitted.

Warning: Skipping table 0 (## Map) from _out/collections/README.md due to table shape mismatches with the main branch.

Priority queue

	binary_size	heapify 1m	max mem	pop_min 50	put 50	pop_min 50	upgrade
heap	166_988 ($\textcolor{red}{0.05\%}$)	5_554_617_018	24_000_360	621_690	227_224	592_588	3_189_831_485
heap_rs	576_607 ($\textcolor{red}{6.65\%}$)	144_262_092 ($\textcolor{red}{3.29\%}$)	18_284_544	60_983 ($\textcolor{red}{9.13\%}$)	21_544 ($\textcolor{red}{1.28\%}$)	60_888 ($\textcolor{red}{9.19\%}$)	642_924_555 ($\textcolor{green}{-0.92\%}$)

Growable array

	binary_size	generate 5k	max mem	batch_get 500	batch_put 500	batch_remove 500	upgrade
buffer	173_988 ($\textcolor{red}{0.05\%}$)	2_601_059	65_644	95_506	803_474	173_506	3_091_310
vector	172_017 ($\textcolor{red}{0.05\%}$)	1_952_689	24_580	126_130	186_485	176_123	4_675_192
vec_rs	574_751 ($\textcolor{red}{6.73\%}$)	287_152 ($\textcolor{red}{0.13\%}$)	1_376_256	15_975 ($\textcolor{red}{1.66\%}$)	29_064 ($\textcolor{red}{0.94\%}$)	21_824 ($\textcolor{red}{1.26\%}$)	3_717_375 ($\textcolor{green}{-2.32\%}$)

Stable structures

	binary_size	generate 50k	max mem	batch_get 50	batch_put 50	batch_remove 50	upgrade
btreemap_rs	592_800 ($\textcolor{red}{6.63\%}$)	76_934_678 ($\textcolor{red}{1.02\%}$)	2_555_904	63_128 ($\textcolor{red}{0.44\%}$)	95_879 ($\textcolor{red}{0.80\%}$)	83_525 ($\textcolor{green}{-0.62\%}$)	138_195_760 ($\textcolor{green}{-1.00\%}$)
btreemap_stable_rs	599_653 ($\textcolor{red}{7.45\%}$)	4_737_161_366 ($\textcolor{red}{3.79\%}$)	2_031_616	2_865_151 ($\textcolor{red}{5.74\%}$)	5_225_233 ($\textcolor{red}{3.85\%}$)	8_806_609 ($\textcolor{red}{2.63\%}$)	729_745 ($\textcolor{red}{0.05\%}$)
heap_rs	576_607 ($\textcolor{red}{6.65\%}$)	7_279_802 ($\textcolor{red}{3.27\%}$)	2_293_760	52_442 ($\textcolor{red}{8.38\%}$)	21_792 ($\textcolor{red}{1.27\%}$)	52_157 ($\textcolor{red}{8.40\%}$)	33_332_862 ($\textcolor{green}{-0.88\%}$)
heap_stable_rs	560_468 ($\textcolor{red}{7.54\%}$)	281_770_028 ($\textcolor{red}{1.41\%}$)	458_752	2_447_410 ($\textcolor{red}{1.74\%}$)	246_056 ($\textcolor{red}{1.33\%}$)	2_428_931 ($\textcolor{red}{1.74\%}$)	729_696 ($\textcolor{red}{0.05\%}$)
vec_rs	574_751 ($\textcolor{red}{6.73\%}$)	3_077_562 ($\textcolor{red}{0.01\%}$)	2_293_760	15_975 ($\textcolor{red}{1.66\%}$)	16_914 ($\textcolor{red}{1.63\%}$)	16_212 ($\textcolor{red}{1.70\%}$)	30_403_059 ($\textcolor{green}{-2.87\%}$)
vec_stable_rs	553_418 ($\textcolor{red}{6.15\%}$)	60_342_671 ($\textcolor{green}{-4.74\%}$)	458_752	66_073 ($\textcolor{red}{1.85\%}$)	76_095 ($\textcolor{green}{-3.44\%}$)	83_515 ($\textcolor{green}{-0.78\%}$)	729_687 ($\textcolor{red}{0.05\%}$)

Statistics

binary_size: 4.97% [3.23%, 6.71%]
max_mem: no change
cycles: 1.55% [0.71%, 2.39%]

SHA-2

	binary_size	SHA-256	SHA-512	account_id	neuron_id
Motoko	193_771 ($\textcolor{red}{0.04\%}$)	267_743_355	247_834_501	33_636	24_532
Rust	575_720 ($\textcolor{red}{6.24\%}$)	82_782_549 ($\textcolor{red}{0.00\%}$)	56_788_101 ($\textcolor{red}{0.00\%}$)	42_415 ($\textcolor{red}{3.34\%}$)	40_810 ($\textcolor{red}{2.16\%}$)

Certified map

	binary_size	generate 10k	max mem	inc	witness	upgrade
Motoko	243_726 ($\textcolor{red}{0.03\%}$)	4_666_119_661	3_430_044	553_629	407_936	274_434_719
Rust	619_546 ($\textcolor{red}{6.73\%}$)	6_409_499_600 ($\textcolor{red}{0.00\%}$)	2_228_224	1_020_428 ($\textcolor{red}{0.01\%}$)	299_393 ($\textcolor{red}{0.40\%}$)	6_025_596_995 ($\textcolor{green}{-0.01\%}$)

Statistics

binary_size: 3.26% [-1.12%, 7.65%]
max_mem: no change
cycles: 0.74% [-0.13%, 1.60%]

Basic DAO

	binary_size	init	transfer_token	submit_proposal	vote_proposal	upgrade
Motoko	274_023 ($\textcolor{red}{0.03\%}$)	510_825	22_316	18_599 ($\textcolor{red}{0.01\%}$)	19_635 ($\textcolor{green}{-0.15\%}$)	157_946 ($\textcolor{green}{-0.01\%}$)
Rust	888_398 ($\textcolor{red}{4.82\%}$)	515_016 ($\textcolor{red}{1.68\%}$)	91_701 ($\textcolor{red}{3.11\%}$)	117_771 ($\textcolor{red}{0.35\%}$)	112_893 ($\textcolor{red}{0.49\%}$)	1_490_782 ($\textcolor{red}{0.50\%}$)

DIP721 NFT

	binary_size	init	mint_token	transfer_token	upgrade
Motoko	220_488 ($\textcolor{red}{0.04\%}$)	481_158	30_204	8_764	89_833
Rust	920_591 ($\textcolor{red}{4.62\%}$)	203_585 ($\textcolor{red}{0.04\%}$)	302_804 ($\textcolor{green}{-0.08\%}$)	68_166 ($\textcolor{green}{-4.31\%}$)	1_617_697 ($\textcolor{green}{-0.75\%}$)

Statistics

binary_size: 2.38% [-0.81%, 5.56%]
max_mem: no change
cycles: 0.07% [-0.81%, 0.96%]

Heartbeat

	binary_size	heartbeat
Motoko	137_268 ($\textcolor{red}{0.06\%}$)	19_507
Rust	26_044 ($\textcolor{red}{10.09\%}$)	1_191 ($\textcolor{red}{148.12\%}$)

Timer

	binary_size	setTimer	cancelTimer
Motoko	145_704 ($\textcolor{red}{0.06\%}$)	51_778	4_626
Rust	535_525 ($\textcolor{red}{6.52\%}$)	64_193 ($\textcolor{red}{1.28\%}$)	11_797 ($\textcolor{red}{1.04\%}$)

Statistics

binary_size: 3.29% [-17.10%, 23.68%]
max_mem: no change
cycles: 1.16% [0.38%, 1.94%]

Garbage Collection

Note: Same as main branch, skipping.

Actor class

	binary size	put new bucket	put existing bucket	get
Map	299_302 ($\textcolor{red}{0.03\%}$)	813_521	16_115	16_660

Statistics

binary_size: no change
max_mem: no change
cycles: 0.03%

Publisher & Subscriber

	pub_binary_size	sub_binary_size	subscribe_caller	subscribe_callee	publish_caller	publish_callee
Motoko	161_375 ($\textcolor{red}{0.05\%}$)	145_826 ($\textcolor{red}{0.06\%}$)	28_593	11_963	22_854	6_446
Rust	570_196 ($\textcolor{red}{6.39\%}$)	606_522 ($\textcolor{red}{5.70\%}$)	58_150 ($\textcolor{red}{1.15\%}$)	38_553 ($\textcolor{red}{2.00\%}$)	72_692 ($\textcolor{red}{1.88\%}$)	42_681 ($\textcolor{red}{0.55\%}$)

Statistics

binary_size: 3.05% [-1.03%, 7.13%]
max_mem: no change
cycles: 1.39% [0.60%, 2.19%]

Overall Statistics

binary_size: 3.84% [2.74%, 4.94%]
max_mem: no change
cycles: 1.15% [0.61%, 1.68%]

Note: The flamegraph link only works after you merge. Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust. Library names with the _rs suffix are written in Rust; the rest are written in Motoko. The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with a fixed seed to ensure that all collections contain the same elements and that the queries are exactly the same. Below we explain what each column in the table measures:

generate 1m. Insert 1m Nat64 integers into the collection. For Motoko collections, this usually triggers the GC; the remaining columns are unlikely to trigger GC.
max mem. For Motoko, it reports rts_max_heap_size after the generate call; for Rust, it reports the Wasm's memory page * 32Kb.
batch_get 50. Find 50 elements from the collection.
batch_put 50. Insert 50 elements to the collection.
batch_remove 50. Remove 50 elements from the collection.
upgrade. Upgrade the canister with the same Wasm module. For non-stable benchmarks, the map state is persisted by serializing it into stable memory and deserializing it after the upgrade. For stable benchmarks, the upgrade is nearly free, as the state is already in stable memory.
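The fixed-seed setup described above can be sketched as follows. This is a minimal illustration, not the generator the benchmarks actually use: SplitMix64 is swapped in as a simple deterministic stand-in, and the names `SplitMix64` and `workload` are hypothetical.

```rust
/// SplitMix64: a small deterministic generator, used here only as a stand-in
/// for whatever generator the benchmarks use (assumption).
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    /// Advance the state and return the next 64-bit value.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Produce the `n` keys a benchmark run would insert for a given seed.
/// (Hypothetical helper, for illustration only.)
fn workload(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}
```

Because the seed fully determines the sequence, every collection under test would receive the same keys and the same queries, which is what makes the columns comparable across libraries.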

💎 Takeaways

The platform charges only for instruction count, so data structures that exploit caching and locality gain no cost advantage.
We have a limit on the maximal cycles per round. This means asymptotic behavior doesn't matter much. We care more about the performance up to a fixed N. In the extreme cases, you may see an $O(10000 n\log n)$ algorithm hitting the limit, while an $O(n^2)$ algorithm runs just fine.
Amortized algorithms/GC may need to be more eager to avoid hitting the cycle limit on a particular round.
Rust costs more cycles to process complicated Candid data, but it is more efficient in performing core computations.
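To make the point about fixed N concrete, here is a small cost model. It is only a sketch: `ROUND_LIMIT` is a hypothetical budget chosen for illustration, not the platform's real limit, and the cost functions simply evaluate $10000\,n\log_2 n$ and $n^2$.

```rust
/// Hypothetical per-round instruction budget (illustrative value only,
/// not the platform's actual limit).
const ROUND_LIMIT: u64 = 5_000_000_000;

/// Cost of an algorithm performing 10000 * n * log2(n) instructions.
fn cost_n_log_n(n: u64) -> u64 {
    let n = n as f64;
    (10_000.0 * n * n.log2()) as u64
}

/// Cost of an algorithm performing n^2 instructions.
fn cost_quadratic(n: u64) -> u64 {
    n * n
}

/// Whether a computation fits into a single round under the assumed budget.
fn fits_in_round(cost: u64) -> bool {
    cost <= ROUND_LIMIT
}
```

Under these assumptions, at n = 50_000 the quadratic algorithm costs 2.5 billion instructions and fits in the round, while the $n\log n$ algorithm with the large constant costs roughly 7.8 billion and does not. The asymptotically better algorithm only wins for much larger n, which the limit may never allow.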

Note

The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.

Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.

The upgrade column uses Candid for serializing stable data. In Rust, you may get better cycle cost by using a different serialization format. Another slowdown in Rust is that ic-stable-structures tends to be slower than the region memory in Motoko.

Libraries persist data across upgrades in different ways, which fall into three main categories:

Use stable variable directly in Motoko: zhenya_hashmap, btree, vector

Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vector_rs

Use pre/post-upgrade hooks to convert data into an array: hashmap, triemap, buffer, imrc_hashmap_rs
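The third category can be illustrated with a std-only sketch of the conversion step. This is not the benchmark code: the function names are hypothetical, and a real canister would run these steps inside its pre-upgrade and post-upgrade hooks and write the array to stable memory.

```rust
use std::collections::HashMap;

/// Pre-upgrade step: flatten the map into an array of entries.
/// (Hypothetical helper; the real hooks would also serialize this array.)
fn to_entries(map: &HashMap<u64, u64>) -> Vec<(u64, u64)> {
    let mut entries: Vec<(u64, u64)> = map.iter().map(|(k, v)| (*k, *v)).collect();
    // Sort so the persisted form does not depend on hash iteration order.
    entries.sort_unstable();
    entries
}

/// Post-upgrade step: rebuild the map from the persisted entries.
fn from_entries(entries: Vec<(u64, u64)>) -> HashMap<u64, u64> {
    entries.into_iter().collect()
}
```

Both directions touch every element, so the upgrade cost of this approach grows with the size of the collection.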

The stable benchmarks are much more expensive than their non-stable counterparts, because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.

hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for other data structures.
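The cost spike of an amortized structure can be sketched with a counter that models a doubling array. This is an illustrative model under a simple assumption (capacity doubles when full, every existing element is copied), not the Motoko hashmap implementation; `DoublingArray` and `push_costs` are hypothetical names.

```rust
/// Counts element writes for a doubling array, without storing any data.
struct DoublingArray {
    len: usize,
    capacity: usize,
}

impl DoublingArray {
    fn with_capacity(capacity: usize) -> Self {
        DoublingArray { len: 0, capacity: capacity.max(1) }
    }

    /// Returns the number of element writes this push performs.
    fn push(&mut self) -> usize {
        let mut writes = 1; // the new element itself
        if self.len == self.capacity {
            writes += self.len; // copy every existing element into the new array
            self.capacity *= 2;
        }
        self.len += 1;
        writes
    }
}

/// Largest single-push cost and total cost over `n` pushes.
fn push_costs(initial_capacity: usize, n: usize) -> (usize, usize) {
    let mut arr = DoublingArray::with_capacity(initial_capacity);
    let mut worst = 0;
    let mut total = 0;
    for _ in 0..n {
        let writes = arr.push();
        worst = worst.max(writes);
        total += writes;
    }
    (worst, total)
}
```

With an initial capacity of 4 and 9 pushes, the total is 21 writes, but a single push accounts for 9 of them. The average stays low while one call pays for the whole copy, which is the pattern a per-round limit penalizes.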

btree comes from mops.one/stableheapbtreemap.

zhenya_hashmap comes from mops.one/map.

vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.

hashmap_rs uses the fxhash crate, which behaves like std::collections::HashMap but with a deterministic hasher. This ensures reproducible results.

imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
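The deterministic-hasher idea behind hashmap_rs can be sketched with the standard library alone. Note the substitution: this example uses FNV-1a as a stand-in hash function rather than the FxHash algorithm, and the names `Fnv1a`, `DeterministicMap` and `hash_key` are hypothetical. The default `HashMap` hasher is randomly seeded per process, so plugging in an unseeded hasher is what makes hashing reproducible across runs.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hasher};

/// FNV-1a, used here as a simple stand-in for a fixed (unseeded) hash function.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap whose hasher has no random seed, so hashes repeat across runs.
type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hash a key with a freshly constructed hasher, as the map would.
fn hash_key(key: u64) -> u64 {
    BuildHasherDefault::<Fnv1a>::default().hash_one(key)
}
```

A `DeterministicMap` is created with `DeterministicMap::default()` and otherwise used like a regular `HashMap`.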

Map

	binary_size	generate 1m	max mem	batch_get 50	batch_put 50	batch_remove 50	upgrade
hashmap	190_014	8_184_618_025	56_000_256	342_784	6_462_528_122	368_420	10_728_193_099
triemap	195_557	13_661_315_924	68_228_576	252_649	657_794	648_084	15_499_470_884
rbtree	185_992	7_009_043_570	52_000_464	116_348	318_320	330_226	6_870_900_152
btree	230_406	10_223_929_607	25_108_416	357_912	485_794	539_490	2_861_974_825
zhenya_hashmap	188_979	2_360_638_679	16_777_504	58_204	66_552	79_675	3_018_208_083
btreemap_rs	592_800	1_808_073_617	27_590_656	73_570	124_009	84_481	3_176_207_795
imrc_hashmap_rs	593_795	2_577_499_644	244_908_032	35_385	194_881	92_505	6_253_420_470
hashmap_rs	581_570	435_576_576	73_138_176	20_263	25_195	23_506	1_521_942_969

Priority queue

	binary_size	heapify 1m	max mem	pop_min 50	put 50	pop_min 50	upgrade
heap	166_988	5_554_617_018	24_000_360	621_690	227_224	592_588	3_189_831_485
heap_rs	576_607	144_262_092	18_284_544	60_983	21_544	60_888	642_924_555

Growable array

	binary_size	generate 5k	max mem	batch_get 500	batch_put 500	batch_remove 500	upgrade
buffer	173_988	2_601_059	65_644	95_506	803_474	173_506	3_091_310
vector	172_017	1_952_689	24_580	126_130	186_485	176_123	4_675_192
vec_rs	574_751	287_152	1_376_256	15_975	29_064	21_824	3_717_375

Stable structures

	binary_size	generate 50k	max mem	batch_get 50	batch_put 50	batch_remove 50	upgrade
btreemap_rs	592_800	76_934_678	2_555_904	63_128	95_879	83_525	138_195_760
btreemap_stable_rs	599_653	4_737_161_366	2_031_616	2_865_151	5_225_233	8_806_609	729_745
heap_rs	576_607	7_279_802	2_293_760	52_442	21_792	52_157	33_332_862
heap_stable_rs	560_468	281_770_028	458_752	2_447_410	246_056	2_428_931	729_696
vec_rs	574_751	3_077_562	2_293_760	15_975	16_914	16_212	30_403_059
vec_stable_rs	553_418	60_342_671	458_752	66_073	76_095	83_515	729_687

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

SHA-2 benchmarks
- SHA-256/SHA-512. Compute the hash of a 1M Wasm binary.
- account_id. Compute the ledger account id from principal, based on SHA-224.
- neuron_id. Compute the NNS neuron id from principal, based on SHA-256.
Certified map. Merkle Tree for storing key-value pairs and generating witnesses according to the IC Interface Specification.
- generate 10k. Insert 10k 7-character words as both key and value into the certified map.
- max mem. For Motoko, it reports rts_max_heap_size after the generate call; for Rust, it reports the Wasm's memory page * 32Kb.
- inc. Increment a counter and insert the counter value into the map.
- witness. Generate the root hash and a witness for the counter.
- upgrade. Upgrade the canister with the same Wasm. In Motoko, we use stable variable. In Rust, we convert the tree to a vector before serialization.

SHA-2

	binary_size	SHA-256	SHA-512	account_id	neuron_id
Motoko	193_771	267_743_355	247_834_501	33_636	24_532
Rust	575_720	82_782_549	56_788_101	42_415	40_810

Certified map

	binary_size	generate 10k	max mem	inc	witness	upgrade
Motoko	243_726	4_666_119_661	3_430_044	553_629	407_936	274_434_719
Rust	619_546	6_409_499_600	2_228_224	1_020_428	299_393	6_025_596_995

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

Sample Dapps

Measure the performance of some typical dapps:

Basic DAO, with heartbeat disabled to make profiling easier. We have a separate benchmark to measure heartbeat performance.
DIP721 NFT

Note

The cost difference is mainly due to the Candid serialization cost.

Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.

We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.

For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.

Basic DAO

	binary_size	init	transfer_token	submit_proposal	vote_proposal	upgrade
Motoko	274_023	510_825	22_316	18_599	19_635	157_946
Rust	888_398	515_016	91_701	117_771	112_893	1_490_782

DIP721 NFT

	binary_size	init	mint_token	transfer_token	upgrade
Motoko	220_488	481_158	30_204	8_764	89_833
Rust	920_591	203_585	302_804	68_166	1_617_697

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

setTimer measures both the setTimer(0) method and the execution of an empty job.
It is not easy to reliably capture both events in one flamegraph, as implementation details of the replica can affect how we measure this. Typically, a correct flamegraph contains both the setTimer and canister_global_timer functions. If either is missing, we may need to adjust the script.

Heartbeat

	binary_size	heartbeat
Motoko	137_268	19_507
Rust	26_044	1_191

Timer

	binary_size	setTimer	cancelTimer
Motoko	145_704	51_778	4_626
Rust	535_525	64_193	11_797

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

Motoko Specific Benchmarks

Measure various features only available in Motoko.

Garbage Collection. Measure Motoko garbage collection cost using the Triemap benchmark. The max mem column reports rts_max_heap_size after generate call. The cycle cost numbers reported here are garbage collection cost only. Some flamegraphs are truncated due to the 2M log size limit. The dfx/ic-wasm optimizer is disabled for the garbage collection test cases due to how the optimizer affects function names, making profiling trickier.
- default. Compile with the default GC option. With the current GC scheduler, generate will trigger the copying GC. The rest of the methods will not trigger GC.
- copying. Compile with --force-gc --copying-gc.
- compacting. Compile with --force-gc --compacting-gc.
- generational. Compile with --force-gc --generational-gc.
- incremental. Compile with --force-gc --incremental-gc.
Actor class. Measure the cost of spawning an actor class, using the Actor classes example.

Garbage Collection

	generate 700k	max mem	batch_get 50	batch_put 50	batch_remove 50
default	1_068_192_695	47_793_792	119	119	119
copying	1_068_192_577	47_793_792	1_067_924_316	1_068_004_203	1_067_925_853
compacting	1_545_586_176	47_793_792	1_192_139_528	1_415_425_189	1_439_317_325
generational	2_304_140_531	47_802_256	882_208_645	1_211_144	1_103_549
incremental	29_503_170	976_097_188	471_911_803	497_465_467	1_221_308_722

Actor class

	binary size	put new bucket	put existing bucket	get
Map	299_302	813_521	16_115	16_660

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

	pub_binary_size	sub_binary_size	subscribe_caller	subscribe_callee	publish_caller	publish_callee
Motoko	161_375	145_826	28_593	11_963	22_854	6_446
Rust	570_196	606_522	58_150	38_553	72_692	42_681

Environment

dfx 0.20.2-beta.0

Motoko compiler 0.11.1 (source i511mdc8-vdy6m3ag-h4bnb9rr-1v24n9jc)

rustc 1.78.0 (9b00956e5 2024-04-29)

ic-repl 0.7.4

ic-wasm 0.7.1

dfinity / canister-profiling

Use PocketIC for CI #115
