Open crusso opened 1 year ago
Note Diffing the performance result against the published result from main branch. Unchanged benchmarks are omitted.
binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade | |
---|---|---|---|---|---|---|---|
hashmap | 160_146 ($\textcolor{green}{-0.05\%}$) | 8_415_853_637 ($\textcolor{red}{20.50\%}$) | 61_987_852 | 344_668 ($\textcolor{red}{19.40\%}$) | 6_554_314_393 ($\textcolor{red}{18.38\%}$) | 372_027 ($\textcolor{red}{19.93\%}$) | 11_086_889_804 ($\textcolor{red}{21.45\%}$) |
triemap | 163_317 ($\textcolor{green}{-0.10\%}$) | 14_814_581_233 ($\textcolor{red}{29.23\%}$) | 74_216_172 | 291_744 ($\textcolor{red}{30.87\%}$) | 714_312 ($\textcolor{red}{30.01\%}$) | 700_163 ($\textcolor{red}{29.61\%}$) | 16_814_045_622 ($\textcolor{red}{28.60\%}$) |
rbtree | 157_922 ($\textcolor{green}{-0.14\%}$) | 7_517_002_376 ($\textcolor{red}{25.72\%}$) | 57_996_060 | 113_955 ($\textcolor{red}{28.18\%}$) | 339_364 ($\textcolor{red}{26.36\%}$) | 362_212 ($\textcolor{red}{30.13\%}$) | 7_125_606_793 ($\textcolor{red}{23.45\%}$) |
splay | 159_759 ($\textcolor{green}{-0.12\%}$) | 15_011_765_466 ($\textcolor{red}{29.77\%}$) | 53_995_996 | 719_957 ($\textcolor{red}{30.42\%}$) | 758_215 ($\textcolor{red}{30.33\%}$) | 1_051_199 ($\textcolor{red}{29.73\%}$) | 4_732_822_678 ($\textcolor{red}{27.14\%}$) |
btree | 187_666 ($\textcolor{green}{-0.12\%}$) | 10_705_101_825 ($\textcolor{red}{30.17\%}$) | 31_104_012 | 365_187 ($\textcolor{red}{31.58\%}$) | 503_327 ($\textcolor{red}{31.02\%}$) | 559_436 ($\textcolor{red}{30.39\%}$) | 3_120_410_218 ($\textcolor{red}{23.93\%}$) |
zhenya_hashmap | 160_516 ($\textcolor{red}{0.00\%}$) | 2_776_814_274 ($\textcolor{red}{26.13\%}$) | 22_773_100 | 64_709 ($\textcolor{red}{33.07\%}$) | 82_032 ($\textcolor{red}{32.65\%}$) | 94_090 ($\textcolor{red}{32.76\%}$) | 3_319_787_035 ($\textcolor{red}{23.16\%}$) |
btreemap_rs | 477_037 ($\textcolor{green}{-0.12\%}$) | 1_793_694_183 ($\textcolor{red}{8.60\%}$) | 27_590_656 | 75_427 ($\textcolor{red}{12.81\%}$) | 125_068 ($\textcolor{red}{11.19\%}$) | 86_003 ($\textcolor{red}{12.81\%}$) | 2_927_540_279 ($\textcolor{red}{10.02\%}$) |
imrc_hashmap_rs | 479_206 ($\textcolor{green}{-0.12\%}$) | 2_580_421_602 ($\textcolor{red}{7.84\%}$) | 244_973_568 | 38_236 ($\textcolor{red}{16.70\%}$) | 178_903 ($\textcolor{red}{9.59\%}$) | 114_251 ($\textcolor{red}{16.12\%}$) | 5_754_488_547 ($\textcolor{red}{10.84\%}$) |
hashmap_rs | 467_420 ($\textcolor{green}{-0.12\%}$) | 432_765_549 ($\textcolor{red}{7.31\%}$) | 73_138_176 | 21_704 ($\textcolor{red}{28.80\%}$) | 26_758 ($\textcolor{red}{23.42\%}$) | 25_073 ($\textcolor{red}{23.74\%}$) | 1_280_477_959 ($\textcolor{red}{11.85\%}$) |
binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50.1 | upgrade | |
---|---|---|---|---|---|---|---|
heap | 147_633 ($\textcolor{green}{-0.00\%}$) | 6_206_169_472 ($\textcolor{red}{32.48\%}$) | 29_995_956 | 680_841 ($\textcolor{red}{33.11\%}$) | 250_925 ($\textcolor{red}{34.57\%}$) | 648_731 ($\textcolor{red}{33.15\%}$) | 3_293_282_643 ($\textcolor{red}{24.01\%}$) |
heap_rs | 463_292 ($\textcolor{green}{-0.12\%}$) | 138_669_631 ($\textcolor{red}{14.04\%}$) | 18_284_544 | 57_362 ($\textcolor{red}{11.04\%}$) | 23_196 ($\textcolor{red}{27.14\%}$) | 57_496 ($\textcolor{red}{10.99\%}$) | 505_960_753 ($\textcolor{red}{14.80\%}$) |
binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade | |
---|---|---|---|---|---|---|---|
buffer | 150_978 ($\textcolor{green}{-0.02\%}$) | 2_718_410 ($\textcolor{red}{30.53\%}$) | 65_644 | 104_997 ($\textcolor{red}{43.65\%}$) | 858_376 ($\textcolor{red}{27.83\%}$) | 172_497 ($\textcolor{red}{35.19\%}$) | 3_141_451 ($\textcolor{red}{26.95\%}$) |
vector | 152_598 ($\textcolor{red}{0.03\%}$) | 2_131_660 ($\textcolor{red}{34.21\%}$) | 24_580 | 141_137 ($\textcolor{red}{34.17\%}$) | 203_207 ($\textcolor{red}{35.53\%}$) | 195_769 ($\textcolor{red}{32.19\%}$) | 4_756_048 ($\textcolor{red}{23.71\%}$) |
vec_rs | 459_107 ($\textcolor{green}{-0.12\%}$) | 288_898 ($\textcolor{red}{8.74\%}$) | 1_310_720 | 17_148 ($\textcolor{red}{31.77\%}$) | 30_473 ($\textcolor{red}{20.15\%}$) | 25_609 ($\textcolor{red}{20.53\%}$) | 3_141_188 ($\textcolor{red}{14.48\%}$) |
Warning Skip table 3 ## Stable structures from _out/collections/README.md, due to table shape mismatches from main branch.
binary_size: -0.08% [-0.11%, -0.05%]
max_mem: no change
cycles: 24.15% [22.42%, 25.89%]
binary_size: -0.08% [-0.11%, -0.05%]
max_mem: no change
cycles: 24.15% [22.42%, 25.89%]
Note The flamegraph link only works after you merge. Unchanged benchmarks are omitted.
Measure different collection libraries written in both Motoko and Rust.
The library names with _rs
suffix are written in Rust; the rest are written in Motoko.
The _stable
and _stable_rs
suffix represents that the library directly writes the state to stable memory using Region
in Motoko and ic-stable-stuctures
in Rust.
We use the same random number generator with fixed seed to ensure that all collections contain the same elements, and the queries are exactly the same. Below we explain the measurements of each column in the table:
rts_max_heap_size
after generate
call; For Rust, it reports the Wasm's memory page * 64Kb; For stable benchmarks, it reports the region size of the stable memory storing the map.Note
- The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
- Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.
- The
upgrade
column uses Candid for serializing stable data. In Rust, you may get better cycle cost by using a different serialization format. Another slowdown in Rust is thatic-stable-structures
tends to be slower than the region memory in Motoko.- Different library has different ways for persisting data during upgrades, there are mainly three categories:
- Use stable variable directly in Motoko:
zhenya_hashmap
,btree
,vector
- Expose and serialize external state (
share/unshare
in Motoko,candid::Encode
in Rust):rbtree
,heap
,btreemap_rs
,hashmap_rs
,heap_rs
,vector_rs
- Use pre/post-upgrade hooks to convert data into an array:
hashmap
,splay
,triemap
,buffer
,imrc_hashmap_rs
- The stable benchmarks are much more expensive than their non-stable counterpart, because the stable memory API is much more expensive. The benefit is that they get fast upgrade. The upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
hashmap
uses amortized data structure. When the initial capacity is reached, it has to copy the whole array, thus the cost ofbatch_put 50
is much higher than other data structures.btree
comes from mops.one/stableheapbtreemap.btree_stable
comes from github.com/sardariuss.zhenya_hashmap
comes from mops.one/map.vector
comes from mops.one/vector. Compare withbuffer
,put
has better worst case time and space complexity ($O(\sqrt{n})$ vs $O(n)$);get
has a slightly larger constant overhead.hashmap_rs
uses thefxhash
crate, which is the same asstd::collections::HashMap
, but with a deterministic hasher. This ensures reproducible result.imrc_hashmap_rs
uses theim-rc
crate, which is the immutable version hashmap in Rust.
binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade | |
---|---|---|---|---|---|---|---|
hashmap | 160_146 | 8_415_853_637 | 61_987_852 | 344_668 | 6_554_314_393 | 372_027 | 11_086_889_804 |
triemap | 163_317 | 14_814_581_233 | 74_216_172 | 291_744 | 714_312 | 700_163 | 16_814_045_622 |
rbtree | 157_922 | 7_517_002_376 | 57_996_060 | 113_955 | 339_364 | 362_212 | 7_125_606_793 |
splay | 159_759 | 15_011_765_466 | 53_995_996 | 719_957 | 758_215 | 1_051_199 | 4_732_822_678 |
btree | 187_666 | 10_705_101_825 | 31_104_012 | 365_187 | 503_327 | 559_436 | 3_120_410_218 |
zhenya_hashmap | 160_516 | 2_776_814_274 | 22_773_100 | 64_709 | 82_032 | 94_090 | 3_319_787_035 |
btreemap_rs | 477_037 | 1_793_694_183 | 27_590_656 | 75_427 | 125_068 | 86_003 | 2_927_540_279 |
imrc_hashmap_rs | 479_206 | 2_580_421_602 | 244_973_568 | 38_236 | 178_903 | 114_251 | 5_754_488_547 |
hashmap_rs | 467_420 | 432_765_549 | 73_138_176 | 21_704 | 26_758 | 25_073 | 1_280_477_959 |
binary_size | heapify 1m | max mem | pop_min 50 | put 50 | pop_min 50 | upgrade | |
---|---|---|---|---|---|---|---|
heap | 147_633 | 6_206_169_472 | 29_995_956 | 680_841 | 250_925 | 648_731 | 3_293_282_643 |
heap_rs | 463_292 | 138_669_631 | 18_284_544 | 57_362 | 23_196 | 57_496 | 505_960_753 |
binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 | upgrade | |
---|---|---|---|---|---|---|---|
buffer | 150_978 | 2_718_410 | 65_644 | 104_997 | 858_376 | 172_497 | 3_141_451 |
vector | 152_598 | 2_131_660 | 24_580 | 141_137 | 203_207 | 195_769 | 4_756_048 |
vec_rs | 459_107 | 288_898 | 1_310_720 | 17_148 | 30_473 | 25_609 | 3_141_188 |
binary_size | generate 50k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade | |
---|---|---|---|---|---|---|---|
btree | 187_666 | 457_750_863 | 1_554_152 | 289_410 | 442_003 | 479_799 | 155_920_331 |
btree_stable | 205_452 | 17_160_920_654 | 2_621_440 | 13_928_890 | 18_985_993 | 37_332_002 | 40_832 |
btreemap_rs | 477_037 | 76_218_029 | 2_555_904 | 64_985 | 97_003 | 84_994 | 125_790_335 |
btreemap_stable_rs | 478_103 | 4_751_143_101 | 2_621_440 | 2_844_949 | 5_181_254 | 8_809_593 | 731_069 |
heap_rs | 463_292 | 7_001_559 | 2_293_760 | 49_871 | 23_444 | 49_845 | 26_519_303 |
heap_stable_rs | 450_480 | 317_932_953 | 458_752 | 2_667_588 | 276_814 | 2_647_531 | 731_182 |
vec_rs | 459_107 | 3_079_223 | 2_228_224 | 17_148 | 18_323 | 17_850 | 24_471_685 |
vec_stable_rs | 445_471 | 74_445_158 | 458_752 | 69_489 | 90_685 | 92_681 | 731_195 |
Environment
- dfx 0.15.1
- Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)
- rustc 1.73.0 (cc66ad468 2023-10-03)
- ic-repl 0.5.1
- ic-wasm 0.6.0
Sames as #97 but using unoptimized version of stable btree (to see perf improvements relative to #97.