jameslittle230 / stork

🔎 Impossibly fast web search, made for static sites.
https://stork-search.net
Apache License 2.0
2.73k stars, 56 forks

Convert all HashMap usage to BTreeMap #265

Closed (jameslittle230 closed 2 years ago)

jameslittle230 commented 2 years ago

Fixes #261

❯ while TRUE; cargo run -- build --input local-dev/test-configs/federalist.toml --output test.st 2> /dev/null && sha256sum test.st; end
05ece8e36d4c1abc4362f3969c304288a26500a81df05d37f324ba709581e955  test.st
05ece8e36d4c1abc4362f3969c304288a26500a81df05d37f324ba709581e955  test.st
05ece8e36d4c1abc4362f3969c304288a26500a81df05d37f324ba709581e955  test.st
05ece8e36d4c1abc4362f3969c304288a26500a81df05d37f324ba709581e955  test.st

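This is achieved by using a BTreeMap instead of a HashMap in the Index data structure. A BTreeMap iterates in sorted key order, so the serialized index comes out byte-identical on every build, as the repeated checksums above show. This looks like it slows down build times a lot; I'm not sure what effect it has on search times.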
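To illustrate why the map type matters here, the following is a minimal sketch rather than Stork's actual code. The `build_index` and `serialize_sorted` functions and the word-to-document-ids layout are hypothetical stand-ins for the real Index structure. It relies only on the standard library guarantee that a `BTreeMap` iterates in ascending key order, whereas a `HashMap` with the default hasher is randomly seeded and makes no ordering promise.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an index: maps a word to the ids of the
/// documents that contain it. Not Stork's real data structure.
fn build_index(entries: &[(&str, u32)]) -> BTreeMap<String, Vec<u32>> {
    let mut index: BTreeMap<String, Vec<u32>> = BTreeMap::new();
    for (word, id) in entries {
        index.entry(word.to_string()).or_default().push(*id);
    }
    index
}

/// Hypothetical serializer: writes entries in iteration order.
/// Because BTreeMap iterates in ascending key order, the output is
/// identical for identical input, regardless of insertion order.
fn serialize_sorted(index: &BTreeMap<String, Vec<u32>>) -> String {
    index
        .iter()
        .map(|(word, ids)| format!("{}={:?}", word, ids))
        .collect::<Vec<_>>()
        .join(";")
}

fn main() {
    let entries = [("liberty", 1), ("union", 2), ("liberty", 3), ("federal", 1)];
    let first = serialize_sorted(&build_index(&entries));
    let second = serialize_sorted(&build_index(&entries));
    // Two independent builds from the same input serialize identically.
    assert_eq!(first, second);
    println!("{}", first);
}
```

The trade-off matches the concern above: insertions and lookups in a `BTreeMap` are O(log n) rather than the average O(1) of a `HashMap`, which is a plausible source of slower builds.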

Alternatively, I could convert to a BTreeMap only when it comes time to serialize, as described in https://stackoverflow.com/a/42723390. Let's see how the metrics look first, though.
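The alternative could look roughly like the sketch below, which keeps a `HashMap` for building and searching and sorts only at output time. It is an illustration under assumptions: `serialize_deterministic` and the word-to-ids layout are hypothetical, and a hand-rolled string format stands in for whatever serializer the project actually uses.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical serializer: the in-memory index stays a HashMap, and a
/// temporary BTreeMap of borrowed keys and values is built only for output.
fn serialize_deterministic(index: &HashMap<String, Vec<u32>>) -> String {
    // Collecting references avoids cloning the index; the BTreeMap
    // orders the borrowed keys in ascending order.
    let sorted: BTreeMap<&String, &Vec<u32>> = index.iter().collect();
    sorted
        .iter()
        .map(|(word, ids)| format!("{}={:?}", word, ids))
        .collect::<Vec<_>>()
        .join(";")
}

fn main() {
    let mut index: HashMap<String, Vec<u32>> = HashMap::new();
    index.entry("liberty".to_string()).or_default().push(1);
    index.entry("union".to_string()).or_default().push(2);
    index.entry("liberty".to_string()).or_default().push(3);

    // The HashMap's own iteration order is unspecified, but the
    // serialized form is stable because it is sorted before writing.
    println!("{}", serialize_deterministic(&index));
}
```

This pays for the sort once per serialization instead of on every insert, at the cost of a temporary map of references.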

codecov[bot] commented 2 years ago

Codecov Report

Merging #265 (c5e135a) into master (eeaca67) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #265   +/-   ##
=======================================
  Coverage   72.44%   72.44%           
=======================================
  Files          53       53           
  Lines        2174     2174           
  Branches      104      104           
=======================================
  Hits         1575     1575           
  Misses        598      598           
  Partials        1        1           
Impacted Files                                     Coverage Δ
stork-lib/src/index_v3/build/fill_stems.rs         100.00% <ø> (ø)
stork-lib/src/index_v3/mod.rs                      70.37% <ø> (ø)
stork-lib/src/index_v3/build/fill_containers.rs    91.80% <100.00%> (ø)
stork-lib/src/index_v3/build/mod.rs                98.57% <100.00%> (ø)
stork-lib/src/index_v3/search/mod.rs               95.77% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update a7b8bdd...c5e135a.

github-actions[bot] commented 2 years ago

Benchmarks

Benchmark                  Baseline   Contender  Comparison
build/federalist           202.9417   220.0745   1.08×
federalist.st              1125.456   1125.456   1.0×
search/federalist/liberty  1.9274     1.9648     1.02×
stork.js                   21.88      21.88      1.0×
stork.wasm                 345.002    356.523    1.03×

Baseline: a7b8bdd69115b812dd9c44909139b146157f6038; Comparison: c5e135aa2f3eeb01501d4568498b36dd989ec7ec