jameslittle230 / stork

🔎 Impossibly fast web search, made for static sites.
https://stork-search.net
Apache License 2.0

Improve HTML content extraction behavior #281

Closed by jameslittle230 2 years ago

jameslittle230 commented 2 years ago

New functionality:

This PR also adds ignored tests that define the ideal behavior for an inclusion selector nested inside another inclusion selector. This case is currently not handled well: the nested content is duplicated in the extracted output.
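
To illustrate the nested-selector duplication described above, here is a minimal sketch. It is not Stork's implementation: `Node`, `sample_tree`, `extract_naive`, and `extract_outermost` are hypothetical names, the "selectors" are plain tag names, and the toy tree stands in for parsed HTML. It also assumes that the ideal behavior is for each piece of text to appear exactly once, which is one plausible reading of the ignored tests.

```rust
/// A toy element tree standing in for parsed HTML.
/// Hypothetical type for illustration only; Stork's real types differ.
#[derive(Debug)]
struct Node {
    tag: &'static str,
    text: &'static str,
    children: Vec<Node>,
}

/// Builds <main>Intro <article>Body</article> <aside>Ad</aside></main>.
fn sample_tree() -> Node {
    Node {
        tag: "main",
        text: "Intro",
        children: vec![
            Node { tag: "article", text: "Body", children: vec![] },
            Node { tag: "aside", text: "Ad", children: vec![] },
        ],
    }
}

/// Collects the text of `node` and all its descendants in document order.
fn collect_text(node: &Node, out: &mut Vec<&'static str>) {
    if !node.text.is_empty() {
        out.push(node.text);
    }
    for child in &node.children {
        collect_text(child, out);
    }
}

/// Naive extraction: every matching node contributes its whole subtree,
/// so a match nested inside another match is emitted twice.
fn extract_naive(root: &Node, included: &[&str]) -> String {
    fn walk(node: &Node, included: &[&str], out: &mut Vec<&'static str>) {
        if included.contains(&node.tag) {
            collect_text(node, out);
        }
        for child in &node.children {
            walk(child, included, out);
        }
    }
    let mut out = Vec::new();
    walk(root, included, &mut out);
    out.join(" ")
}

/// Outermost-only extraction: once a node matches, its subtree is collected
/// and the walk does not look for further matches beneath it.
fn extract_outermost(root: &Node, included: &[&str]) -> String {
    fn walk(node: &Node, included: &[&str], out: &mut Vec<&'static str>) {
        if included.contains(&node.tag) {
            collect_text(node, out);
            return; // nested matches are already covered by this subtree
        }
        for child in &node.children {
            walk(child, included, out);
        }
    }
    let mut out = Vec::new();
    walk(root, included, &mut out);
    out.join(" ")
}
```

With the inclusion list `["main", "article"]`, `extract_naive(&sample_tree(), ...)` yields `Intro Body Ad Body` (the article text appears twice), while `extract_outermost` yields `Intro Body Ad`. Stopping at the outermost match is only one way to deduplicate; tracking already-visited nodes would work as well.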

codecov[bot] commented 2 years ago

Codecov Report

Merging #281 (c937234) into master (eeaca67) will increase coverage by 0.80%. The diff coverage is 88.23%.

@@            Coverage Diff             @@
##           master     #281      +/-   ##
==========================================
+ Coverage   72.44%   73.25%   +0.80%     
==========================================
  Files          53       52       -1     
  Lines        2174     2191      +17     
  Branches      104      104              
==========================================
+ Hits         1575     1605      +30     
+ Misses        598      585      -13     
  Partials        1        1              
| Impacted Files | Coverage Δ | |
|---|---|---|
| stork-cli/src/clap.rs | 100.00% <ø> (ø) | |
| stork-cli/src/main.rs | 0.00% <0.00%> (ø) | |
| stork-lib/src/index_v3/build/fill_stems.rs | 100.00% <ø> (ø) | |
| stork-lib/src/index_v3/mod.rs | 70.37% <ø> (ø) | |
| stork-lib/src/index_v3/build/errors.rs | 80.95% <50.00%> (+0.46%) | :arrow_up: |
| ...rc/index_v3/build/fill_intermediate_entries/mod.rs | 90.99% <75.00%> (+0.99%) | :arrow_up: |
| stork-lib/src/index_v3/build/mod.rs | 96.42% <89.47%> (-2.15%) | :arrow_down: |
| js/config.ts | 82.35% <100.00%> (+1.10%) | :arrow_up: |
| js/entity.ts | 65.45% <100.00%> (ø) | |
| stork-lib/src/index_v3/build/fill_containers.rs | 91.80% <100.00%> (ø) | |
| ... and 4 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update de70fb0...c937234.

github-actions[bot] commented 2 years ago

Benchmarks

| Benchmark | Baseline | Contender | Comparison |
|---|---|---|---|
| build/federalist | 213.0768 | 215.1405 | 1.01× |
| federalist.st | 1125.456 | 1125.456 | 1.0× |
| search/federalist/liberty | 2.013 | 2.0636 | 1.03× |
| stork.js | 21.961 | 21.961 | 1.0× |
| stork.wasm | 356.537 | 356.537 | 1.0× |

Baseline: de70fb01688725b7955aa8a48b4fda7ef8be7993; Comparison: c937234d8cd7b2521d0cba149a2d286e4230b857