StractOrg / stract

web search done right
https://stract.com
GNU Affero General Public License v3.0

📦 Crateification #112

Open oeb25 opened 8 months ago

oeb25 commented 8 months ago

This is an experiment to see how splitting `core` into multiple crates affects compile times. Whether we want this structure is still very much TBD.

~Currently contains the commits of #111.~

```mermaid
flowchart TB
  alice --> stdx
  alice --> stract_config
  alice --> stract_llm
  collector --> schema
  collector --> simhash
  collector --> stdx
  collector --> stract_config
  crawler --> distributed
  crawler --> hyperloglog
  crawler --> kv
  crawler --> sonic
  crawler --> stdx
  crawler --> stract_config
  crawler --> warc
  crawler --> webgraph
  crawler --> webpage
  entity_index --> imager
  entity_index --> kv
  entity_index --> stdx
  entity_index --> tokenizer
  imager --> distributed
  imager --> kv
  imager --> stdx
  mapreduce --> distributed
  mapreduce --> sonic
  naive_bayes --> stdx
  schema --> stdx
  schema --> tokenizer
  simhash --> tokenizer
  spell --> schema
  spell --> stdx
  spell --> stract_query
  stract_cli --> stract_config
  stract_cli --> stract_core
  stract_cli --> webgraph
  stract_config --> distributed
  stract_core --> alice
  stract_core --> collector
  stract_core --> crawler
  stract_core --> distributed
  stract_core --> entity_index
  stract_core --> executor
  stract_core --> hyperloglog
  stract_core --> imager
  stract_core --> kuchiki
  stract_core --> kv
  stract_core --> mapreduce
  stract_core --> naive_bayes
  stract_core --> optics
  stract_core --> schema
  stract_core --> simhash
  stract_core --> sonic
  stract_core --> spell
  stract_core --> stdx
  stract_core --> stract_config
  stract_core --> stract_llm
  stract_core --> stract_query
  stract_core --> tokenizer
  stract_core --> warc
  stract_core --> webgraph
  stract_core --> webpage
  stract_llm --> stdx
  stract_query --> optics
  stract_query --> schema
  stract_query --> stdx
  tokenizer --> stdx
  webgraph --> executor
  webgraph --> hyperloglog
  webgraph --> kv
  webgraph --> stdx
  webpage --> kuchiki
  webpage --> naive_bayes
  webpage --> schema
  webpage --> simhash
  webpage --> stdx
  webpage --> tokenizer
  webpage --> webgraph
```

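
One rough way to read the graph above: crates that do not depend on each other can be compiled in parallel, so the longest dependency chain gives a lower bound on how many crates must build one after another. The sketch below is illustrative only and not part of this PR. It hard-codes a hand-picked subset of the edges from the flowchart, and the helper names `crate_graph` and `depth` are hypothetical.

```rust
use std::collections::HashMap;

/// A subset of the edges from the flowchart (dependent crate -> dependency).
/// Only the edges needed for the longest chains are listed; this is an
/// assumption made to keep the example short.
fn crate_graph() -> HashMap<&'static str, Vec<&'static str>> {
    let edges = [
        ("stract_cli", "stract_core"),
        ("stract_core", "crawler"),
        ("stract_core", "spell"),
        ("crawler", "webpage"),
        ("crawler", "webgraph"),
        ("webpage", "schema"),
        ("webpage", "webgraph"),
        ("webgraph", "stdx"),
        ("spell", "stract_query"),
        ("stract_query", "schema"),
        ("schema", "tokenizer"),
        ("schema", "stdx"),
        ("tokenizer", "stdx"),
    ];
    let mut deps: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
    for (from, to) in edges {
        deps.entry(from).or_default().push(to);
    }
    deps
}

/// Number of edges on the longest dependency chain starting at `krate`.
/// Results are memoized so shared dependencies are only visited once.
fn depth<'a>(
    krate: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    memo: &mut HashMap<&'a str, usize>,
) -> usize {
    if let Some(&d) = memo.get(krate) {
        return d;
    }
    let mut best = 0;
    if let Some(children) = deps.get(krate) {
        for &child in children {
            best = best.max(1 + depth(child, deps, memo));
        }
    }
    memo.insert(krate, best);
    best
}

fn main() {
    let deps = crate_graph();
    let mut memo = HashMap::new();
    // stract_cli -> stract_core -> crawler -> webpage -> schema -> tokenizer -> stdx
    println!("longest chain from stract_cli: {} edges", depth("stract_cli", &deps, &mut memo));
}
```

Crates off that chain, such as `kv` or `hyperloglog`, can in principle be checked while the chain is still in progress, which is one plausible reason the split helps incremental builds.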
oeb25 commented 8 months ago

Current cargo check-incremental improvements:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `update-deps` | 2.414 ± 0.020 | 2.389 | 2.454 | 1.56 ± 0.03 |
| `crateification` | 1.543 ± 0.021 | 1.506 | 1.587 | 1.00 |

And for cargo build-incremental:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `update-deps` | 8.001 ± 0.030 | 7.971 | 8.058 | 1.20 ± 0.02 |
| `crateification` | 6.655 ± 0.135 | 6.389 | 6.825 | 1.00 |

oeb25 commented 8 months ago

Current cargo build-incremental:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `main` | 7.504 ± 0.049 | 7.439 | 7.586 | 1.30 ± 0.02 |
| `crateification` | 5.771 ± 0.069 | 5.699 | 5.936 | 1.00 |
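
For readers checking the numbers: the Relative column appears to be the slower mean divided by the faster mean. The sketch below recomputes that ratio from the mean times reported in the three tables and also expresses it as a percentage of time saved. The function names `relative` and `time_saved_percent` are made up for this example, and the uncertainty terms (the ± values) are deliberately left out.

```rust
/// Ratio of the slower mean to the faster mean, matching the "Relative" column.
fn relative(slow_mean: f64, fast_mean: f64) -> f64 {
    slow_mean / fast_mean
}

/// Share of wall-clock time saved by the faster run, in percent.
fn time_saved_percent(slow_mean: f64, fast_mean: f64) -> f64 {
    (1.0 - fast_mean / slow_mean) * 100.0
}

fn main() {
    // (benchmark, baseline mean [s], crateification mean [s]), copied from the tables.
    let runs = [
        ("cargo check-incremental (vs update-deps)", 2.414, 1.543),
        ("cargo build-incremental (vs update-deps)", 8.001, 6.655),
        ("cargo build-incremental (vs main)", 7.504, 5.771),
    ];
    for (label, slow, fast) in runs {
        println!(
            "{}: {:.2}x, about {:.0}% less time",
            label,
            relative(slow, fast),
            time_saved_percent(slow, fast)
        );
    }
}
```

The ratios come out to 1.56, 1.20, and 1.30, in line with the tables.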