Closed thaliaarchi closed 1 month ago
Comparing thaliaarchi:update-hashbrown
(90e6e69) with main
(b0db068)
✅ 6
untouched benchmarks
+1 to not letting "performance regressions" in the parser block this PR
I've pushed a commit, which replaces symbol_table with my fork that bumps their version of hashbrown. It's a hack to run benchmarks in CI. I'll then submit a PR to symbol_table, depending on the results, and rebase here.
It looks like running benchmarks isn't automatic; could you trigger another run?
@thaliaarchi I believe the benchmarks have run on your most recent push! The benchmark comment gets updated whenever a new run is processed in this branch.
It seems like the only regression is in cykjson
. I wouldn't personally let that block merging in this PR, but I am not very familiar with what that example is for.
cykjson is a small cool egglog example that does the CYK parsing algorithm of JSON-like strings. It is a more Datalog-like workload (dynamic programming) with some e-class manipulations
I dropped the commit for benchmarking updating symbol_table
. If there's a response there later, it can be updated here separately.
I also changed fn TermDag::get(&self, id: TermId) -> Term
to return &Term
instead. All internal usages use the result by reference, so cloning is unnecessary. Does this affect anything externally?
I also changed fn TermDag::get(&self, id: TermId) -> Term to return &Term instead. All internal usages use the result by reference, so cloning is unnecessary. Does this affect anything externally?
This is good, egglog is unstable and users can clone if they need it.
Thanks @thaliaarchi for working on this and responding to all the feedback! If you have anything to add, we are also discussing the tradeoffs with hash performance and determinism in this post: https://github.com/egraphs-good/egglog/pull/439#issuecomment-2435724938
EDIT: It looks like these changes also caused a 7% speedup in the biggest benchmark (added to main after this PR was started, so wasn't included in the comparison here), which is pretty nice! https://codspeed.io/egraphs-good/egglog/runs/671a868380493f6bc05c7bfc
@saulshanabrook Thanks! I'm glad to see such speedups!
hashbrown
0.15 was released this month, which notably removed theRawTable
API in favor ofHashTable
and changed its hasher fromahash
to the fasterfoldhash
.HashTable
indexmap
to bump itshashbrown
dependencyTermDag
toindexmap::IndexMap
Besides this, Max's
symbol_table
still requires an older version ofhashbrown
in the lockfile. I have a draft to update that and could rebase to include it, but didn't figure out how to run its Criterion benchmarks.