egraphs-good / egglog

egraphs + datalog!
https://egraphs-good.github.io/egglog/
MIT License
458 stars 54 forks source link

Fix sources of nondeterminism in egglog #439

Closed oflatt closed 2 weeks ago

oflatt commented 1 month ago

@ezrosent, I ended up having to use indexmap in generic join, which might slow down egglog. Should we just merge it for now? I would love to add testing to egglog that somehow catches nondeterminism as well.

yihozhang commented 1 month ago

Can we instead just use a build-time flag which users can use to opt in for determinism?

oflatt commented 1 month ago

That's a good idea, but perhaps determinism should be opt-out?

yihozhang commented 1 month ago

Some thoughts:

oflatt commented 1 month ago

Reverted the feature, so this should be good to go! Excited to see better performance testing in the future.

Alex-Fischman commented 1 month ago

@oflatt Update to main to get benchmarking!

codspeed-hq[bot] commented 1 month ago

CodSpeed Performance Report

Merging #439 will degrade performances by 14.5%

Comparing oflatt-nondeterminism-fix (1b550ab) with main (60342af)

Summary

❌ 6 regressions ✅ 2 untouched benchmarks

:warning: Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main oflatt-nondeterminism-fix Change
eggcc-extraction 4.1 s 4.4 s -7.45%
herbie 282.6 ms 304.8 ms -7.28%
lambda 145.1 ms 159.2 ms -8.89%
math-microbenchmark 3.6 s 4.2 s -14.5%
python_array_optimize 6.1 s 6.9 s -11.25%
typeinfer 408.7 ms 433.5 ms -5.72%
saulshanabrook commented 1 month ago

I imagine some of the changes on shorter running benchmarks might not be significant - FWIW I just opened a PR to reduce our benchmarks to just the longer more meaningful ones (https://github.com/egraphs-good/egglog/pull/444). Otherwise, it seems like there is too much uncertainty introduced by the indeterminacy of the memory allocator.

The slowdown in the math-microbenchmark is possibly more relevant.

oflatt commented 4 weeks ago

We should try:

saulshanabrook commented 4 weeks ago

reference to ahash as fast hash map with ability to set seed https://docs.rs/ahash/latest/ahash/

thaliaarchi commented 4 weeks ago

reference to ahash as fast hash map with ability to set seed https://docs.rs/ahash/latest/ahash/

hashbrown moved from ahash to foldhash in the past release, so that seems to be the new consensus. foldhash has a fixed seed initializer too: foldhash::fast::FixedState::with_seed(0). For comparison, I switched symbol_table (used by egglog) to use this.

oflatt commented 3 weeks ago

Not sure why, but things are still non-deterministic with this change https://github.com/egraphs-good/eggcc/actions/runs/11525853271/job/32088922445?pr=637

thaliaarchi commented 3 weeks ago

Since I updated the hash function in PR#445 Update hashbrown, I'd recommend you rebase onto main. My goal there was to update to the latest versions of the hashing dependencies.

I didn't notice that egglog also uses rustc-hash, so that ought to be updated, since they recently did significant improvements improving both performance and hash strength. It's no longer the Firefox-derived hash function; its collision weaknesses, which were problematic for rustc, have been fixed. Now, rustc-hash is a mildly stripped down version of foldhash, making tradeoffs for usage in a compiler (I've written on the differences between them in my notes). It's probably only worth using one of the two.

Update: fixed in PR#456 Update symbol_table. Max's symbol_table has some upstream changes which haven't been released to crates.io, so the version used by egglog is still on an old version of hashbrown and is also missing performance improvements contributed by someone else. For that, we should either ping him again or switch to a Git dependency on it in Cargo.toml.

ezrosent commented 3 weeks ago

One thought:

Another source of nondeterminism that can happen if you're running more than one instance of egglog in a single process is symbol_table. The numbers symbol_table hands out will not be deterministic (there are various process-global counters being incremented and the like. Different numbers will cause hashmap to store strings or symbols in different spots, hence iteration order will be different too.

oflatt commented 3 weeks ago

@thaliaarchi thanks for the suggestions

@ezrosent hmm yeah, that sounds like it would do it. I assumed wrongly that symbol gen was deterministic.

Sounds like I should make a "nondeterministic" feature flag for now, and we can revisit this question at a later date. The best option is probably to somehow sort the resulting serialized egraph, since that would work even with parallel egglog.

oflatt commented 2 weeks ago

Regression is caused by the swap from hashmap to index maps, which have a slower insertion and lookup speed. To recover the performance, use the nondeterministic flag.

Looking to the future, we'd like to remove the dependency on symbol gen library and use the fixed seed solution.