subdgtl / WFC

Prototype tools for playing with Wave Function Collapse
The Unlicense

Optimize bitvec #14

Closed yanchith closed 3 years ago

yanchith commented 3 years ago

This speeds up the solver by 28% (measured in macro-benchmarks with time cargo run --release ./samples/sample3.csv -x 100 -y 100 -z 100). The optimization was guided by profiling.

BitVec::len is called very often by the solver, and internally it was calling the u64::count_ones intrinsic, which showed up significantly on the flamegraph. The patch changes BitVec::len to instead return a cached length of the bit vector. The length is stored in the last eight bits of data, so the maximum module count decreased from 256 to 248.
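To illustrate the idea, here is a minimal sketch of a bit vector that caches its length in the last eight bits of its storage. It is not the code from this patch: the `[u64; 4]` layout, the constant names, and the `zeros`, `contains`, `add`, `remove` and `count_ones_slow` methods are all assumptions made for the example.

```rust
// Hypothetical sketch of a 256-bit vector with a cached length.
// Bits 0..248 are payload; the top 8 bits of the last word hold the
// number of set payload bits, so `len` never has to count ones.
const LEN_SHIFT: u32 = 56;
const LEN_MASK: u64 = 0xFF << LEN_SHIFT;
pub const MAX_LEN: u8 = 248;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BitVec {
    data: [u64; 4],
}

impl BitVec {
    pub fn zeros() -> Self {
        Self { data: [0; 4] }
    }

    /// Returns the cached number of set bits.
    pub fn len(&self) -> usize {
        ((self.data[3] & LEN_MASK) >> LEN_SHIFT) as usize
    }

    fn set_len(&mut self, len: usize) {
        debug_assert!(len <= usize::from(MAX_LEN));
        self.data[3] = (self.data[3] & !LEN_MASK) | ((len as u64) << LEN_SHIFT);
    }

    fn locate(index: u8) -> (usize, u64) {
        assert!(index < MAX_LEN, "index overlaps the cached length");
        (usize::from(index / 64), 1u64 << u32::from(index % 64))
    }

    pub fn contains(&self, index: u8) -> bool {
        let (word, mask) = Self::locate(index);
        self.data[word] & mask != 0
    }

    /// Sets a bit; returns true if it was not set before.
    pub fn add(&mut self, index: u8) -> bool {
        let (word, mask) = Self::locate(index);
        if self.data[word] & mask != 0 {
            return false;
        }
        self.data[word] |= mask;
        self.set_len(self.len() + 1);
        true
    }

    /// Clears a bit; returns true if it was set before.
    pub fn remove(&mut self, index: u8) -> bool {
        let (word, mask) = Self::locate(index);
        if self.data[word] & mask == 0 {
            return false;
        }
        self.data[word] &= !mask;
        self.set_len(self.len() - 1);
        true
    }

    /// The uncached way to compute the length, kept here for comparison.
    pub fn count_ones_slow(&self) -> usize {
        let d = self.data;
        (d[0].count_ones()
            + d[1].count_ones()
            + d[2].count_ones()
            + (d[3] & !LEN_MASK).count_ones()) as usize
    }
}
```

The trade-off in this sketch is that every mutation has to keep the cached byte in sync, and the last eight bits are no longer available as payload, which is where the 248 limit comes from.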

A couple of unused methods were removed from BitVec so that they didn't have to be adapted to the new data layout.

~The PR is still a draft, because I haven't tested it integrated with GH yet (I was on Linux, because getting a profiler to work under Windows is trying). I'll test in GH soon and then undraft the PR, but it can be reviewed meanwhile.~

UPDATE: Tested with Grasshopper integration.

yanchith commented 3 years ago

The 27%-28% speedup is now confirmed on Windows, but Windows is about 17% slower than Linux on the same machine 👀 Might make sense to play with some build options in the future.

yanchith commented 3 years ago

Just increased the speedup to around 34% by tweaking the release build profile. I turned on fat LTO and decreased codegen units count to 1. Release builds now take insanely long, but that's just how life is.
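For reference, a release profile with those two settings would look roughly like the fragment below in Cargo.toml. This is a sketch using Cargo's standard profile keys, not a copy of the project's actual manifest, which may set other options as well.

```toml
# Assumed fragment of Cargo.toml; only these two keys are mentioned above.
[profile.release]
lto = "fat"        # whole-program link-time optimization across all crates
codegen-units = 1  # one codegen unit per crate: slower builds, more optimization
```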