Closed yanchith closed 3 years ago
The 27%-28% speedup is now confirmed on Windows, but Windows is about 17% slower than Linux on the same machine 👀 Might make sense to play with some build options in the future.
Just increased the speedup to around 34% by tweaking the release build profile. I turned on fat LTO and decreased codegen units count to 1. Release builds now take insanely long, but that's just how life is.
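For reference, the two settings mentioned above correspond to the following keys in `Cargo.toml`. This is only a sketch of the relevant fragment; any other keys in the project's actual release profile are not shown in this thread and are not assumed here.

```toml
# Release profile fragment: whole-program ("fat") LTO and a single codegen unit.
# Both settings trade longer compile times for more cross-crate optimization.
[profile.release]
lto = "fat"
codegen-units = 1
```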
This speeds up the solver by 28% (measured in macro-benchmarks with `time cargo run --release ./samples/sample3.csv -x 100 -y 100 -z 100`). The optimization was guided by profiling. `BitVec::len` is called very often by the solver, and internally it was calling `u64::count_ones` intrinsics, which showed up significantly on the flamegraph. The patch changes `BitVec::len` to instead return a cached length of the bit vector. The length is stored in the last eight bits of data, so our maximum module count decreased from 256 to 248.

A couple of unused methods were removed from `BitVec` so that they didn't have to be adapted to the new data layout.

~The PR is still draft, because I didn't test it integrated with GH just yet (I was on Linux, because getting a profiler to work under Windows is trying). I'll test in GH soon and then undraft the PR, but it can be reviewed meanwhile.~
UPDATE: Tested with Grasshopper integration.
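To illustrate the cached-length idea described above, here is a minimal sketch. It is not the code from this patch. The four-word `[u64; 4]` layout, the choice of the top eight bits of the last word for the cache, and the names `zeros`, `add`, `remove`, `contains`, `set_len` and `len_by_popcount` are all assumptions made for the example. Only `BitVec::len` and `u64::count_ones` come from the description.

```rust
/// Hypothetical 256-bit set. The top 8 bits of the last word cache the
/// number of set bits, leaving 248 bits (indices 0..=247) for the payload.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BitVec {
    data: [u64; 4],
}

impl BitVec {
    /// Number of usable payload bits (256 minus the 8 bits of cached length).
    pub const MAX_LEN: usize = 248;
    const LEN_SHIFT: u32 = 56;
    const LEN_MASK: u64 = 0xFF << Self::LEN_SHIFT;

    pub const fn zeros() -> Self {
        Self { data: [0; 4] }
    }

    /// Returns the cached count of set bits: one shift, no popcount.
    pub fn len(&self) -> usize {
        (self.data[3] >> Self::LEN_SHIFT) as usize
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }

    pub fn contains(&self, index: usize) -> bool {
        assert!(index < Self::MAX_LEN);
        (self.data[index / 64] & (1u64 << (index % 64))) != 0
    }

    /// Sets the bit and bumps the cached length. Returns false if already set.
    pub fn add(&mut self, index: usize) -> bool {
        if self.contains(index) {
            return false;
        }
        self.data[index / 64] |= 1u64 << (index % 64);
        self.set_len(self.len() + 1);
        true
    }

    /// Clears the bit and decrements the cached length. Returns false if not set.
    pub fn remove(&mut self, index: usize) -> bool {
        if !self.contains(index) {
            return false;
        }
        self.data[index / 64] &= !(1u64 << (index % 64));
        self.set_len(self.len() - 1);
        true
    }

    /// Writes the cached length into the reserved top 8 bits of the last word.
    fn set_len(&mut self, len: usize) {
        debug_assert!(len <= Self::MAX_LEN);
        self.data[3] = (self.data[3] & !Self::LEN_MASK) | ((len as u64) << Self::LEN_SHIFT);
    }

    /// For comparison only: recomputes the length with `u64::count_ones`,
    /// masking out the reserved bits so the cache itself is not counted.
    pub fn len_by_popcount(&self) -> usize {
        let last = self.data[3] & !Self::LEN_MASK;
        (self.data[0].count_ones()
            + self.data[1].count_ones()
            + self.data[2].count_ones()
            + last.count_ones()) as usize
    }
}
```

The trade-off in this sketch is that every mutating method has to keep the cached value in sync, which is presumably why unused methods are cheaper to delete than to adapt. In exchange, `len` becomes a single shift instead of four popcounts.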