scylladb / cql-stress

Link time optimization (LTO) #31

Open cvybhu opened 2 years ago

cvybhu commented 2 years ago

LTO could be enabled for the release build. This would slightly increase performance at the cost of longer build times: https://doc.rust-lang.org/cargo/reference/profiles.html#lto

cql-stress tries to use every bit of performance available so I think it's a good idea to add it.

The only issue is that increased build times might be annoying when testing new changes. To deal with this, optimizations could be enabled for the dev profile, or a new profile could even be created, something like release-no-lto(?)
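For reference, a minimal sketch of what the separate-profile option could look like in Cargo.toml. The release-no-lto name is only the placeholder floated above, and putting LTO into the default release profile is an assumption here, not something that has been decided:

```toml
# Sketch only; not the project's actual configuration.
[profile.release]
lto = true            # "fat" LTO across all crates in the dependency graph

# Hypothetical profile for quick iteration on optimized builds.
[profile.release-no-lto]
inherits = "release"  # start from the release settings
lto = false           # Cargo's default: thin-local LTO within each crate only
```

With this in place, `cargo build --profile release-no-lto` would skip the cross-crate LTO step, and the artifacts would land in target/release-no-lto instead of target/release. Custom profiles need Cargo 1.57 or newer.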

cvybhu commented 2 years ago

Running a simple test showed a ~3.5% performance increase with LTO enabled.

The test:
    cargo run --release --bin cql-stress-scylla-bench -- -workload sequential -mode write -nodes 127.0.0.1

Without lto:
    93035 ops/s

With lto:
    96248 ops/s

cvybhu commented 2 years ago

The build time increase is pretty brutal. I measured how long it takes to rebuild the project after making a small change (changing a string in main.rs).

No lto:
    Finished release [optimized] target(s) in 7.93s

With lto:
    Finished release [optimized] target(s) in 29.47s

That's almost a 4x increase (roughly 3.7x), so we might need a separate profile after all.

piodul commented 2 years ago

Hmm, while having every possible runtime speedup is nice, development speed matters, too. 20 more seconds to compile is quite a high price for just a 3.5% speedup. Due to the nature of the project, both the dev and release profiles are useful: dev for unit tests, and release for measuring performance. I'd rather not introduce another profile, as new developers won't be aware of it, and it's harder to type --profile release-no-lto than --release. Adding optimizations to dev would slow down running unit tests with cargo test, since every build would take longer.

Every solution here seems to have some drawbacks, and I'm not sure which course of action would be the least painful. Let's postpone this task until we really need to squeeze out every possible cycle, and meanwhile concentrate on optimizing other things.

mykaul commented 7 months ago

Worth revisiting with a newer rustc and the different LTO variants (fat, thin), to see the impact on both compilation time and runtime: https://doc.rust-lang.org/cargo/reference/profiles.html#lto
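One way such a comparison could be set up, as a rough sketch: two throwaway profiles in Cargo.toml so that both variants can be built side by side without touching the default release profile. The profile names below are hypothetical and exist only for the experiment:

```toml
# Sketch only; hypothetical profiles for comparing LTO variants.
[profile.release-lto-thin]
inherits = "release"
lto = "thin"   # cheaper to build, usually captures much of the benefit of fat LTO

[profile.release-lto-fat]
inherits = "release"
lto = "fat"    # equivalent to lto = true; whole-program optimization
```

Each variant could then be built with, for example, `cargo build --profile release-lto-thin --bin cql-stress-scylla-bench`, timing both a clean build and a rebuild after a small change, and then run against the same workload as in the earlier measurement.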