FuelLabs / fuel-core

Rust full node implementation of the Fuel v2 protocol.

investigate improvements to linker performance #529

Open Voxelot opened 2 years ago

Voxelot commented 2 years ago

There are many techniques we could use to improve fuel-core compile times, such as dynamically linking large dependencies like RocksDB or using a different linker. Bevy has a lot of resources we might be able to draw on.

https://bevyengine.org/learn/book/getting-started/setup/

Voxelot commented 2 years ago

cc @Dentosal

Dentosal commented 2 years ago

Dynamically linking the librocksdb-sys crate has the most potential to reduce compile times. It contains bindgen-generated bindings, and a wrapper crate with the dylib crate type might help; see https://github.com/bevyengine/bevy/blob/main/crates/bevy_dylib/Cargo.toml

Dentosal commented 2 years ago

Some notes on attempted optimizations

Testing on:

Using zld instead of system ld

zld linker was used with generic sharing, i.e. config:

[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/opt/homebrew/bin/zld", "-Zshare-generics=y"]

Benchmarks

Average of three runs after one warm-up round. Results never had more than 5% difference.

| Linker | Clean build | Incremental (whitespace change) |
|--------|-------------|---------------------------------|
| ld     | 153s        | 5.44s                           |
| zld    | 162s        | 3.75s                           |

As we can see, ld is faster for clean builds, and zld is faster for incremental builds. The takeaway is that ld should be used on CI, and developers can enable zld locally.
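To make the comparison above easier to reproduce, here is a minimal sketch of the arithmetic: averaging timed runs after discarding warm-up rounds, and expressing one linker's time relative to another. The function names are hypothetical and not part of fuel-core. Applied to the table, zld comes out roughly 6% slower than ld on clean builds and roughly 31% faster on incremental builds.

```rust
/// Mean of the timed runs, discarding the first `warmup` samples.
/// Returns `None` if no samples remain after discarding.
pub fn mean_after_warmup(samples_secs: &[f64], warmup: usize) -> Option<f64> {
    let timed = samples_secs.get(warmup..)?;
    if timed.is_empty() {
        return None;
    }
    Some(timed.iter().sum::<f64>() / timed.len() as f64)
}

/// Signed change of `candidate_secs` relative to `baseline_secs`, in percent.
/// A negative result means the candidate is faster than the baseline.
pub fn percent_change(baseline_secs: f64, candidate_secs: f64) -> f64 {
    (candidate_secs - baseline_secs) / baseline_secs * 100.0
}
```

For example, `percent_change(153.0, 162.0)` gives the clean-build overhead of zld, and `percent_change(5.44, 3.75)` gives its incremental-build gain.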

Linker splitting

[profile.dev]
split-debuginfo = "unpacked"

This is supposed to improve build times on macOS by splitting debuginfo into a separate file. However, it had no effect on my machine.

Some ideas on future optimization possibilities

ethers-* and ring

Multiple ethers-* dependencies depend on reqwest, which in turn transitively depends on ring, which is quite slow to compile. It might be possible to replace this with a lighter HTTP library, or to force using only the system OpenSSL.

async_graphql

This crate is super slow to compile, maybe profiling it would help. Although I find that particular library to be awful enough that simply not using GraphQL at all would be a better option.

clap and other proc-macros

Proc-macro crates are rather slow to compile. Watt could help with that.

Different versions of same crate in dependency tree

All copies need to be compiled separately. There are not many of these, but eliminating those would still be nice.
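`cargo tree --duplicates` already reports these. For illustration only, the sketch below does the same check by scanning the text of a Cargo.lock for crate names that occur with more than one version. It assumes the usual `[[package]]` / `name = "..."` / `version = "..."` layout and is not a full TOML parser; the function names are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns crate names that appear with more than one version in the
/// given Cargo.lock text, mapped to their sorted set of versions.
pub fn duplicate_versions(lockfile: &str) -> BTreeMap<String, BTreeSet<String>> {
    let mut versions: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    let mut current_name: Option<String> = None;

    for line in lockfile.lines().map(str::trim) {
        if line == "[[package]]" {
            // A new package entry starts; forget the previous name.
            current_name = None;
        } else if let Some(value) = quoted_value(line, "name") {
            current_name = Some(value.to_string());
        } else if let Some(value) = quoted_value(line, "version") {
            if let Some(name) = &current_name {
                versions
                    .entry(name.clone())
                    .or_default()
                    .insert(value.to_string());
            }
        }
    }

    // Keep only crates that occur with at least two distinct versions.
    versions.retain(|_, v| v.len() > 1);
    versions
}

/// Extracts `value` from a line of the form `key = "value"`.
fn quoted_value<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let rest = line.strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    rest.strip_prefix('"')?.strip_suffix('"')
}
```

Passing the contents of the workspace's Cargo.lock to `duplicate_versions` yields a map from each duplicated crate to the versions that would each be compiled separately.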

netrome commented 1 month ago

I did some research on this over the weekend and it turns out we can dynamically link RocksDB by just setting two env variables as per https://github.com/rust-rocksdb/rust-rocksdb/issues/310#issuecomment-1537137868. This assumes you have the shared library installed in your environment.

Here's a comparison on my machine with and without these env variables:

export ROCKSDB_LIB_DIR=/usr/lib/
export SNAPPY_LIB_DIR=/usr/lib/
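| Linking | Clean build (debug) | Target directory size | Incremental build[^1] |
|---------|---------------------|-----------------------|-----------------------|
| Dynamic linking (with above env variables) | 3m 36s | 8.7 GB | 16s |
| Static linking (without above env variables) | 5m 50s | 11 GB | 17s |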

The difference is even more visible on my build server (with 16 cores / 32 threads):

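| Linking | Clean build (debug) | Target directory size | Incremental build[^1] |
|---------|---------------------|-----------------------|-----------------------|
| Dynamic linking (with above env variables) | 1m 20s | 8.8 GB | 10s |
| Static linking (without above env variables) | 1m 36s | 12 GB | 10s |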

[^1]: Added a comment to crates/fuel-core/src/state/rocks_db.rs

So while this doesn't make a difference for incremental builds, because we don't need to recompile rocksdb, it does speed up clean builds quite a bit which can make local development smoother. Though we likely still want to stick with static linking in CI to reduce potential bugs in version mismatch with the shared library.
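As a rough illustration of why setting a library directory can skip the expensive native build, here is a simplified sketch of how a `-sys` crate's build script might pick its link directives. This is not the actual librocksdb-sys build script: the function name and the `prefer_static` parameter are assumptions made for the example, while the `cargo:rustc-link-search` and `cargo:rustc-link-lib` directives are standard Cargo build-script output.

```rust
/// Cargo build-script directives for linking a prebuilt native library.
///
/// `lib_dir` stands in for the value of an env variable such as
/// ROCKSDB_LIB_DIR. Returns `None` when no directory is provided, meaning
/// the caller would fall back to compiling the library from source.
pub fn link_directives(
    lib: &str,
    lib_dir: Option<&str>,
    prefer_static: bool, // hypothetical switch, for illustration only
) -> Option<Vec<String>> {
    let dir = lib_dir?;
    let kind = if prefer_static { "static" } else { "dylib" };
    Some(vec![
        // Tell rustc where to look for the prebuilt library.
        format!("cargo:rustc-link-search=native={}", dir),
        // Link against it instead of building the C++ sources.
        format!("cargo:rustc-link-lib={}={}", kind, lib),
    ])
}
```

A build script would print each returned line to stdout. With `Some("/usr/lib/")` and `prefer_static` set to `false`, the library is linked as a dylib from that directory.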

xgreenx commented 1 month ago

Have you tried comparing it with our fork of rocksdb for local development? https://github.com/FuelLabs/rust-rocksdb

I remember when I tested it, I had a huge difference especially if you run ./ci_checks.rs

netrome commented 1 month ago

> Have you tried comparing it with our fork of rocksdb for local development? https://github.com/FuelLabs/rust-rocksdb
>
> I remember when I tested it, I had a huge difference especially if you run ./ci_checks.rs

I'm not observing any improvements for clean or incremental builds, which is expected since the caching shouldn't make a difference in those cases. I just discovered the ./ci_checks.rs doesn't work with my rust version so I need to mess around a bit with that to measure it, but it makes sense that the caching should provide a big benefit there since there's a lot of different cargo commands. I wonder if the caching or dynamic linking will be fastest in this scenario 🤔 I'll get back when I have some results.

netrome commented 1 month ago

> I'm not observing any improvements for clean or incremental builds, which is expected since the caching shouldn't make a difference in those cases. I just discovered the ./ci_checks.rs doesn't work with my rust version so I need to mess around a bit with that to measure it, but it makes sense that the caching should provide a big benefit there since there's a lot of different cargo commands. I wonder if the caching or dynamic linking will be fastest in this scenario 🤔 I'll get back when I have some results.

Experiencing some issues running the CI checks on my build server. Need to investigate deeper to find out if something is wrong with my setup, because they seem to run better on my laptop.