FrameworkComputer / inputmodule-rs

Framework Laptop 16 Input Module SW/FW
MIT License

Reduce build times #33

Closed · jdswensen closed this 1 year ago

jdswensen commented 1 year ago

Partially addresses https://github.com/FrameworkComputer/inputmodule-rs/issues/19.

I've got some good news and I've got some bad news. :smiling_face_with_tear:

While experimenting, I was running variations of this command to get the clean build timing:

cargo clean && cargo build --release --target x86_64-unknown-linux-gnu -p inputmodule-control

Let me know if that's the wrong command. Hopefully the changes help. :crossed_fingers:

Good News

I was able to cut the build time of inputmodule-control in half on my machine (60 seconds -> 27 seconds). Using the --timings flag during the build, I traced a big chunk of the time to the image crate. I disabled all of its default features and enabled only bmp, png, and gif, since those are the only formats I see used around the repo. clap is the other big contributor, but its default features seem necessary for what the SW tool is doing.
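
For reference, here's a sketch of what that change looks like in the control tool's Cargo.toml (the version number is illustrative, not necessarily what the repo pins):

[dependencies]
# Pull in only the image decoders the project actually uses,
# instead of the much larger default feature set.
image = { version = "0.24", default-features = false, features = ["bmp", "png", "gif"] }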

You can likely reduce your build times further if you set up sccache locally. You can install it with cargo install sccache. Then, when you build, you just need to set the RUSTC_WRAPPER environment variable. For example:

RUSTC_WRAPPER="${HOME}/.cargo/bin/sccache" cargo build --release --target x86_64-unknown-linux-gnu -p inputmodule-control

I briefly looked at setting up sccache in GitHub Actions, but didn't find any great examples (and saw mixed opinions about using it there).

Bad News

Further build time reduction is possible, but it requires a larger effort. For example, you can get a much lower build time with:

[profile.release]
debug = 2                    # keep full debug info
incremental = true           # reuse artifacts across rebuilds
lto = 'thin'                 # cheaper at link time than full LTO
opt-level = 0                # skip optimizations entirely
split-debuginfo = "packed"   # pack debug info into a separate file

With that modification combined with a local sccache, I got the build down to about 4 seconds on a 3950X (6 if you don't use sccache).

The reason I think further reduction will take a larger effort is that the crate will likely need to be reworked into either two separate repos or two separate cargo workspaces. I can file this as a new issue if desired.

There are a couple of factors:

  1. The build definitions are tangled. When you do a cargo build for either an MCU or the host app, both use the same definitions (and build.rs script). That means a lot of unnecessary optimizations happen in the (comparatively) much larger host app. Embedded Rust does a lot of weird, non-standard (but necessary) things that end up playing havoc with "regular" applications.
  2. I'm not sure per-package-target is meant to be used the way this workspace sets it up. After reading through the docs, per-package-target looks intended for distributing a single package across different architectures: a way to control the target environment per package for a single app, rather than a way to build packages that have different functionalities. A sketch of its intended use follows this list.
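
For reference, here is a rough sketch of what per-package-target looks like in a firmware package's Cargo.toml (nightly-only; the package name is hypothetical, and thumbv6m-none-eabi is the RP2040's target triple):

cargo-features = ["per-package-target"]

[package]
name = "inputmodule-fw"   # hypothetical package name
version = "0.1.0"
edition = "2021"
# Always build this package for the RP2040, regardless of the host target.
forced-target = "thumbv6m-none-eabi"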

This issue will likely pop up again if you want to start adding feature flags or other more advanced workspace management.

Personally, I like the two-repo approach (CLI utility + firmware). I've used it in the past and found it generally easier to manage projects that way. But I also understand wanting to keep everything in a monorepo. Either way, I do think two workspaces are needed.

Bonus News

As a side effect, the change should reduce your GHA build times by half. At least it did in my fork. :man_shrugging:

jdswensen commented 1 year ago

I also tested making a small change in the main program; that rebuild takes around 17 seconds for me. Incremental builds and thin LTO would help there: changing those values brings the same test down to 8 seconds.
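
If it helps, here's a minimal sketch of applying just those two settings in the workspace Cargo.toml, leaving everything else at its defaults:

[profile.release]
incremental = true   # reuse compilation artifacts across rebuilds
lto = "thin"         # much cheaper at link time than full LTO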

JohnAZoidberg commented 1 year ago

Thanks a lot for the investigation! Sorry, I've been busy with all the newly announced projects :D

The change here looks good, let's do that for now. Please resolve the conflicts and I can merge. I'm not able to push to your branch.

I think for such a small project, using sccache and messing with linker/compiler options is a bit of overkill. Disabling optimizations also won't work, because then the firmware runs too slowly on the RP2040 and can't keep up with the required USB timings (ask me how I know :smile:). It seems we'll have to split the workspace (but still share the data types somehow, at least). I'd like to go with a single repo first and perhaps separate later.

jdswensen commented 1 year ago

Sorry, I've been busy with all the newly announced projects

No worries, totally get it! Rebased and should be good to go now.

Disabling optimizations also won't work, because then the firmware runs too slowly on the RP2040 and can't keep up with the required USB timings (ask me how I know 😄).

Totally agree! I just wanted to document the significance of making the workspace changes. 😄

JohnAZoidberg commented 1 year ago

Perfect, thanks a lot!

jscatena88 commented 1 year ago

Hey, I just happened to be poking around in this repo and noticed this issue/MR. In my group we use cargo-make to solve this problem. I'd love to point to an example, but our code is not open source.

Essentially, you can define a Makefile.toml in each package that overrides the default build task to build for a specific target or with specific optimizations. Then, if you run cargo make in the root of the workspace, it goes into each package and builds according to that package's makefile. You can also make a workspace-level Makefile.toml that serves as the default, with each package overriding only what it needs to. If you are interested, I could put together a branch with a rough outline of what it could look like.
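
For a rough idea, it could look something like this (a sketch; the task shape follows cargo-make's predefined build task, and the target triple assumes the RP2040 firmware):

# Makefile.toml inside the firmware package
[tasks.build]
clear = true    # drop the inherited attributes of the predefined build task
command = "cargo"
args = ["build", "--release", "--target", "thumbv6m-none-eabi"]

Running cargo make build from the workspace root would then build each member with its own overrides.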

JohnAZoidberg commented 1 year ago

Hi @jscatena88! I have seen cargo-make but haven't had the time to explore it yet. Sure, you're welcome to show me how to improve our workspace build. :) If not, I'll try it eventually. Thanks!