Open keith-zephyr opened 2 years ago
cc @marc-hb
> This workflow will be run on every PR.
I do not expect this to be something that breaks often. Bi-weekly build should be fine.
We have many non-locked Python dependencies that are used in one way or another during the build process; they should be considered too.
Here's a list of 20+ old reproducibility fixes:
This should show what the most common problems are.
In the same place there's an (obsolete) test script. The approach was crude but very effective:
> I do not expect this to be something that breaks often.
Agreed. Reproducibility testing and fixing is rare, but reproducibility regressions are very rare too.
> Bi-weekly build should be fine.
On the other hand, IF it's cheap and quick to run then why not run it every PR?
Because of the amount of generated code, I'm in favor of checking on every PR. Maybe the GitHub workflow can be set up to run on any change to the ./scripts directory, and also set up as a weekly run to catch problems with the actual source code.
These 2 additional lines are IMHO a big step forward, please help review:
GitHub Actions for the Zephyr+SOF project have been routinely and successfully comparing binaries built on Linux versus Windows in every PR for a few months now:
To achieve this I overrode the default config change in #51954 in an SOF-specific way: https://github.com/thesofproject/sof/commit/945adb8d1660ed4
Building across two different operating systems provides a lot of differences "for free" that can be very difficult to achieve on the same operating system (see old #14593 attempt). Kudos to @aborisovich for implementing the Windows build in GitHub Actions.
This does not catch everything (e.g. `__DATE__`) but it indirectly provides reproducibility coverage for a lot of the Zephyr project.
Note a build is no more "reproducible" than a project is "bug-free"; fixing reproducibility bugs is a continuous activity exactly like fixing other bugs. Typically, building some code is reproducible in some Kconfiguration but fails when that Kconfiguration is changed - exactly like other bugs. Most recent example with CONFIG_ASSERT:
Switching to an old toolchain can also be very problematic:
Introduction
Zephyr builds should be reproducible. A checkout of Zephyr from the same commit, built with the same toolchain, should generate an identical image binary.
Problem description
This has been proposed before (https://github.com/zephyrproject-rtos/zephyr/pull/11523 and https://github.com/zephyrproject-rtos/zephyr/pull/14593). However, no tests in the Zephyr tree currently verify that builds are reproducible.
Furthermore, reproducible builds were broken for an unknown amount of time before being fixed by https://github.com/zephyrproject-rtos/zephyr/pull/48195.
Proposed change
Add a new GitHub workflow that verifies builds are reproducible. This workflow will be run on every PR.
The workflow can follow the blueprint of the Footprint Delta workflow. The new workflow would build a set of platforms (TBD) twice, back to back, and verify that the resulting binaries are identical.
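The comparison step could look something like the following. This is only a minimal sketch, not part of any existing Zephyr script: the function names, the two build directories, and the list of artifact names are all hypothetical, and it assumes the two builds were already produced in separate directories (for example with `west build -d`).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_artifacts(build_a: Path, build_b: Path, names: list[str]) -> list[str]:
    """Return the artifact names that are missing or differ between two build dirs.

    `names` are paths relative to each build directory (hypothetical example:
    "zephyr/zephyr.elf"). An empty result means the builds matched.
    """
    mismatched = []
    for name in names:
        a, b = build_a / name, build_b / name
        # A missing artifact counts as a mismatch so it cannot pass silently.
        if not (a.is_file() and b.is_file()) or sha256_of(a) != sha256_of(b):
            mismatched.append(name)
    return mismatched
```

A CI job would fail the check when `differing_artifacts(...)` returns a non-empty list. Hashing only says that two files differ, not where, so a follow-up diff of the offending artifacts would still be needed to find the cause.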
Note that the build command
west build -b native_posix tests/drivers/build_all/sensor
has been known to catch problems with devicetree generation that result in non-reproducible builds.
Dependencies
The new GitHub workflow will block new PRs if the reproducible build test fails.
Concerns and Unresolved Questions
Running this check against every PR will incur additional computing time and resources.
Alternatives
Run the reproducible build check less frequently, such as nightly. However, this will require a significant bisect effort to identify the culprit PR when any failures are detected. The incremental cost of some additional builds on each PR seems worth the trouble.