Open Googulator opened 11 months ago
I need to think a bit more about this, and what place this has in live-bootstrap. Some initial thoughts:
This certainly wouldn't be a requirement for live-bootstrap per se, especially since bootstrapping has many other use cases besides just preventing Trusting-Trust-style attacks. In fact, one of the neat things about performing a "more trusted" bootstrap like this is that you can then propagate that higher level of trust to other, potentially more conventional systems - if you create the media for another bootstrap on a system that was bootstrapped with this process, that media itself is inherently free of a propagating backdoor (assuming clean hardware), even if it's not of the special design.
Builder-hex0 would still be able to run with my modifications on ordinary hardware; it would just print some additional logs. Those logs become very useful for verifying the process, especially if the custom hardware is used (but even without it, if you already trust your image writing setup). On ordinary hardware, it will just immediately detect that stage 2 is available, and skip the "waiting for stage 2" part.
While verifying builder-hex0 on the flash is a nice goal, I'm just concerned that it could be hard to verify the randomness of the data and make sure that
Given the existing hardware requirements for this, I think the better option would be to have the verification done externally, potentially with TTL circuits. Attack 2 would also give a reason not to fill the boot sector with random data.
The idea would be to actually use random data of your own choice, e.g. data prepared in advance from a known true random source such as a Geiger counter.
If you run, e.g., 5 bootstraps on diverse hardware (one with true random data from a quantum source, one with all zeros, one with digits of pi, one with the opening lines from a Shakespeare play, and one with your SSH public key), and they all reproduce the random or chosen data from their session on screen/log and then bootstrap to identical binary output, then I believe we can exclude the possibility of functionally identical backdoors hiding in all of the seeds.
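As a rough illustration of that cross-check, here is a minimal Python sketch. The record layout (`filler`, `echoed_filler`, `output`) and the function name `runs_agree` are hypothetical and chosen only for this example; they are not part of builder-hex0 or live-bootstrap. It assumes each run's filler, the filler it echoed to the log, and its final binary output have already been captured by some trusted means.

```python
import hashlib

def runs_agree(runs):
    """Return True if every run echoed its own filler and all outputs match.

    `runs` maps a run name to a dict with three byte strings (hypothetical
    layout): 'filler' (what was written to the seed), 'echoed_filler'
    (what the bootstrap printed back), and 'output' (the final binary).
    """
    digests = set()
    for name, run in runs.items():
        # Each session must reproduce exactly the filler chosen for it.
        if run["echoed_filler"] != run["filler"]:
            return False
        digests.add(hashlib.sha256(run["output"]).hexdigest())
    # All runs must converge on a single, identical output.
    return len(digests) == 1

example_runs = {
    "zeros": {"filler": b"\x00" * 8, "echoed_filler": b"\x00" * 8, "output": b"BIN"},
    "pi": {"filler": b"31415926", "echoed_filler": b"31415926", "output": b"BIN"},
}
print(runs_agree(example_runs))
```

A mismatch in either the echoed filler or the output digest would mean at least one of the runs deserves a closer look.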
A key element to the trustworthiness of the bootstrap process is the small size of the initial binary seed - it's difficult if not downright impossible to hide self-propagating malware capable of compromising the process in 512 bytes, while still maintaining the apparent normal functionality of builder-hex0 stage 1. This can be further enhanced if stage 1 is modified to print out the hex0 source code of stage 2 as it's being compiled, and then stage 2 is likewise modified to print out any readable source code files it loads into srcfs. (Tarballs need special consideration - these would be printed out by untar instead of builder-hex0.)
This way, every byte of code is printed out before it's even compiled, let alone executed - with the exception of the initial binary seed. (Anything else would enable the executed code to modify its own source, and potentially hide malicious functionality from an auditor.) The printout can be securely recorded, e.g. via analog means, and then later reviewed and audited.
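To make the "print before compile" idea concrete, here is a minimal Python sketch of a hex0-style translator that echoes each source line to a log before any byte from that line is emitted. It is only a model of the behavior described above, not the actual builder-hex0 stage 1 code; the function name and the use of a Python text stream as the log are assumptions for illustration. It follows the usual hex0 conventions: pairs of hex digits become bytes, and `#` or `;` starts a comment that runs to the end of the line.

```python
import io

HEX_DIGITS = "0123456789abcdefABCDEF"

def compile_hex0_with_echo(source, log):
    """Translate hex0 text to bytes, echoing every line before using it."""
    out = bytearray()
    high = None  # pending high nibble, carried across whitespace and lines
    for line in source.splitlines():
        # Echo first: the auditor sees the line before it can affect output.
        log.write(line + "\n")
        for ch in line:
            if ch in "#;":
                break  # rest of the line is a comment
            if ch in HEX_DIGITS:
                nibble = int(ch, 16)
                if high is None:
                    high = nibble
                else:
                    out.append((high << 4) | nibble)
                    high = None
            # any other character (whitespace etc.) is ignored
    return bytes(out)

log = io.StringIO()
source = "48 65 # He\n6C 6C 6F ; llo\n"
binary = compile_hex0_with_echo(source, log)
print(binary)
```

In the real setting the log would go to the screen or serial output, where it can be recorded outside the machine being bootstrapped.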
However, with normal storage devices, it's really difficult to verify whether the amount of data initially loaded is truly just 512 bytes. A compromised host system, for example, might create boot media with a larger preloader that emulates the visible behavior of the original 512-byte seed, but secretly performs nefarious acts in the background. And as we know from Ken Thompson's talk & paper, it's certainly possible to also subvert analysis tools on the host system, as well as in the compromised bootstrap environment, to not show this preloader or its activity when analysis is attempted.
A way around this is to build the storage device in a special way: with a mechanical switch to limit accessible size to 512 bytes. This should be the direct mechanical and electronic action of the switch, not dependent on firmware running on a microcontroller: in the canonical example, the switch disconnects address lines A9 and higher from a parallel, byte-addressable storage device, so that byte address 512 wraps around to 0. Another way to achieve a similar effect would be hot-swapping different sized SPI flash devices.
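The wrap-around effect of the switch can be modeled in a few lines of Python. This is a sketch under the assumption that the disconnected address lines (A9 and up) read as 0 at the storage device, so only A0 to A8 reach it; the function and constant names are made up for this example.

```python
SEED_SIZE = 512
ADDRESS_MASK = SEED_SIZE - 1  # 0x1FF: only address lines A0..A8 stay connected

def read_byte(storage, address, switch_limited):
    """Model a read from a parallel, byte-addressable storage device."""
    if switch_limited:
        # A9 and higher are cut off (assumed to read as 0), so the
        # address wraps around modulo 512.
        address &= ADDRESS_MASK
    return storage[address]

# First 512 bytes are 0xAA (the seed), everything beyond is 0xBB.
storage = bytes([0xAA]) * 512 + bytes([0xBB]) * 512

print(hex(read_byte(storage, 512, switch_limited=True)))   # wraps to address 0
print(hex(read_byte(storage, 512, switch_limited=False)))  # reaches past the seed
```

With the switch engaged, nothing beyond the first 512 bytes is reachable, regardless of what address the machine requests.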
To further secure this setup, I'd suggest the following enhancements to stage 1:
When bootstrapping on bare metal from such media:
With such a setup, to achieve a compromise, a threat actor would have to encode a backdoor in 512 bytes in such a way as to leave room for all the information needed to print out the true, uncompromised stage 1, including the random data used as filler. This is infeasible even if some form of compression is used.
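The compression point can be checked informally: truly random filler does not shrink under a general-purpose compressor, so an attacker gains no spare room by compressing it. The sketch below uses `zlib` from the Python standard library; the 200-byte code size is a hypothetical figure chosen only to give the filler a concrete length, not the real size of builder-hex0 stage 1.

```python
import os
import zlib

SEED_SIZE = 512
HYPOTHETICAL_CODE_SIZE = 200  # assumption for illustration only

# Random filler occupying the rest of the 512-byte seed.
filler = os.urandom(SEED_SIZE - HYPOTHETICAL_CODE_SIZE)

# Try to compress it at the highest zlib compression level.
compressed = zlib.compress(filler, 9)

# For random input the "compressed" form is not smaller than the input.
print(len(filler), len(compressed))
```

Filler such as all zeros would of course compress well, which is one reason to include at least one run with data from a true random source.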