RFE: Make it possible to 'predict' PCR[7] upon shim update

rhboot / shim

UEFI shim loader

Other

819 stars 289 forks source link

RFE: Make it possible to 'predict' PCR[7] upon shim update #555

Closed vittyvk closed 11 months ago

vittyvk commented 1 year ago

Shim extends PCR7 with certain builtin objects, e.g. vendor_db, MokList, vendor_cert, shim_cert, SBAT and these objects may change if shim gets updated. In case there are some objects (e.g. credentials) in the system, TPM sealed against PCR7 (or a set of PCRs including PCR7), these objects require re-sealing against the expected post-reboot PCR state. It is, however, quite hard to extract the required information from shim binary.

The suggestion is to produce the list of hashes from built-in shim objects during shim package build and publish them in an easy to consume form, e.g. standalone JSON or YAML file. Distributions may package this new file significantly simplifying PCR prediction procedure upon shim package update.

Note, this is orthogonal to the system objects which get measured in PCR7, e.g. system DB, DBX,... as these objects are not known during shim package build. Re-sealing procedure may treat this objects as immutable and get the required hashes from TPM event log.

berrange commented 1 year ago

The suggestion is to produce the list of hashes from built-in shim objects during shim package build and publish them in an easy to consume form, e.g. standalone JSON or YAML file. Distributions may package this new file significantly simplifying PCR prediction procedure upon shim package update.

I think it is probably desirable to NOT use a standalone file, as it introduces the problem of having to find the metadata file that matches the binary you are interested in. I think it would be preferable to be able to extract the info from the shim.efi binary directly, avoiding the need to locate a corresponding metadata file. This could potentially just be a case of adding a specially named PE file section to shim, which contains the hash info in some particular format - it could still be JSON/YAML, just embedded in the PE binary.

vathpela commented 1 year ago

Just to be sure I understand, the scenario is that the system is installed, and you've got a new shim binary being installed as part of an update, as well as the tpm log from the current boot, and you want to examine the new binary (and/or just a corresponding metadata file) to tell which log entries it is replacing?

berrange commented 1 year ago

I'd say there are two main scenarios

We have a pristine disk image template that is going to be deployed as a VM instance. We want to be able to predict the PCR values that this VM will shortly boot with, so that the image can be encrypted prior to boot and the LUKS slot sealed against the TPM of the forthcoming VM. IOW, we have no boot tpm log, we're doing prediction entirely a-priori.
We have an existing running VM and have just deployed some software updates, which might imply the need to re-seal the host volume. So we need to predict new expected PCR values before we reboot into the new software. There is a TPM log available, but it (probably) isn't needed, since we'll already have code needed to predict everything from scratch, in order to satisfy the first scenario, except for any MOK entries the user added post-deployment.

vathpela commented 1 year ago

Yeah, okay I think that makes plenty of sense, and building it in as a PE section makes sense as well (though I suspect we'll want to emit a separate artifact for convenience as well).

That said, I do have a couple of (minor) concerns - the biggest concern is that we probably do need some kind of entry for the system variables, because when you don't have the log, you still need to know the order the extends happen in even if we don't know the value. My other (even more minor) thought is that my go-to for simplicity here is probably CSV rather than JSON or whatnot, though since we're merely embedding it it's less big a deal than if we have to consume it.

So, to me this basically looks like we should add a ".tpm" PE section that has something like:

artifact,pcr,sha1,sha256
dbx,7,-,-
vendor_dbx,7,04235840e0c745cc1e5754d7243cfcc5378b2760,aa6f0bfbc13ce44eb1df00e37bf28a6db8def368ac50b57fab3a7f9c0d3fc4e5
MokListX,7,f0c6360f5247b37696eef92701a55bda234332bb,40f641f1ddf5a22934ea7e12f5a00fe23192935030db646c395a5c8334b060d8
MokListX,14,f0c6360f5247b37696eef92701a55bda234332bb,40f641f1ddf5a22934ea7e12f5a00fe23192935030db646c395a5c8334b060d8
db,7,-,-
vendor_db,7,039de9e110148e9de642a80bea3fed921825c95f,beb7a19f6570207602b3cd0cccabc2e5541dd084f68af63c591b1e742ec4dc96
MokList,7,a59f29ae88d3044759117e7f264b039c3168bb2f,bf7f67aaff5bce377b7a750ee486e3a7eed7d9abf9d7434092625e6a2f8aaf12
MokList,14,a59f29ae88d3044759117e7f264b039c3168bb2f,bf7f67aaff5bce377b7a750ee486e3a7eed7d9abf9d7434092625e6a2f8aaf12

... and so forth, in the order the extends would happen and containing the hashes for those we're aware of. Does this seem more or less right to y'all?

berrange commented 1 year ago

the biggest concern is that we probably do need some kind of entry for the system variables, because when you don't have the log, you still need to know the order the extends happen in even if we don't know the value.

I'm not so fussed about the ordering of extends, as I feel like that isn't likely to change, and in any case it is just one small part of the much bigger global ordering problem for the whole boot sequence.

IOW, I've been pretty much accepting that ordering of PCR extends is something that's going to end up as code, rather than data driven. Or at least if it were data driven, that ordering might be expressed in a data file that describes the inputs for full OS boot sequence.

As an example, I've got a PoC python program that demonstrates one possible way to do PCR prediction. It uses two data files that describe the inputs for the guest image and the host(hypervisor) firmware

https://gitlab.com/berrange/tpm-prevision/-/blob/master/examples/image.yaml https://gitlab.com/berrange/tpm-prevision/-/blob/master/examples/ovmf-manifest.yml

While the ordering of specific extends is still defined by the code, the shim part of it is

https://gitlab.com/berrange/tpm-prevision/-/blob/master/tpm/prevision/precog.py#L251

could be interesting to think about how to make the ordering of PCRs be metadata driven across the whole OS boot sequence, but hadn't got as far as considering that.

My other (even more minor) thought is that my go-to for simplicity here is probably CSV rather than JSON or whatnot, though since we're merely embedding it it's less big a deal than if we have to consume it.

A CSV with just the hashes would work from a functional POV.

I wonder though if it is worth having a format with both the hashes and corresponding actual content. Debugging mis-matched PCR extends is incredibly difficult at the best of times, so if we exposed the source data as well the hashes, it might make life less painful. That might push direct something a bit more expressive than a CSV ?

vittyvk commented 1 year ago

Assuming we can do both PE section and a standalone artifact, these can be in different formats. I.e. PE section uses CSV for simplicity and the standalone file is a full fledged YAML with not only the resulting hashes but the actual content.

chrisccoulson commented 1 year ago

Shim only measures SbatLevel and the CAs used for authenticating things it loads to PCR7, and it's already largely possible to predict these values from a particular shim binary using the information available in the .vendor_cert and .sbatlevel sections. That's what we're doing on Ubuntu Core (although we don't yet parse .sbatlevel) - using those sections and the contents of db, it is possible to compute the values that shim will measure when it starts and when it authenticates other signed binaries. I'm not sure how the digests for those could be easily encoded in shim - eg, the SbatLevel measurement is a function of the current value of SbatLevel, the payload in shim (which can be read from the .sbatlevel section) and the value of SbatPolicy. The authentication measurements depend on what shim is loading.

There are still some gaps where it can be tricky - eg, sometimes fixes land that make subtle changes to measurements and it's not possible to identify these externally today. Eg, https://github.com/rhboot/shim/commit/e3325f8100f5a14e0684ff80290e53975de1a5d9 made a subtle change to the format of authentication measurements in some circumstances.

berrange commented 1 year ago

Shim only measures SbatLevel and the CAs used for authenticating things it loads to PCR7, and it's already largely possible to predict these values from a particular shim binary using the information available in the .vendor_cert and .sbatlevel sections. That's what we're doing on Ubuntu Core (although we don't yet parse .sbatlevel) - using those sections and the contents of db, it is possible to compute the values that shim will measure when it starts and when it authenticates other signed binaries.

Ah, I missed that there were already separate PE sections for this data, thanks for pointing that out.

I'm not sure how the digests for those could be easily encoded in shim - eg, the SbatLevel measurement is a function of the current value of SbatLevel, the payload in shim (which can be read from the .sbatlevel section) and the value of SbatPolicy. The authentication measurements depend on what shim is loading.

Yep, I think the raw data is more important than the hashes anyway, since the hashes are easily calculated once you have the raw data.

There are still some gaps where it can be tricky - eg, sometimes fixes land that make subtle changes to measurements and it's not possible to identify these externally today. Eg, https://github.com/rhboot/shim/commit/e3325f8100f5a14e0684ff80290e53975de1a5d9 made a subtle change to the format of authentication measurements in some circumstances.

I think those kind of cases could be handled by exposing a list of string "feature flags". eg that example could have a "authority-with-sig-owner" flag. If there were a PE section ".flags" that could expose a list of named feature flags. Any time the code does something that alters the measurements, add another flag. Apps that need to predict shim PCRs can look for the flags and alter their logic accordingly

vittyvk commented 11 months ago

Shim only measures SbatLevel and the CAs used for authenticating things it loads to PCR7, and it's already largely possible to predict these values from a particular shim binary using the information available in the .vendor_cert and .sbatlevel sections. That's what we're doing on Ubuntu Core (although we don't yet parse .sbatlevel) - using those sections and the contents of db, it is possible to compute the values that shim will measure when it starts and when it authenticates other signed binaries.

Ah, I missed that there were already separate PE sections for this data, thanks for pointing that out.

Indeed, I missed

commit 0eb07e11b20680200d3ce9c5bc59299121a75388
Author: Chris Coulson <chris.coulson@canonical.com>
Date:   Tue May 31 22:21:26 2022 +0100

    Make SBAT variable payload introspectable

as it is only in shim-15.7 but both Fedora and RHEL I've checked ship 15.6. I'll close this for now as it seems we already have the information available.