confidential-containers / documentation

Documentation for the confidential containers project
Apache License 2.0

[RFC] Proposal for Measured Boot Image/Rootfs #40

Closed arronwy closed 2 years ago

arronwy commented 2 years ago

Motivation

The CoCo stack utilizes hardware-based TEE technology for runtime protection, and the HW TEE provides a boot measurement feature to ensure the integrity of the runtime stack. The measured data is used by the attestation-agent as evidence for the remote attestation service to verify the integrity of the CoCo runtime. Some key components, such as the boot firmware, guest kernel, and kernel cmdline, are already included in the default measurement scope, but the guest boot image is a potential gap: it is large (>100M), so we need to consider the boot performance and memory footprint of measuring it. Thanks @jiazhang0 @jiangliu for the many valuable suggestions on this proposal.

Design

We could follow the same measurement process as the other components, as shown below: (diagram)

But as described above, the guest boot image is large: loading and measuring it is time-consuming and takes a lot of memory. We therefore propose utilizing kernel integrity features to protect the integrity of the boot image, as shown below: (diagram)

The kernel supports integrity protection at the block level or the filesystem level, and we can let the user choose. For a read-only block device we can utilize the dm-verity feature to provide transparent integrity checking. The user can add the following to the kernel cmdline to enable integrity protection for the rootfs, and this configuration will be measured as part of the kernel cmdline:

cc_rootfs_verity.scheme=dm-verity cc_rootfs_verity.hash=894be17bc7f3bd73a386442efdd0080c28cedb0c5d6f01947ddbf01e080c1b43

The kernel only provides the integrity features; it still depends on user-space tools to do the setup and initialization work. We will package the related tools and scripts into an initramfs and embed it into the kernel image with the CONFIG_INITRAMFS_SOURCE config option, so these user-space tools and scripts are measured together with the kernel image.
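As an illustrative sketch, the guest kernel configuration for this step might contain the options below; the initramfs path is an assumption for illustration, not a prescribed location:

```
# Kernel .config fragment (illustrative; the initramfs path is an assumption)
CONFIG_BLK_DEV_DM=y
CONFIG_DM_VERITY=y
CONFIG_INITRAMFS_SOURCE="/build/guest-initramfs.cpio.gz"
```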

For the initial CoCo implementation, we will only implement the dm-verity scheme to provide block-based read-only protection for the boot image. In the future, other approaches such as filesystem-based protection may be introduced.

Introduction of dm-verity

Device-mapper is infrastructure in the Linux kernel that provides a generic way to create virtual layers of block devices. The device-mapper verity target provides read-only transparent integrity checking of block devices using the kernel crypto API. dm-verity uses a tree of sha256 hashes to verify blocks as they are read from a block device. This ensures files have not changed between reboots or during runtime, which is useful for detecting unauthorized changes to the rootfs. Verity devices are regular block devices that can be accessed under /dev/mapper.

The device-mapper verity target has two underlying devices: a data device, which stores the actual data, and a hash device, which stores the hash tree used to verify the integrity of the data device.
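For readers unfamiliar with the verity target, the mapping that `veritysetup` constructs corresponds to a device-mapper table of roughly the shape below. The field values reuse the example numbers from the `veritysetup` output later in this proposal; consult the kernel's verity documentation for the authoritative field list:

```
# dmsetup table vroot (illustrative)
# <version> <data_dev> <hash_dev> <data_blk_size> <hash_blk_size>
#   <num_data_blocks> <hash_start_block> <algorithm> <root_digest> <salt>
1 /dev/loop2 /dev/loop3 4096 4096 512 1 sha256 389d79f4b06a427dff6bba2a4376a4200d3b02fa3e92a6a67b69bc57c9851789 189dd819573ca746d5145677e3b04fb0ce76a5ccbb13b95db55c6967da9b59ab
```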

(diagram from https://source.android.com/security/verifiedboot/dm-verity)

Preparing hash device

veritysetup format --salt=189dd819573ca746d5145677e3b04fb0ce76a5ccbb13b95db55c6967da9b59ab /dev/loop2 /dev/loop3

VERITY header information for hash.img
UUID:            0f23169d-c31c-4c5b-8127-751ad324321e
Hash type:       1
Data blocks:     512
Data block size: 4096
Hash block size: 4096
Hash algorithm:  sha256
Salt:            189dd819573ca746d5145677e3b04fb0ce76a5ccbb13b95db55c6967da9b59ab
Root hash:       389d79f4b06a427dff6bba2a4376a4200d3b02fa3e92a6a67b69bc57c9851789

Activation of verity data device

veritysetup open /dev/loop2 vroot /dev/loop3 389d79f4b06a427dff6bba2a4376a4200d3b02fa3e92a6a67b69bc57c9851789

* Integrate hash device with data device
`veritysetup` generates the `dm-verity` hash tree during the format operation; it can write the hash table to a file or directly to a block device.

Kata will pass the rootfs as one block device to the guest kernel like below:

root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4

To stay compatible with the current Kata API, we can add the hash tree to the boot image too. We have two options:
1. Add the hash tree to the same block device as the rootfs. We need to ensure the rootfs has enough space to hold the hash tree and the verity metadata block (32k) after the last rootfs block

veritysetup --hash-offset 209747968 --data-blocks 51200 format boot.image boot.image
veritysetup --hash-offset 209747968 --data-blocks 51200 verify /dev/pmem0p1 /dev/pmem0p1 c5daed18358f92153f3b9bc38f12f1e8bb66d3ba3cd2f9f46cab2b1b36a4bf7a

2. Add the hash tree as a separate block device

veritysetup format boot.image hash.image

add the hash.image as a separate block device alongside boot.image

veritysetup verify /dev/pmem0p1 /dev/pmem0p2 c5daed18358f92153f3b9bc38f12f1e8bb66d3ba3cd2f9f46cab2b1b36a4bf7a


I have not tried method 2 yet; I will share more info once I do.
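For option 1, the `--hash-offset` value used above is derivable from the rootfs geometry. A minimal sketch of the arithmetic, assuming 4096-byte data blocks and a 32 KiB reservation after the last data block (matching the numbers in the option 1 commands):

```shell
#!/bin/sh
# Derive the --hash-offset for the combined data+hash layout (option 1).
# Assumptions: 4096-byte data blocks, 32 KiB gap for the verity metadata.
data_blocks=51200
block_size=4096
metadata_gap=32768    # the "verity metadata block (32k)" noted above

hash_offset=$((data_blocks * block_size + metadata_gap))
echo "$hash_offset"   # 209747968, matching the veritysetup commands above
```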

## Deploy solutions
* Setup stage
  * Generate the boot image
  * Use the `veritysetup` tool to create the data blocks and hash blocks and integrate them into the boot image
  * Generate the `veritysetup`-required kernel cmdline parameters (cc_rootfs_verity.scheme, cc_rootfs_verity.hash, ...)
  * Generate an initramfs with a statically built `veritysetup` and an `init.sh` that sets up the `dm-verity` target device

* Runtime stage
  * The kernel boots into the initramfs
  * The `init.sh` in the initramfs parses the kernel cmdline and sets up the `dm-verity` target device
  * The `dm-verity` target device is mounted as rootfs and `switch_root` hands over to the init daemon in the rootfs
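The runtime stage above can be sketched as a minimal `init.sh`. This is a hypothetical illustration: the parameter names come from this proposal, but the device names (`/dev/vda1` for data, `/dev/vda2` for the hash tree) and the mount point are assumptions:

```shell
#!/bin/sh
# Hypothetical initramfs init.sh sketch: parse the measured kernel cmdline,
# activate the dm-verity target, then switch_root to the real init.

cmdline="$(cat /proc/cmdline 2>/dev/null)"

get_param() {
    # Print the value of "key=value" from $cmdline; fail if absent.
    for word in $cmdline; do
        case "$word" in
            "$1"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

setup_rootfs() {
    scheme="$(get_param cc_rootfs_verity.scheme)" || return 1
    roothash="$(get_param cc_rootfs_verity.hash)" || return 1
    [ "$scheme" = "dm-verity" ] || return 1
    # Data and hash devices are assumptions; adjust to the image layout.
    veritysetup open /dev/vda1 vroot /dev/vda2 "$roothash"
    mount -o ro /dev/mapper/vroot /mnt
    exec switch_root /mnt /sbin/init
}

# Only activate when actually running as PID 1 in the initramfs.
if [ "$$" = "1" ]; then
    setup_rootfs
fi
```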

This is the demo link:
https://asciinema.org/a/cSDgf2jeaRClmykjXNlPyVvPK

## Devel Plan for CoCo
* [x] Guest kernel support for dm-verity
* [x] Statically build `veritysetup` and write `init.sh` to generate the initramfs image
* [x] Integrate the initramfs with the guest kernel
* [x] Generate the rootfs verification data into a hash device, which can be part of the rootfs image at a different sector
* [x] Save the hash of the root node to the Kata configuration file
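As a sketch of the last item, the root hash could be carried via the guest kernel parameters in the Kata configuration. The `kernel_params` key follows Kata's `configuration.toml` conventions, but treat the exact placement as an assumption of this example:

```
# configuration.toml fragment (illustrative)
[hypervisor.qemu]
kernel_params = "cc_rootfs_verity.scheme=dm-verity cc_rootfs_verity.hash=894be17bc7f3bd73a386442efdd0080c28cedb0c5d6f01947ddbf01e080c1b43"
```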

## References
https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/verity.html
https://source.android.com/security/verifiedboot/dm-verity
https://wiki.gentoo.org/wiki/Custom_Initramfs
jiazhang0 commented 2 years ago

Nice proposal!

First, we need to emphasize that the block-based read-only protection scheme dm-verity is only the initial implementation for cc. In the future, a better approach such as filesystem-based protection will be involved.

In dm-verity, the hash device and data device can be separate or combined. We'd better decide this now. In addition, people may care about how dm-verity can defend against possible online/offline attacks on the hash and/or data device, and which attacks cannot be protected against in cc's usage model. In other words, the attack vectors against cc are not described in detail.

arronwy commented 2 years ago

Nice proposal!

First, we need to emphasize that the block-based read-only protection scheme dm-verity is only the initial implementation for cc. In the future, a better approach such as filesystem-based protection will be involved.

Yes, I'll add this in the design part.

In dm-verity, the hash device and data device can be separate or combined. We'd better decide this now.

Agreed, I'll add a section to discuss this. I prefer to combine them, but there are still two options for how to combine them.

In addition, people may care about how dm-verity can defend against possible online/offline attacks on the hash and/or data device, and which attacks cannot be protected against in cc's usage model. In other words, the attack vectors against cc are not described in detail.

Yes, I'll add these to the threat vectors part.

sameo commented 2 years ago

Thanks for the proposal @arronwy. I have a few comments and questions:

mythi commented 2 years ago

First, we need to emphasize that the block-based read-only protection scheme dm-verity is only the initial implementation for cc. In the future, a better approach such as filesystem-based protection will be involved.

Yes, I'll add this in the design part.

I read that as: there are gaps with dm-verity for which some better approach is needed in the future. What are the gaps?

arronwy commented 2 years ago

Thanks for the proposal @arronwy. I have a few comments and questions:

  • I assume the dm-verity table is signed? Which key is used, and how is the verification done by the kernel? That's similar to what @jiazhang0 is asking: How do we protect the guest boot image from e.g. the host tampering with it before passing it to the kata runtime? Are we storing the dm-verity table in a measured block device that's part of the attestation evidence?

No, in our proposal we don't sign the dm-verity table. We use the root hash to protect the guest boot image; the root hash will be part of the kernel cmdline measurement and thus part of the attestation evidence, which is why we named it measured boot image. If an attacker modifies the data device or the hash device, the top-level root hash will change when the data is accessed.

  • Is the boot image measurement still included in the attestation evidence?

Yes, the root hash is part of the kernel cmdline and will be measured as attestation evidence.

  • The advantages of dm-verity over a complete measurement of the disk at once are not (at least to me) clearly described here. How does using dm-verity significantly decrease the boot time latency that a complete measurement would add while keeping the same security level?

Yes, I'll add a performance section. dm-verity only measures and verifies data when it is accessed; this on-demand design, together with other techniques like prefetch, can greatly improve performance.

arronwy commented 2 years ago

First, we need to emphasize that the block-based read-only protection scheme dm-verity is only the initial implementation for cc. In the future, a better approach such as filesystem-based protection will be involved.

Yes, I'll add this in the design part.

I read that as: there are gaps with dm-verity for which some better approach is needed in the future. What are the gaps?

The current dm-verity approach can meet our requirements; in the future, when filesystem-level integrity features are mature, we can support them as well.

sameo commented 2 years ago

Thanks for the proposal @arronwy. I have a few comments and questions:

  • I assume the dm-verity table is signed? Which key is used, and how is the verification done by the kernel? That's similar to what @jiazhang0 is asking: How do we protect the guest boot image from e.g. the host tampering with it before passing it to the kata runtime? Are we storing the dm-verity table in a measured block device that's part of the attestation evidence?

No, in our proposal we don't sign the dm-verity table. We use the root hash to protect the guest boot image; the root hash will be part of the kernel cmdline measurement and thus part of the attestation evidence, which is why we named it measured boot image. If an attacker modifies the data device or the hash device, the top-level root hash will change when the data is accessed.

As long as the root hash is measured and compared with a reference value for it, we should be good. The other attack vector is for the host to change the guest boot image, generate a new hash, and modify the guest kernel command line accordingly. That will let the guest boot, but attestation will fail.

It would be good to describe that in the proposal.

Another question: What's the kernel version requirement to support dm-verity?

  • Is the boot image measurement still included in the attestation evidence?

Yes, the root hash is part of the kernel cmdline and will be measured as attestation evidence.

+1

  • The advantages of dm-verity over a complete measurement of the disk at once are not (at least to me) clearly described here. How does using dm-verity significantly decrease the boot time latency that a complete measurement would add while keeping the same security level?

Yes, I'll add a performance section. dm-verity only measures and verifies data when it is accessed; this on-demand design, together with other techniques like prefetch, can greatly improve performance.

Yes, and that's a great proposal, thanks a lot. We just need to call it out and make sure people with no dm-verity knowledge understand what we get from it.

arronwy commented 2 years ago

Thanks for the proposal @arronwy. I have a few comments and questions:

  • I assume the dm-verity table is signed? Which key is used, and how is the verification done by the kernel? That's similar to what @jiazhang0 is asking: How do we protect the guest boot image from e.g. the host tampering with it before passing it to the kata runtime? Are we storing the dm-verity table in a measured block device that's part of the attestation evidence?

No, in our proposal we don't sign the dm-verity table. We use the root hash to protect the guest boot image; the root hash will be part of the kernel cmdline measurement and thus part of the attestation evidence, which is why we named it measured boot image. If an attacker modifies the data device or the hash device, the top-level root hash will change when the data is accessed.

As long as the root hash is measured and compared with a reference value for it, we should be good. The other attack vector is for the host to change the guest boot image, generate a new hash, and modify the guest kernel command line accordingly. That will let the guest boot, but attestation will fail.

It would be good to describe that in the proposal.

Yes, thanks, I'll add this info.

Another question: What's the kernel version requirement to support dm-verity?

It depends on the features: dm-verity was introduced into the Linux kernel in version 3.4; supporting all the specific features may require version 4.17.

  • Is the boot image measurement still included in the attestation evidence?

Yes, the root hash is part of the kernel cmdline and will be measured as attestation evidence.

+1

  • The advantages of dm-verity over a complete measurement of the disk at once are not (at least to me) clearly described here. How does using dm-verity significantly decrease the boot time latency that a complete measurement would add while keeping the same security level?

Yes, I'll add a performance section. dm-verity only measures and verifies data when it is accessed; this on-demand design, together with other techniques like prefetch, can greatly improve performance.

Yes, and that's a great proposal, thanks a lot. We just need to call it out and make sure people with no dm-verity knowledge understand what we get from it.

Thanks, I'll add more detailed info to the dm-verity part.

jiazhang0 commented 2 years ago

Let me clarify the details of how dm-verity provides the measurement and how it ties into attestation. I think @sameo and @arronwy have already reached this point.

When the guest fw is launching the kernel, it will measure the kernel command line. Assume the command line contains the following dm-verity content:

cc_rootfs_verity.scheme=dm-verity cc_rootfs_verity.hash=894be17bc7f3bd73a386442efdd0080c28cedb0c5d6f01947ddbf01e080c1b43

Then the kernel boots up and automatically runs the integrated initramfs entrypoint script init.sh, which executes veritysetup with the root hash grabbed from the kernel command line to mount the dm-verity rootfs. All of these behaviors are measured. In other words, there is no action of measuring the entire boot image; it is essentially a zero-cost way to protect the cc boot image, and it is especially useful for the use cases where the workload provider does not want to deploy a protected boot image themselves.

In addition, there is a "root_hash_sig_key_desc" parameter that can be used when mounting the dm-verity rootfs to provide signature verification for the root hash. This sounds reasonable, but it is not useful for some use cases. Essentially, the signature is made with a user key that chains to a trusted certificate embedded in the guest kernel. In the use cases where the workload provider does not deploy a protected boot image themselves, the trusted certificate generated by the CSP is not trusted by the attestation service employed by the workload provider; after all, there is currently no good approach to dynamically embed it into the guest kernel provided by the CSP, or to externally load it into the guest kernel as a trusted certificate.

When launching the initial attestation, the related measurements in the evidence, including the kernel command line with the dm-verity root hash, are collected and sent to the attestation-service.


On the attestation-service side, the provenance of the rootfs is retrieved from the supply chain by the RVPS, which can generate the reference value to check against the root hash of the rootfs embedded in the kernel command line.

dm-verity can defend against the following attack vectors:

dm-verity cannot defend against the following attack vectors:

So a cc solution with dm-verity needs to protect against a possible attack that tampers with the hash blocks, data blocks, and kernel command line simultaneously. In addition, the dm-verity mechanism must not be bypassed/disabled by tampering with the kernel command line in the cc solution, so providing the cc_rootfs_verity parameters in the cc kernel command line must be enforced.

During runtime, the data blocks are always verified against the hash blocks, and the verification results and the layer-0 hash blocks are cached in memory. The hash tree walk only happens on the initial access to a data block. As time goes on, a memory threshold for the dm subsystem may be reached, causing the cached layer-0 hash blocks to be partially or completely released. At that point, an access to a data block whose corresponding hash block has been dropped triggers a new hash tree walk. This design explains why dm-verity can only partially defend against online tampering with the hash blocks: a tampered hash block is not reloaded from disk unless the cached hash blocks have been dropped.

mythi commented 2 years ago

@jiazhang0 great insights, thanks!

So providing cc_rootfs_verity parameters in cc kernel command line must be enforced.

You talked about the option where the "workload provider would not like to deploy a protection boot image by self". Do you mean a completely different image without dm-verity, or the same image but just without it being used (as it was in the demo video)?

If it's the former, then I think the "root_hash_sig_key_desc" approach would still work (the table signature verification would simply not be used when dm-verity is not used).

jiazhang0 commented 2 years ago

@mythi Let me clarify the background.

"Workload provider would not like to deploy a protection boot image by self" is a solution-level affair. For example, suppose a CSP provides a confidential container instance service, and its instance configuration page has no option to specify a guest OS image or to upload a custom one. In this case, the CSP provides and deploys the guest OS image (and the guest fw, kernel, initrd, kata-agent, attestation-agent, and other artifacts deployed to the Pod by the CSP) for the Pod (TEE). Is it trusted? No, it is just unverified; an attestation is needed to decide between trusted and untrusted. So in this case the workload provider who deploys the container image needs the attestation to verify the root hash of the guest OS image (and of course other claims about the various artifacts) deployed by the CSP, and then determine whether to provision the secret to the Pod.

Alternatively, a solution may allow the workload provider to upload a custom guest OS image. This image can be protected with dm-crypt or dm-integrity; maybe dm-verity is not used in such a solution.

mythi commented 2 years ago

"workload provider would not like to deploy a protection boot image by self" is a solution level affair. For example, a CSP provides

I understood this as you wanting, for some reason, to keep the option of not using dm-verity, but that does not seem to be the case.

kergon commented 2 years ago

As an additional layer of ongoing protection, consider sending device-mapper measurements to the remote attestation service: https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/dm-ima.html

kergon commented 2 years ago

Also keep an eye on LoadPin. https://listman.redhat.com/archives/dm-devel/2022-July/051520.html

bodzhang commented 2 years ago

@arronwy, @jiazhang0, @jiangliu, great proposal and writeup!

In past projects, we used a separate block device for the dm-verity hashtree. I think it's the more traditional approach for Linux dm-verity disk mount.

arronwy commented 2 years ago

@arronwy, @jiazhang0, @jiangliu, great proposal and writeup!

In past projects, we used a separate block device for the dm-verity hashtree. I think it's the more traditional approach for Linux dm-verity disk mount.

Hi @bodzhang, yes, it will be easy to track the hash tree as a separate block device. I just submitted the PR https://github.com/kata-containers/kata-containers/pull/4967 to generate the root hash as a separate partition of the rootfs. Feel free to review or give feedback.

ariel-adam commented 2 years ago

@arronwy is this issue still relevant or can be closed? If it's still relevant to what release do you think we should map it to (mid-November, end-December, mid-February etc...)?

arronwy commented 2 years ago

@arronwy is this issue still relevant or can be closed? If it's still relevant to what release do you think we should map it to (mid-November, end-December, mid-February etc...)?

Hi @ariel-adam, this is still relevant and will be targeted for mid-November.

sameo commented 2 years ago

@arronwy What are the pending PRs for that feature? Also, will there be some operator dependencies to enable it?

arronwy commented 2 years ago

@arronwy What are the pending PRs for that feature? Also, will there be some operator dependencies to enable it?

Hi @sameo These are the pending PRs: https://github.com/kata-containers/kata-containers/pull/5136 https://github.com/kata-containers/kata-containers/pull/5149 https://github.com/kata-containers/kata-containers/pull/5169

This feature has no HW dependency. For the operator, it only needs to decide whether to enable it by default and set the right config in the Kata configuration.toml file.

fidencio commented 2 years ago

I'm closing this one as we merged all pending PRs.