Closed liangzhou121 closed 2 years ago
Looks good. A few notes, mainly for future reference.
First, eventlog-rs won't be used by the SNP module because the SNP evidence doesn't include any kind of event log (although at some point we might be using vTPMs with SNP, which could provide something like that). Instead the evidence will include some other reference to the firmware, kernel, etc. that the VM booted with. It's not yet clear exactly what this will look like.
I don't think we want to have a boolean verification state. Will rego allow us to return a verification result that can be set to more than just two values? Similarly, can the allow field in the verification result be changed to something like verification_result, which could be set to more than just two different values? This would allow us to create more expressive key policies. For example, we could make a policy that only releases keys when a new version of firmware is used in the guest. It would also help reconcile different architectures where approved evidence might come with different security properties.
Finally, maybe we should think about having a Keylime attestation proxy?
@fitzthum Thank you for your comments. My answers follow.
First, eventlog-rs won't be used by the SNP module because the SNP evidence doesn't include any kind of event log (although at some point we might be using vTPMs with SNP, which could provide something like that). Instead the evidence will include some other reference to the firmware, kernel, etc. that the VM booted with. It's not yet clear exactly what this will look like.
I don't think this is a problem here. Perhaps a new, separate general-purpose library that is compatible with both the TDX eventlog and SEV-SNP should be added in the future, but I am not sure about this yet.
I don't think we want to have a boolean verification state. Will rego allow us to return a verification result that can be set to more than just two values?
Yes, OPA can return multiple evaluation values simultaneously. It all depends on the policy.rego file. You can try this OPA example; it will output the following:
{
"allow": true,
"user_is_admin": true,
"user_is_granted": []
}
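As a rough illustration of the multi-valued verification result discussed above, here is a minimal Rust sketch. The enum variants, the firmware-version thresholds, and the function names are all hypothetical and not part of the proposal; the sketch only shows how a key policy could require more than a plain allow/deny.

```rust
/// Hypothetical multi-valued verdict. The variant names are illustrative
/// only; variants are ordered from least to most trusted.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum VerificationResult {
    Rejected,
    TrustedOldFirmware,
    TrustedCurrentFirmware,
}

/// Maps a guest firmware version to a verdict.
/// `min_accepted` and `current` are assumed reference values.
pub fn evaluate(firmware_version: u32, min_accepted: u32, current: u32) -> VerificationResult {
    if firmware_version < min_accepted {
        VerificationResult::Rejected
    } else if firmware_version < current {
        VerificationResult::TrustedOldFirmware
    } else {
        VerificationResult::TrustedCurrentFirmware
    }
}

/// A key policy names the minimum verdict it requires.
/// A rejected result never releases a key.
pub fn may_release_key(result: VerificationResult, required: VerificationResult) -> bool {
    result != VerificationResult::Rejected && result >= required
}
```

With this shape, a key that must only be released to guests on the newest firmware would set `required` to `TrustedCurrentFirmware`, while a less sensitive key could accept `TrustedOldFirmware`.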
Finally, maybe we should think about having a Keylime attestation proxy?
Yes, the design is modular, so it's easy to extend with other proxy sub-modules. In the POC stage, though, probably no proxy sub-module will be supported.
wouldn't the message format for SEV(-ES) and SEV-SNP need to be considered in the goals now, for it to be accommodated later?
I think the assumption is basically that SEV-SNP is going to be similar enough to TDX (with some of the differences I mentioned above) that it should work with this proposal. On the other hand SEV-ES is different enough that it is never going to be a great fit so we will either just try to shoe-horn it in later or leave it to simple-kbs to support pre-attestation.
I know there was some interest in discussing SNP support more generally. We might want a separate issue for that.
One thing that I alluded to above is that SNP does not have the same kind of event log that TDX does. I am assuming that for SNP we are going to continue to use the measured-direct-boot approach that we are currently using for SEV(-ES). To simplify validation of the measurement we're going to need to provide some hints to the KBS/AS as part of the evidence. For example, we will probably want to send the hash of the kernel and initrd as part of the evidence. We will also probably need to include the CPU count of the guest so that the AS can construct an appropriate VMSA. These values can be evaluated using a rego policy. If the policy is valid we will construct an expected measurement from that and compare it to the measurement in the attestation report.
One interesting thing is that the AS will likely need access to the full firmware binary so that it can calculate the launch measurement (unless we try to pre-compute launch measurements offline). This is a bit annoying. The proposal above has some tooling to provide reference data to the AS. It looks like this is more aimed at providing hash, but we would probably use it to provide the full firmware binary as well.
So like I said, I believe that this proposal will work with SNP, but if someone wants to try filling in more details that would definitely be welcome.
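To make the flow described above more concrete (check the hints, build an expected measurement from them, compare it with the report), here is a minimal Rust sketch. Every type, field, and function name is hypothetical, the `max_cpu_count` check is an invented stand-in for whatever the rego policy would enforce, and the measurement calculation is passed in as a closure because its details are not settled in this thread.

```rust
/// Hypothetical hints sent alongside the SNP attestation report.
pub struct SnpHints {
    pub kernel_hash: String,
    pub initrd_hash: String,
    pub cpu_count: u32,
}

/// Hypothetical reference values provisioned to the AS.
pub struct SnpReference {
    pub allowed_kernel_hashes: Vec<String>,
    pub allowed_initrd_hashes: Vec<String>,
    pub max_cpu_count: u32,
}

#[derive(Debug, PartialEq)]
pub enum SnpVerdict {
    HintsRejected,
    MeasurementMismatch,
    Verified,
}

/// Step 1: evaluate the hints against policy.
/// Step 2: construct the expected measurement from the accepted hints.
/// Step 3: compare it with the measurement in the attestation report.
pub fn verify_snp<F>(
    hints: &SnpHints,
    reference: &SnpReference,
    reported_measurement: &[u8],
    expected_measurement: F,
) -> SnpVerdict
where
    F: Fn(&SnpHints) -> Vec<u8>,
{
    let hints_ok = reference.allowed_kernel_hashes.contains(&hints.kernel_hash)
        && reference.allowed_initrd_hashes.contains(&hints.initrd_hash)
        && hints.cpu_count >= 1
        && hints.cpu_count <= reference.max_cpu_count;
    if !hints_ok {
        return SnpVerdict::HintsRejected;
    }
    if expected_measurement(hints).as_slice() == reported_measurement {
        SnpVerdict::Verified
    } else {
        SnpVerdict::MeasurementMismatch
    }
}
```

Keeping the hint check separate from the measurement comparison means a policy failure and a measurement mismatch can be reported as different outcomes.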
Hi @vbatts Thank you for your comments.
wouldn't the message format for SEV(-ES) and SEV-SNP need to be considered in the goals now, for it to be accommodated later?
I think SEV-SNP should be considered now; we should support these runtime attestations simultaneously. As for SEV(-ES), as I mentioned in the OPEN section, maybe we can consider it later, because it is already supported by simple-kbs.
Hi @fitzthum ,
One thing that I alluded to above is that SNP does not have the same kind of event log that TDX does. I am assuming that for SNP we are going to continue to use the measured-direct-boot approach that we are currently using for SEV(-ES)
OK. According to the design, each type of HW-TEE has its own corresponding module, so it can implement its own separate .rego file (such as policy-snp.rego) and verification logic. So I don't think this is a problem here.
One interesting thing is that the AS will likely need access to the full firmware binary so that it can calculate the launch measurement (unless we try to pre-compute launch measurements offline). This is a bit annoying.
Do you know how to calculate the SNP launch measurement? For example, if the launch measurement covers the kernel, bootloader, and firmware hashes, is the measurement simply calculated as: kernel hash + bootloader hash + firmware hash?
Do you know how to calculate the SNP launch measurement?
Calculating the launch measurement for SNP is actually a bit more complicated than calculating the launch measurement for SEV(-ES). There is a reference implementation here and @dubek can probably answer any specific questions.
The basics are that instead of just hashing flash0 + VMSA, we actually hash every memory page individually, with each page extending the hash of the previous page. There are also some new mechanisms with different types of pages. The underlying memory that we load into flash0 and the VMSA are roughly the same as with SEV(-ES) though. So just like for SEV(-ES) we will need to know the hash of the initrd, kernel, and cmdline so that we can inject them into the firmware (this is how measured-direct-boot works). We will also need to generate the VMSAs (initial register state of guest) and we'll need to know a few things to do this like the number of CPUs the guest is running.
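The page-by-page chaining can be sketched roughly as follows. This is a toy model only: it uses a 64-bit FNV-1a hash as a stand-in for the real cryptographic hash, and it leaves out the page-type handling mentioned above. The names `toy_hash`, `extend`, and `launch_measurement` are made up for the illustration.

```rust
const PAGE_SIZE: usize = 4096;

/// Toy 64-bit FNV-1a hash. This is a stand-in only and is NOT what the
/// hardware uses; a real implementation needs a cryptographic hash.
fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Extends the running digest with one page: new = H(previous || page).
fn extend(previous: u64, page: &[u8]) -> u64 {
    let mut buf = Vec::with_capacity(8 + page.len());
    buf.extend_from_slice(&previous.to_le_bytes());
    buf.extend_from_slice(page);
    toy_hash(&buf)
}

/// Walks guest memory page by page, with each page extending the digest
/// left by the previous one. The digest starts at zero.
pub fn launch_measurement(memory: &[u8]) -> u64 {
    memory
        .chunks(PAGE_SIZE)
        .fold(0, |digest, page| extend(digest, page))
}
```

Because each step folds in the previous digest, the result depends on the order of the pages as well as their contents, which is why the measurement cannot be obtained by simply combining independent component hashes.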
Calculating the launch measurement for SNP is actually a bit more complicated than calculating the launch measurement for SEV(-ES). There is a reference implementation here and @dubek can probably answer any specific questions.
@fitzthum Thank you for this information. I have also updated the proposal to mention that only TDX supports the eventlog mechanism currently.
@liangzhou121 thank you for the proposal, it looks great. I have a question regarding the Attestation Results. As proposed, we are going with the Entity Attestation Token (EAT) format. Is there any particular reason for using JWT as typ rather than CBOR Web Token (CWT)? JWT is not inherently secure. It doesn't concern itself with encryption; it only cares about validation.
@knrt10 Thank you for your comments.
Is there any particular reason for using JWT as typ rather than CBOR Web Token (CWT)? JWT is not inherently secure. It doesn't concern itself with encryption; it only cares about validation.
Actually, both JWT and CWT are recommended by RATS for transferring the Attestation Results. The Attestation Results only need integrity protection, so we think JWT should be enough. By the way, neither JWT nor CWT will be used during the POC stage, because in this stage the attestation service will be used as a local service or as a crate integrated directly by the KBS.
@liangzhou121 - can I confirm that "MAA" refers to "Microsoft Azure Attestation" in the diagram?
@liangzhou121 - can I confirm that "MAA" refers to "Microsoft Azure Attestation" in the diagram?
Yes, it means "Microsoft Azure Attestation". I refer to it here as an example to show that the AS can be extended to support a potential third-party Attestation Service.
Thanks @liangzhou121.
Design
According to [RFC] Generic Key Broker Service (KBS) & Attestation Service high level architecture proposal, CC's general Attestation system relies on the Attestation Service and the KBS to verify the Attestation-Agent's running environment and distribute critical messages (such as the KEK). The KBS plays the RATS Relying Party role; it is hardware-agnostic and focuses on vendor-specific requirements. The Attestation Service is an implementation of the RATS Verifier role; it focuses on verifying the Evidence's identity and TCB status.
CC's Attestation Service is a general infrastructure service that mostly focuses on providing an Evidence verification service to the KBS. It therefore needs to be compatible with all CC-supported HW-TEEs and should be extensible enough to support potential new HW-TEEs in the future. Through modularization, it can also be compatible with existing third-party attestation services.
To support the verification of TCB status, the Attestation Service is able to parse the received Evidence and extract its corresponding TCB status. It also includes the Policy Engine component, which is an implementation of the RATS Verifier Owner. It relies on the reference data from the Reference Value Provider and the information extracted from the Evidence to match the TCB status of the Attestation-Agent's running environment dynamically, and generates the corresponding Attestation Results.
The Attestation Service supports different deployment modes: as a service or as a library. When it is deployed as an infrastructure service, one Attestation Service may need to support several KBS services simultaneously. This enhances the flexibility of the service's deployment.
Goals
Reference Value Provider and set other configurations.
Non-Goals
Architecture
The following diagram demonstrates the overall architecture and internal modules:
Endorser to verify Evidence's identity.
Verifier Owner to verify TCB status.
Reference Value Provider to set Verifier Owner reference data and execute some other configurations.
Service Module
This module is used to communicate with external modules such as the KBS. Its main features are:
Evidence sent by KBS and return the corresponding Attestation Results.
Evidence
Evidence differs between HW-TEE types, for example:
Attestation Results
The Attestation Results of the Evidence, compliant with the Entity Attestation Token (EAT) format. Its format should be:
attestation_results: The result of the Evidence's attestation.
Signature (Optional): The value will be empty in the following scenario:
The attestation_results differ between HW-TEE types, for example:
As a Library
The Attestation Service will provide the following interface:
attestation():
As a Service
The Attestation Service can be extended with a separate service application that also relies on the library above. Its Cargo.toml file example:
Note: In the POC stage, the service can only be reached via localhost:port.
The POC will use gRPC, so its proto file should be:
Attestation Module
Its responsibility is to verify the various kinds of received Evidence. It includes the following two internal modules:
It also defines a trait that needs to be implemented by its internal modules.
Verification Drivers
It includes different HW-TEE specific pluggable verification modules and some common modules to verify the Evidence's identity and TCB status:
TDX/SGX/SEV(-ES)/SEV-SNP
Its features:
feature, such as #[cfg(feature = "tdx")]
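A pluggable driver layout along these lines might look like the following Rust sketch. The trait, its method names, and both driver types are hypothetical, since the proposal's actual trait is not shown in this text; only the `#[cfg(feature = "tdx")]` gating comes from the description above. The always-on `SampleVerifier` exists purely so the sketch builds with no features enabled.

```rust
/// Hypothetical driver trait; the method names are assumptions.
pub trait Verifier {
    /// Short name of the HW-TEE this driver handles.
    fn tee_name(&self) -> &'static str;
    /// Verifies raw evidence and returns a textual summary on success.
    fn verify(&self, evidence: &[u8]) -> Result<String, String>;
}

/// Compiled only when the `tdx` feature is enabled.
#[cfg(feature = "tdx")]
pub struct TdxVerifier;

#[cfg(feature = "tdx")]
impl Verifier for TdxVerifier {
    fn tee_name(&self) -> &'static str {
        "tdx"
    }
    fn verify(&self, _evidence: &[u8]) -> Result<String, String> {
        // Placeholder: real quote verification would go here.
        Err("TDX verification is not implemented in this sketch".to_string())
    }
}

/// Always-compiled placeholder driver so the sketch builds without features.
pub struct SampleVerifier;

impl Verifier for SampleVerifier {
    fn tee_name(&self) -> &'static str {
        "sample"
    }
    fn verify(&self, evidence: &[u8]) -> Result<String, String> {
        if evidence.is_empty() {
            Err("empty evidence".to_string())
        } else {
            Ok(format!("{} bytes accepted", evidence.len()))
        }
    }
}

/// Returns every driver that was compiled in.
pub fn drivers() -> Vec<Box<dyn Verifier>> {
    #[allow(unused_mut)]
    let mut list: Vec<Box<dyn Verifier>> = vec![Box::new(SampleVerifier)];
    #[cfg(feature = "tdx")]
    list.push(Box::new(TdxVerifier));
    list
}
```

Gating each driver behind a Cargo feature keeps unused HW-TEE dependencies out of a build, while the shared trait lets the caller treat all compiled-in drivers uniformly.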
Eventlog-rs
To verify the measurements of all programs running inside a TEE-based VM, a TCG-compliant eventlog is included in the Evidence. eventlog-rs is used to parse that eventlog and extract the measurements of the following components, which are sent to the Policy Engine for final verification (as an example):
Note: Only TDX supports the eventlog currently.
Policy Engine
It uses Open Policy Agent (OPA) to implement policy as code. OPA relies on the following files to execute verification:
Reference Value Provider.
Policy.rego
The verification policy differs for each HW-TEE's TCB status, for example:
Policy.rego used to check Bootloader, Kernel Parameters, and Kernel measurements extracted from eventlog:
    package policy

    import future.keywords.in

    # By default, deny requests.
    default allow = false

    allow {
        bootloader_is_granted
        kernel_is_granted
        kernelparameters_is_granted
    }

    # An empty reference list accepts any value.
    bootloader_is_granted {
        count(data.bootloader.hashes) == 0
    }

    bootloader_is_granted {
        input.bootloader == data.bootloader.hashes[_]
    }

    kernel_is_granted {
        count(data.kernel.hashes) == 0
    }

    kernel_is_granted {
        input.kernel == data.kernel.hashes[_]
    }

    kernelparameters_is_granted {
        count(data.parameters.hashes) == 0
    }

    kernelparameters_is_granted {
        input.parameters == data.parameters.hashes[_]
    }
Policy.rego used to check the SGX mrEnclave, mrSigner, productId, and svn values:
    package policy

    # By default, deny requests.
    default allow = false

    allow {
        mrEnclave_is_grant
        mrSigner_is_grant
        input.productId >= data.productId
        input.svn >= data.svn
    }

    mrEnclave_is_grant {
        count(data.mrEnclave) == 0
    }

    mrEnclave_is_grant {
        count(data.mrEnclave) > 0
        input.mrEnclave == data.mrEnclave[_]
    }

    mrSigner_is_grant {
        count(data.mrSigner) == 0
    }

    mrSigner_is_grant {
        count(data.mrSigner) > 0
        input.mrSigner == data.mrSigner[_]
    }
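For readers less familiar with Rego, the rule pairs in the eventlog policy above amount to "an empty reference list accepts anything, otherwise the value must be in the list". The Rust sketch below mirrors that logic for the bootloader, kernel, and kernel-parameter checks. It is only an illustration of the semantics, not how the Policy Engine is implemented, and the struct and function names are hypothetical.

```rust
/// Mirrors one pair of Rego rules: an empty reference list grants any
/// value; otherwise the value must appear in the list.
fn is_granted(value: &str, reference: &[String]) -> bool {
    reference.is_empty() || reference.iter().any(|r| r.as_str() == value)
}

/// Measurements extracted from the Evidence (the Rego `input`).
pub struct Measurements {
    pub bootloader: String,
    pub kernel: String,
    pub parameters: String,
}

/// Reference hashes (the Rego `data`).
pub struct ReferenceData {
    pub bootloader: Vec<String>,
    pub kernel: Vec<String>,
    pub parameters: Vec<String>,
}

/// Equivalent of the `allow` rule: every component must be granted.
pub fn allow(input: &Measurements, data: &ReferenceData) -> bool {
    is_granted(&input.bootloader, &data.bootloader)
        && is_granted(&input.kernel, &data.kernel)
        && is_granted(&input.parameters, &data.parameters)
}
```

One consequence worth noting is that leaving a reference list empty disables the check for that component rather than denying everything.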
As a Service
If the Attestation Service is built as a service, it will provide these functions as gRPC services. So its proto file should be:
Opens
Reference