KBS implementation observations

mythi commented 11 months ago

While working with #159 and a bit of #151 too I made some observations I wanted to capture here for discussion.

Nonce from the attestation-service With the current flow KBS creates the nonce and it has a lifetime of the session (which is configurable in kbs parameters). AS does not verify the nonce is unique and fresh. Thus, it may be desirable to have the AS to generate the nonce with some lifetime so that it becomes easier to track if a nonce is valid for a one-time usage.
TEE runtime data/nonce hashing AFAICS, this is a "compatibility issue" as there seems to be no specification for how this should be done. CoCo Attesters SHA384 hash the nonce and parts of TeePubKey (Related: https://github.com/confidential-containers/guest-components/issues/366) and add it to the quote. Amber uses SHA256 and the lower 32 bytes of "report data" to verify so there's a gap. MAA seems to be doing something similar: "Azure Attestation will validate the provided “Quote”, and will then ensure that the SHA256 hash of the provided Enclave Held Data is expressed in the first 32 bytes of the reportData field in the quote." With CoCo Attesters/Verifiers being in our control there's no problem but what if the CoCo attester generated quote needs to be verified by some other AS?
Session timeout there seems to be a conflict with how long the attestation results are valid: KBS could set some long session timeout but the AS token can theoretically have much shorter lifetime which is ignored by the resource API (get_attest_claims_from_session()) and is left for the policy to check? Similarly, if 1. makes sense and the nonce has an expiration time from AS, auth could set the session timeout to follow that initially?
Token verification in several places JWT token verification happens in several places. Amber checks the public key based on the user input provided in the config where as the CoCo sample token verification uses the jwt claim in the token.

Xynnn007 commented 11 months ago

Thanks for bringing this up @mythi

Let me try to share some ideas upon the topics.

Nonce from the attestation-service With the current flow KBS creates the nonce and it has a lifetime of the session (which is configurable in kbs parameters). AS does not verify the nonce is used only once so it seems possible to cheat AS to continuously assign tokens when the TEE quote and KBS nonce are known. It may also be desirable to have the AS to generate the nonce with some lifetime so that it becomes easier to track if a nonce is valid for a one-time usage.

The nonce is part of the RCAR handshake protocol. I think it should be transparent to ASs, e.g. coco-AS, Amber and MAA. I got your point that we can reply with old KBS nonces and evidences to get tokens from those different ASs. My idea is KBS should consume those token inside and does not return anyone of them. We might need some re-design of the architecture. If the guest wants to directly get a token (like Amber token), it should directly handshake with Amber service rather than KBS.

2. TEE runtime data/nonce hashing AFAICS, this is a "compatibility issue" as there seems to be no specification for how this should be done. CoCo Attesters SHA384 hash the nonce and parts of TeePubKey (Related: kbs_protocol does not follow the KBS spec guest-components#366) and add it to the quote. Amber uses SHA256 and the lower 32 bytes of "report data" to verify so there's a gap. MAA seems to be doing something similar: "Azure Attestation will validate the provided “Quote”, and will then ensure that the SHA256 hash of the provided Enclave Held Data is expressed in the first 32 bytes of the reportData field in the quote." With CoCo Attesters/Verifiers being in our control there's no problem but what if the CoCo attester generated quote needs to be verified by some other AS?

Let me put PRs to both guest-components and AS side to add kty and alg fields to the seed of the hash to fix.
As I mentioned in Q1, the underlying AS does not need to care about the report data. We should change the API async fn verify(&mut self, tee: Tee, nonce: &str, attestation: &str) -> Result<String>; to async fn verify(&mut self, tee: Tee, tee_evidence: String) -> Result<Vec<u8>>; Here the returned value is report data. In this way KBS can check the binding of the report data in evidence with the expected one derived from nonce+pubkey. Note that Amber's own ability to verify the binding relationship between input data and report data is not used here. We only leverage the ability of verifying evidences of Amber and other ASes. This is feasible because the token returned by Amber contains the reportdata field, which can be extracted in the KBS plug-in code as the return value. So does MAA.

3. Session timeout there seems to be a conflict with how long the attestation results are valid: KBS could set some long session timeout but the AS token can theoretically have much shorter lifetime which is ignored by the resource API (get_attest_claims_from_session()) and is left for the policy to check? Similarly, if 1. makes sense and the nonce has an expiration time from AS, auth could set the session timeout to follow that initially?

If we agreed on both Q1 & Q2. Attestation results from coco-AS is not a problem, as it was consumed by KBS. I suggest to redesign the dataflow.

For KBS+AS

The returned token from KBS should be a credential that can be used to access the resources from KBS. I made a proposal for the format of the returned credential here. https://github.com/confidential-containers/kbs/issues/143

For only AS. If the tenant want to use AS token (Amber token, coco-AS token, MAA token...) for more aims, the client drivers can be integrated into Attestation Agent. The architecture can be

Here Amber can be coco-AS, MAA, etc. cc @sameo @fitzthum @jialez0 @Lu-Biao

mythi commented 11 months ago

@Xynnn007 no need to rush into any design changes just yet. I'm still working on to understand all the glitches with Amber.

AFAICS, the main problem is Amber does not understand sha384(nonce || tee-pubkey) reportdata to be able to add tee-pubkey claim in the token. It's sha256 for SGX and sha512 for TDX where the nonce must come from Amber.

Xynnn007 commented 11 months ago

AFAICS, the main problem is Amber does not understand sha384(nonce || tee-pubkey)

Well. If we just want to resolve this, KBS does not need to use amber.GetNonce API as this nonce is different from KBS's. Amber's nonce will not be included inside the Amber token, neither be included into evidence. It is only for calling the API of GetToken, but KBS's nonce needs to be included in the evidence.

Back to the solution, we only need to extract report data from the AmberToken and check whether it is the same as hash(nonce, pubkey).

Xynnn007 commented 11 months ago

From another perspective, Amber's nonce is related to its own handshake protocol, which is similiar with KBS RCAR. We should not couple them.

mythi commented 11 months ago

but KBS's nonce needs to be included in the evidence.

Maybe I'm missing something fundamental. I see you mean about the two nonces but what would be the way to pass KBS's nonce to Amber so that it can verify the evidence (attester's report data includes KBS nonce, right?)?

Xynnn007 commented 11 months ago

but KBS's nonce needs to be included in the evidence.

Maybe I'm missing something fundamental. I see you mean about the two nonces but what would be the way to pass KBS's nonce to Amber so that it can verify the evidence (attester's report data includes KBS nonce, right?)?

We do not even need to tell Amber anything about the KBS nonce. It and teepubkey's digest is included inside the tdxquote.reportdata. What we need to do is to add some logic in Amber plugin in KBS to parse the reportdata and check whether it is the same as expected as I mentioned.

mythi commented 11 months ago

After a bit of reading and thinking I'm more convinced that 1. is necessary. The topic of evidence freshness is discussed in the RATS architecture and Appendix 5. suggests the same model as 1.

My thinking is there isn't such thing as "KBS nonce" and it is right now, we cannot meet all the criteria specified for an evidence.

"After checking that the sent and received nonces are the same, the appraising entity knows that the Claims were signed after the nonce was generated. " summarizes it well.

mythi commented 11 months ago

Added item 4.

confidential-containers / trustee

KBS implementation observations #162