confidential-computing / governance

Confidential Computing Consortium Governance Documents
Whitepaper feedback from Muhammad Usama Sardar #77

Open dthaler opened 3 years ago

dthaler commented 3 years ago

Muhammad Usama Sardar 10:34 AM
Here are my questions/comments:
• "A Trusted Execution Environment (TEE) is commonly defined as an environment that provides a level of assurance of data integrity, data confidentiality, and code integrity." Where is it defined like this? I thought CCC is defining this way (as opposed to commonly being defined like this)
• What would be the scientific argument for not including attestation in the definition of TEE?
• Table 1 "Secure Element e.g., TPM" Since it says e.g. TPM and not only TPM, what are some other secure elements considered here? If TPM is a specific example of "Secure Element" as implied by Table 1, why would it differ in some values in Figure 3 in the scope document?
• Why did CCC not consider HSM in Table 1 for a fair comparison with all existing technologies?
• Figure 1 is nice but many interesting explanations are missing: e.g. what does the overlap between TEEs and TPM represent? Similarly between HE and MPC? Similarly between Privacy-preserving computation and TEEs?
• The scientific reasoning/explanation of Table 2 is completely missing. While such tables are very nice for marketing purposes (and thus suitable for the other white paper), adding it in the "Technical Analysis" should accompany some scientific arguments with references of where such a table is derived from.
• Sec. 5.2.2: Why would availability attacks not be considered out of scope?
• Minor comment: The caption of Table 1 is not correct (I guess due to copy and paste from Table 1) as there is no TPM in Table 2

dthaler commented 3 years ago

@thomas-fossati can you take a shot at going through these?

thomas-fossati commented 3 years ago

"A Trusted Execution Environment (TEE) is commonly defined as an environment that provides a level of assurance of data integrity, data confidentiality, and code integrity." Where is it defined like this? I thought CCC is defining this way (as opposed to commonly being defined like this)

This comment seems OBE. Looking at version 1.1, Section 3.1 has now: "A Trusted Execution Environment (TEE) is defined by the CCC, following common industry practice, […]", which clearly attributes this definition to CCC.

What would be the scientific argument for not including attestation in the definition of TEE?

To be discussed.

A question for @muhammad-usama-sardar: are there existing definitions of TEE that include attestation?

Table 1 "Secure Element e.g., TPM" Since it says e.g. TPM and not only TPM, what are some other secure elements considered here?

It seems to me this is TPM or at least TPM-like, rather than a generic Secure Element. If so, we should fix the table header.

If TPM is a specific example of "Secure Element" as implied by Table 1, why would it differ in some values in Figure 3 in the scope document?

There is no Figure 3 in the January 2021, v1.1 copy of the document. @muhammad-usama-sardar could you please check if your comment refers to an older version? Or did you mean a different figure?

Why did CCC not consider HSM in Table 1 for a fair comparison with all existing technologies?

To be discussed.

My take is that TPMs are relevant here because they provide a root of trust on which subsequent trusted/confidential computing functions can be anchored. HSMs are general-purpose crypto offloading boxes rather than trusted computing technologies.

Figure 1 is nice but many interesting explanations are missing: e.g. what does the overlap between TEEs and TPM represent? Similarly between HE and MPC? Similarly between Privacy-preserving computation and TEEs?

The elements of the sets are the existing definitions of the various concepts. The overlaps represent ambiguity in the definitions that allow one class to be in a continuum with another.

The scientific reasoning/explanation of Table 2 is completely missing. While such tables are very nice for marketing purposes (and thus suitable for the other white paper), adding it in the "Technical Analysis" should accompany some scientific arguments with references of where such a table is derived from.

To be discussed.

Muhammad notes the lack of data / references to support the qualitative claims in Table 2 (scalability comparison).

Sec. 5.2.2: Why would availability attacks not be considered out of scope?

To be discussed.

Move Section 3.1 antepenultimate para (which talks about availability attacks) to section 5.2.2 (“out of scope attacks”).

Editorial nit: in the same para we conclude with "Further discussion of the threat model is provided in section 6." which should reference section 5 instead.

Minor comment: The caption of Table 1 is not correct (I guess due to copy and paste from Table 1) as there is no TPM in Table 2

This seems to refer to a previous version of the document. @muhammad-usama-sardar could you please re-check this?

muhammad-usama-sardar commented 3 years ago

Thank you @thomas-fossati for sharing CCC opinion. Please see clarifications/comments below:

"A Trusted Execution Environment (TEE) is commonly defined as an environment that provides a level of assurance of data integrity, data confidentiality, and code integrity." Where is it defined like this? I thought CCC is defining this way (as opposed to commonly being defined like this)

This comment seems OBE. Looking at version 1.1, Section 3.1 has now: "A Trusted Execution Environment (TEE) is defined by the CCC, following common industry practice, […]", which clearly attributes this definition to CCC.

Please note this comment was related to the white paper titled "Confidential Computing: Hardware-Based Trusted Execution for Applications and Data". The latest version (v1.2) linked above still contains the exact same statement, as I quoted, on Page 5 under "What are Trusted Execution Environments?"

What would be the scientific argument for not including attestation in the definition of TEE?

To be discussed.

A question for @muhammad-usama-sardar: are there existing definitions of TEE that include attestation?

I will add details on this, but what comes to my mind currently is the Edgeless white paper. It does not exactly define TEE; rather, for confidential computing, it includes verifiability (attestation) as a key feature. One could argue that since confidential computing makes use of TEEs (as per the CCC definition), the white paper implies verifiability (attestation) as a key (in contrast to the CCC's optional) feature of TEEs.

Table 1 "Secure Element e.g., TPM" Since it says e.g. TPM and not only TPM, what are some other secure elements considered here?

It seems to me this is TPM or at least TPM-like, rather than a generic Secure Element. If so, we should fix the table header.

If TPM is a specific example of "Secure Element" as implied by Table 1, why would it differ in some values in Figure 3 in the scope document?

There is no Figure 3 in the January 2021, v1.1 copy of the document. @muhammad-usama-sardar could you please check if your comment refers to an older version? Or did you mean a different figure?

Please note that I was referring to Figure 3 in the scope document, as I mentioned in my comment.

Figure 1 is nice but many interesting explanations are missing: e.g. what does the overlap between TEEs and TPM represent? Similarly between HE and MPC? Similarly between Privacy-preserving computation and TEEs?

The elements of the sets are the existing definitions of the various concepts. The overlaps represent ambiguity in the definitions that allow one class to be in a continuum with another.

I think that needs to be explicitly clarified in the white paper. One could also interpret the overlaps as the combination of the technologies. Anyway, your interpretation does not make much sense to me, or otherwise, the figure is completely flawed. For example, going with your interpretation, what exactly is the ambiguity in the definition of TPMs and Hardware TEEs that leads to the overlap between the two? I think we all agree that TPMs are not TEEs. What makes TPMs lie completely inside TEEs in the first place?

Minor comment: The caption of Table 1 is not correct (I guess due to copy and paste from Table 1) as there is no TPM in Table 2

This seems to refer to a previous version of the document. @muhammad-usama-sardar could you please re-check this?

Sorry for a typo (I commented later on slack about it but it did not reflect here). Please read "caption of Table 1" as "caption of Table 2". This still exists in the latest version (v1.1) of CCC technical white paper.

muhammad-usama-sardar commented 2 years ago

A comprehensive draft of major issues in CCC white papers with some recommendations is available here. Happy to discuss or present to TAC to move things forward!

howard55 commented 2 years ago

The definitions of Confidential Computing and TEE are very similar in the whitepaper; we should give a distinguishing definition.

thomas-fossati commented 2 years ago

I have gone through "Confidential Computing and Related Technologies: A Review" and extracted Table 1, which in the intention of the authors summarises the issues with the current version of the CCC white paper. I've added a column to that with an initial set of my own comments to structure the discussion at our next TAC meeting.

| Term | Issue | Thomas notes | Muhammad recommendation |
| --- | --- | --- | --- |
| CC | HW TEE vs programmable TEE | This refers to the scope document and seems to create a false comparison. | Claim on unique definition of CC and multiple conflicting definitions of other technologies should be removed |
| CC | Conflicting definition by Arm RSH | Not sure what this is supposed to mean. My esteemed colleagues have the right to their own opinions which is not necessarily the consensus of the CCC. I also note, the CC definition in the cited Arm RSH paper is not conflicting: it's simply a superset of what the CCC has come up to, which includes FHE, MPC and isolation via formally verified SW/FW (isolation kernels, for example based on seL4) | ditto |
| CC | Other technologies, e.g., HE, are formally defined | The fact the HE can be formally defined is not a surprise since it's a mathematical object. CC and TEE are not built on the same mathematical rigour so providing a definition in natural language seems more than appropriate. | ditto |
| TEE | Ambiguous terms | "Level of assurance" seems to be the ambiguous term in this context. Maybe substituting it with "A TEE is defined by the CCC, following common industry practice, as an environment that provides the following security guarantees:", or "A TEE is an architectural primitive that provides the following (strong) integrity and confidentiality guarantees to code and data against the set of attacks defined in the threat model (section 5)" | A clear and distinguishing definition should be given |
| TEE | Definition satisfied by HSM also | Effectively given the current definition an HSM would be a special kind of TEE. It doesn't look like a problem to me. | ditto |
| TEE | Unclear threat model | Not sure what are the basis of this assertion: The whole of section 5 is about threat modelling, defining what is in scope and what isn't. | ditto |
| TEE Environment | Undefined | I agree with Muhammad that it's not clear what a "TEE Environment" is and how that'd be different from a TEE. Suggest rephrasing as "Attestation of TEEs to ensure valid and correct deployment"; the doubt seems to be around whether TEE means the computation itself (code & data), the execution environment, or both. | The term should be rephrased. It should be compared and contrasted with TEE for clarity. |
| HW TEE | Undefined | In section 2.2. there actually is discussion about the distinction in terms of an abstract HW vs SW Root of Trust. Isn't this sufficient? | It should be compared and contrasted with virtualised SW TEE for clarity. |
| Programmability | Arbitrary code vs limited set of operations | Not sure how Muhammad's suggestion helps. | Say that by programmability we mean Turing-complete. |
| Attestation | Ambiguous notion of trustworthiness | In section 6 WP says: "[…] derive confidence in the trustworthiness of the Attester by obtaining an authentic, accurate and timely report about the software and data state of the Attester". Isn't this already addressing the comment? | Include some form of trusted measurement |
| Attestation in CC | Incomplete definition of CC | This was discussed at length and the decision was to acknowledge its importance (there's a whole dedicated section) but leave it out of the core definition because it was not strictly necessary to describe a TEE. | The definition of CC should include "attestation" as one of the primitives alongside the isolate/TEE primitive |

[TF] General note about the Venn diagram. The elements of the sets are the existing definitions of the various concepts. The overlaps represent ambiguity in the definitions that allow one class to be in a continuum with another. ISTM that this diagram can be superficially interpreted as the CCC take instead of a CCC survey of the existing terminology, hence the confusion. Maybe we should remove it and only have prose for that?

[TF] Typo in Section 2.1, first para: "(See Section 4" should be "(See Section 3" instead.

muhammad-usama-sardar commented 2 years ago

Thank you Thomas for finally making some progress on it. First and foremost, your comment completely ignores section 3 of the paper, which describes some critical flaws, such as the missing threat model for the comparison of technologies, the implication that a TPM is a TEE, and so on. (Please note that, as mentioned at the end of section 2 in the paper, Table 1 in the paper, which you have quoted above, summarizes only section 2.)

Next, here are some important clarifications/questions/follow-up on some of your comments above:

This refers to the scope document and seems to create a false comparison.

The scope document is also a document produced by the CCC. Can you explain why you think that the comparison is false?

My esteemed colleagues have the right to their own opinions

Your esteemed colleagues are equally respected by us. So please don't try to paint a personal picture out of this, and rather focus on the scientific arguments.

I also note, the CC definition in the cited Arm RSH paper is not conflicting: it's simply a superset of what the CCC has come up to

It is quite surprising that you agree that the definition by Arm is a superset of the CCC definition and still say that it is not conflicting. In layman's terms, Germany is a superset of Dresden. Being in Dresden implies being in Germany, but being in Germany does not necessarily imply being in Dresden: one may be in Berlin, which is still within Germany but not in Dresden. So just as Germany != Dresden, the definition by Arm != the definition of the CCC, and hence the CCC claim of a unique definition of CC is already proved to be false.
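The set argument above can be sketched in a few lines of Python. This is only an illustration, not something taken from either paper: the labels are hypothetical, and the memberships simply follow the characterisation quoted earlier in this thread (the Arm definition additionally covering FHE, MPC and isolation via formally verified SW/FW).

```python
# Hypothetical labels; memberships follow the characterisation quoted in this thread.
ccc_definition = frozenset({"hardware TEE"})
arm_definition = frozenset({
    "hardware TEE",
    "FHE",
    "MPC",
    "formally verified SW/FW isolation",
})


def compare(a: frozenset, b: frozenset) -> str:
    """Describe how set a relates to set b."""
    if a == b:
        return "equal"
    if a < b:
        return "strict subset"
    if a > b:
        return "strict superset"
    if a & b:
        return "overlapping"
    return "disjoint"


print(compare(ccc_definition, arm_definition))
print(ccc_definition == arm_definition)
```

Under these assumed memberships, one definition being contained in the other does not make the two equal, which is the only point the Dresden/Germany analogy relies on.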

The fact the HE can be formally defined is not a surprise

Indeed. But what is surprising is that CCC says that this precisely defined mathematical object has "multiple competing definitions" and "ambiguities", without justifying what these competing definitions are and what these ambiguities are.

providing a definition in natural language seems more than appropriate

Whether it is appropriate or not is a different story. But indeed the point is that definition in natural language can never be unique and hence the suggestion to remove the claim of unique definition.

an HSM would be a special kind of TEE

If an HSM is a special kind of TEE, and given that HSMs have existed for decades, what exactly is new in confidential computing?

In section 2.2. there actually is discussion about the distinction in terms of an abstract HW vs SW Root of Trust. Isn't this sufficient?

That is indeed insufficient. How exactly is a hardware TEE defined? There is always software in the TCB. As concrete examples, in which category would you classify Microsoft Virtual Secure Mode, Amazon Nitro and RISC-V?

This was discussed at length and the decision was to acknowledge its importance (there's a whole dedicated section) but leave it out of the core definition because it was not strictly necessary to describe a TEE.

What is the scientific argument of CCC for this decision? When exactly did this discussion happen? Please share the links to the recording of this discussion. Please recall that the CCC TAC is a transparent organization, which means the scientific process of reaching this decision must be transparent to the community.

ISTM that this diagram can be superficially interpreted as the CCC take instead of a CCC survey of the existing terminology

I think it's been a year and we have been asking the CCC for the references used in the survey. I don't see any good reason why CCC TAC, being a transparent organization, would not show the references of this survey to the community. Anyway, what exactly is the difference between "CCC take" and "CCC survey"? Quoting from the white paper:

The TAC conducted a survey of various terms in the industry related to protecting data in use, and composed the following Venn diagram of technologies:

To me, this clearly implies that the TAC conducted a survey and that its take from the survey is the Venn diagram. By your comment that it is not the CCC take, do you mean that it was just copied and pasted from somewhere? (In that case "composed" in the white paper is misleading, and my question again: what is the source from which it was taken?) The figure has so many flaws that I highly doubt it comes from a reliable source.

[TF] Typo in Section 2.1, first para: "(See Section 4" should be "(See Section 3" instead.

Please note that not only this but also almost all other cross-references to the sections are messed up. Usually they are referring to n+1, where n should be the correct section.
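As a rough aid for reviewing this, here is a minimal sketch that lists section cross-references in a text and proposes the reference shifted down by one. It assumes the off-by-one pattern described above applies to the top-level section number; since the pattern holds only "usually", the output is a list of candidates for manual review, not an automatic fix. The function name `propose_fixes` and the regular expression are hypothetical.

```python
import re

# Matches "Section 4", "section 5.2.2", etc. (assumed reference style).
SECTION_REF = re.compile(r"\b([Ss]ection)\s+(\d+)((?:\.\d+)*)")


def propose_fixes(text: str) -> list[tuple[str, str]]:
    """Return (found, proposed) pairs with the top-level number reduced by one."""
    fixes = []
    for match in SECTION_REF.finditer(text):
        word, top, rest = match.group(1), int(match.group(2)), match.group(3)
        if top > 1:  # "Section 1" has no predecessor to point to
            fixes.append((match.group(0), f"{word} {top - 1}{rest}"))
    return fixes


for found, proposed in propose_fixes("(See Section 4"):
    print(f"{found} -> {proposed} (check manually)")
```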

thomas-fossati commented 2 years ago

[...] CCC claim of unique definition of CC

I understand that the sentence

"[...] unlike the term "confidential computing", some of the terms in the diagram have multiple competing definitions"

may allow the reader to think that CC has a unique possible definition, but I think you are reading too much into that sentence.

In general, the CCC can only recommend that its definition of CC is used, but it can't mandate it and least of all enforce it.

We can suggest rephrasing that, as I think it's not what was intended: in fact, if you confront it with the scope document - from where this originally comes from - there's no such ambiguity.

muhammad-usama-sardar commented 2 years ago

"[...] unlike the term "confidential computing", some of the terms in the diagram have multiple competing definitions"

Both the white paper and scope document use a much stronger statement than the one you wrote: "unlike the term “confidential computing”, several of the terms used in the diagram have multiple competing definitions."

in fact, if you confront it with the scope document - from where this originally comes from - there's no such ambiguity.

Why leave the ambiguity of which terms in the diagram have multiple competing definitions? Why not specify which of the terms from the survey of CCC have multiple competing definitions and how exactly are they competing?

thomas-fossati commented 2 years ago

Both the white paper and scope document use a much stronger statement than the one you wrote:

Using "some" instead of "several" does not change the substance.

The root of the cognitive dissonance is rather this part:

"[...] unlike the term “confidential computing”,"

which is not present in the scope document and was added during editing of the white paper.

We can ask to amend it.

muhammad-usama-sardar commented 2 years ago

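Removing "unlike the term confidential computing" solves only one part of the problem. Can you state here exactly the following:

  • Exactly which of the terms in the figure have multiple competing definitions?
  • And how exactly are they competing?
  • From which source did you take these competing definitions of the terms? and
  • What exactly is the ambiguity of the overlap between the different terms?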

thomas-fossati commented 2 years ago

Removing "unlike the term confidential computing" solves only one part of the problem.

cool, a journey of a thousand miles begins with a single step ;-)

Can you state here exactly the following:

  • Exactly which of the terms in the figure have multiple competing definitions?
  • And how exactly are they competing?
  • From which source did you take these competing definitions of the terms? and
  • What exactly is the ambiguity of the overlap between the different terms?

I think that memory should be recoverable from the scope document.

muhammad-usama-sardar commented 2 years ago

I think that memory should be recoverable from the scope document.

Maybe I asked too many questions at once. So let's move one by one. Can you give a complete list of all terms in the figure which have multiple competing definitions? This question is not answered by any of the CCC papers. The scope document and white paper both claim "several terms" but fail to provide anything other than one example of “privacy-preserving computation”.