ietf-teep / architecture

TEEP architecture draft

Benjamin M. Schwartz SECDIR review #251

Closed: dthaler closed this 1 year ago

dthaler commented 2 years ago

https://datatracker.ietf.org/doc/review-ietf-teep-architecture-16-secdir-lc-schwartz-2022-04-04/

This draft is (obviously) highly relevant to the Security Area. It is clear and well-written, but the complexity of the subject matter leads to some difficulties and oversights.

Section 1: The use of the term "applications" carries an implication of a client-side device with installable software, but TEEP seems to extend also to server software sharing a kernel, hypervisors sharing a mainboard, etc. A term like "software" would be more neutral.

"An application component ... is referred to as a Trusted Application": This is confusing. A component, explicitly not an entire "application", is referred to as an "application". "Trusted Component" would be more consistent. Also, "trusted" seems to be the wrong adjective here, as it is the environment, and not the software, that carries an elevated level of trust. "Isolated" might be a better descriptor.

If this is common terminology for the field, a citation would be good.

I would appreciate some discussion of whether the Device Owner needs to trust the Trusted Application, i.e. interaction between enclaves and sandboxes.

"verify the ... rights of TA developers": "rights" is a loaded term. Rather than get into constitutional law, consider "permissions".

"so that the Untrusted Application can complete" -> "so that installation of the Untrusted Application can complete".

"is considered confidential" -- By whom? From whom? Consider "A developer who wants to provide a TA without revealing its code to the device owner..."

"A TEE ... wants to determine" ... excessive personification. I suggest "needs".

Section 2: "it is more common for the enterprise to own the device, and any device user has no or limited administration rights": Grammar issue. Perhaps "and for any device user to have ...".

Section 3.1 "trusted user interface" ... can you cite an example of a mobile device with a trusted peripheral that is not accessible to the REE OS? This seems theoretical.

Section 3.3 Similarly, are there any examples of IoT devices that prevent the REE OS from operating certain actuators?

Section 4.1 the TAM cannot directly contact a TEEP Agent, but must wait for the TEEP Broker to contact the TAM requesting a particular service. This architecture is intentional in order to accommodate network and application firewalls

This is true in many use cases, but for Confidential Cloud the reverse logic applies. In fact, the TAM could be operating on-site inside an enterprise, requiring a firewall exception to be reachable from the TEEP Broker. This architecture is also unnatural: it converts an event-driven "update command" into a polling loop that adds delay and wastes resources. Why is this part of the TEEP architecture? Surely it could be handled by a reversal-of-control pattern one layer below TEEP (e.g. Server-sent events)?

I think the real motivation here is (1) installation is presumed to be triggered locally, by the Untrusted Application, so the TAM must be reachable as a "server", and the other side naturally should keep the client role; (2) the TAM is intended to have O(1) state while serving N devices.

 For a TAM to be successful,
  it must have its public key or certificate installed in a device's
  Trust Anchor Store.

This needs discussion of threat model. What damage can a hostile TAM do? What does the device administrator need to know for adding a trust anchor to be safe?

Section 4.4 Implementations must support encryption of such Personalization Data to preserve the confidentiality of potentially sensitive data contained within it,

Implementations of what?

and must support integrity protection of the Personalization Data.

Lower-case "must" without explanation. Why, and is this a normative requirement?

Section 4.4.2 "e.g., OP-TEE" -> What is this?

Section 5.4

When a PKI is used, many intermediate CA certificates can chain to a root certificate, each of which can issue many certificates.

Intermediate CAs have a troubled history (e.g. [1]), and techniques that make them safer (e.g. x.509 name constraints) can't be deployed as a retrofit. Does TEEP need some rules about supported x.509 extensions?

Section 6.2.1

If an Untrusted Application is summarily deleted, how do you avoid leaking the TA?

Section 7

TEEP is format-agnostic for attestations, but what about message-sequence-agnostic? Can it tunnel arbitrary challenge-response sequences?

Section 9.3

We have already seen examples of attacks on the public Internet with billions of compromised devices being used to mount DDoS attacks.

Citation please. Also, are you sure it has reached into billions?

Section 9:

Nothing here seems to discuss attacks on the TEE's properties, and the post-compromise implications of those attacks. For example, if all instances of a TA share a secret key, used for decrypting the Personalization Data, then a single successful attack on a TEE is sufficient to decrypt all Personalization Data (previous and future). Given the prevalence of such attacks (especially via hardware fault injection), it seems likely to be worth mentioning.

[1] https://arstechnica.com/information-technology/2015/03/google-warns-of-unauthorized-tls-certificates-trusted-by-almost-all-oses/

mingpeiwk commented 2 years ago

Issue #255 with Roman's email contains several comments that Ben provided. Some comments in this issue are not in #255.

Section 1: The use of the term "applications" carries an implication of a client-side device with installable software, but TEEP seems to extend also to server software sharing a kernel, hypervisors sharing a mainboard, etc. A term like "software" would be more neutral.

This would be a major change in the doc. We will need to review for a consensus. @dthaler @hannestschofenig

"An application component ... is referred to as a Trusted Application": This is confusing. A component, explicitly not an entire "application", is referred to as an "application". "Trusted Component" would be more consistent.

The doc has defined a TA to mean possibly an "application component" in the Terminology Section as follows:

"- Trusted Application (TA): An application (or, in some implementations, an application component) that runs in a TEE"

The change would be a significant change across the doc, including APIs to be consistent when "Trusted Application" is replaced with "Trusted Component" everywhere. Can we continue to use the above definition to cover both cases?

Also, "trusted" seems to be the wrong adjective here, as it is the environment, and not the software, that carries an elevated level of trust. "Isolated" might be a better descriptor.

If this is common terminology for the field, a citation would be good.

Not sure that a name "Isolated Component" is more natural than "Trusted Component".

I would appreciate some discussion of whether the Device Owner needs to trust the Trusted Application, i.e. interaction between enclaves and sandboxes.

A Device Owner may not always have control over the TAs that go onto its device, regardless of the owner's trust. When the Device Administrator is different from the owner, the Administrator may often be the one who decides. Secondly, trust in a TA is delegated to the TAMs that may manage the TAs for installation to devices.

@dthaler @hannestschofenig what do you think?

"verify the ... rights of TA developers": "rights" is a loaded term. Rather than get into constitutional law, consider "permissions".

Agreed. Fixed.

"so that the Untrusted Application can complete" -> "so that installation of the Untrusted Application can complete".

Agreed. Fixed.

"is considered confidential" -- By whom? From whom? Consider "A developer who wants to provide a TA without revealing its code to the device owner..."

Agreed. Just note that the developer may not even want to share the code with a TAM that distributes the binary. So it isn't only protecting from a device owner. Proposed fix:

A Trusted Component might also be encrypted,
if the code is considered confidential, for example, when a developer wants to 
provide a TA without revealing its code to others.

"A TEE ... wants to determine" ... excessive personification. I suggest "needs".

Agreed. Fixed.

Section 2: "it is more common for the enterprise to own the device, and any device user has no or limited administration rights": Grammar issue. Perhaps "and for any device user to have ...".

Agreed. Adopted in fix.

Section 3.1 "trusted user interface" ... can you cite an example of a mobile device with a trusted peripheral that is not accessible to the REE OS? This seems theoretical.

Need help from co-authors here too.

Section 3.3 Similarly, are there any examples of IoT devices that prevent the REE OS from operating certain actuators?

To follow up.

Section 4.1 the TAM cannot directly contact a TEEP Agent, but must wait for the TEEP Broker to contact the TAM requesting a particular service. This architecture is intentional in order to accommodate network and application firewalls

This is true in many use cases, but for Confidential Cloud the reverse logic applies. In fact, the TAM could be operating on-site inside an enterprise, requiring a firewall exception to be reachable from the TEEP Broker. This architecture is also unnatural: it converts an event-driven "update command" into a polling loop that adds delay and wastes resources. Why is this part of the TEEP architecture? Surely it could be handled by a reversal-of-control pattern one layer below TEEP (e.g. Server-sent events)?

I think the real motivation here is (1) installation is presumed to be triggered locally, by the Untrusted Application, so the TAM must be reachable as a "server", and the other side naturally should keep the client role; (2) the TAM is intended to have O(1) state while serving N devices.

Right, that was the main use case and motivation for the architecture.
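
To make points (1) and (2) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the TEEP wire protocol: `BrokerRequest`, `tam_handle`, the `LATEST` catalog and all field names are hypothetical. It shows a TAM that only ever answers requests initiated by a TEEP Broker and keeps no per-device state, so its memory does not grow with the number of devices.

```python
from dataclasses import dataclass

# Hypothetical catalog: the TA version the TAM wants devices to run.
# This is the only state the TAM keeps; it does not grow with device count.
LATEST = {"ta-payment": 3, "ta-drm": 1}


@dataclass(frozen=True)
class BrokerRequest:
    """What a TEEP Broker sends when it contacts the TAM (illustrative fields)."""
    device_id: str
    installed: dict  # TA name -> installed version


def tam_handle(request: BrokerRequest) -> list:
    """Stateless handler: the reply depends only on the request and the catalog."""
    actions = []
    for name, have in sorted(request.installed.items()):
        want = LATEST.get(name)
        if want is not None and have < want:
            actions.append(("update", name, want))
    return actions


# The Broker initiates; the TAM never opens a connection to the device.
reply = tam_handle(BrokerRequest("dev-1", {"ta-payment": 2, "ta-drm": 1}))
```

The cost Ben points out is visible here too: a new entry in `LATEST` reaches a device only the next time its Broker calls `tam_handle`, which is where the polling delay comes from.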

For a TAM to be successful, it must have its public key or certificate installed in a device's Trust Anchor Store. This needs discussion of threat model. What damage can a hostile TAM do? What does the device administrator need to know for adding a trust anchor to be safe?

This has been addressed in "Section 9.5 Compromised TAM" in a version after the review was posted.

Section 4.4 Implementations must support encryption of such Personalization Data to preserve the confidentiality of potentially sensitive data contained within it,

Implementations of what?

Changed to "Implementations of TEEP protocol"

and must support integrity protection of the Personalization Data.

Lower-case "must" without explanation. Why, and is this a normative requirement?

We chose to not use MUST in the doc to differentiate "normative" from "informal" requirements. To discuss this. @dthaler @hannestschofenig

Section 4.4.2 "e.g., OP-TEE" -> What is this?

Changed to "e.g., OP-TEE, an open source TEE"

Section 5.4

When a PKI is used, many intermediate CA certificates can chain to a root certificate, each of which can issue many certificates.

Intermediate CAs have a troubled history (e.g. [1]), and techniques that make them safer (e.g. x.509 name constraints) can't be deployed as a retrofit. Does TEEP need some rules about supported x.509 extensions?

We leave the constraints on X.509 extensions to the device provider, which decides what it supports and uses in Trust Anchor validation. A provider may choose to trust only a selected intermediate CA, instead of the root, as the Trust Anchor.
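
The effect of pinning an intermediate CA rather than the root can be sketched with a toy model. This is an assumption-laden illustration only: `Cert`, `chains_to_anchor` and the example names are hypothetical, and no signatures, validity periods, or X.509 extensions are checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def chains_to_anchor(leaf: Cert, pool: dict, anchors: set) -> bool:
    """Walk issuer links from the leaf; succeed on reaching a trust anchor.

    Toy model: only names are compared, nothing is cryptographically verified.
    """
    current = leaf
    seen = set()
    while current.subject not in seen:
        seen.add(current.subject)
        if current.issuer in anchors:
            return True
        parent = pool.get(current.issuer)
        if parent is None:
            return False
        current = parent
    return False  # reached a self-signed cert (or a loop) without an anchor


# Hypothetical PKI: one root with two intermediates, each issuing a TAM cert.
pool = {
    "Root": Cert("Root", "Root"),
    "IntA": Cert("IntA", "Root"),
    "IntB": Cert("IntB", "Root"),
}
tam_a = Cert("tam-a.example", "IntA")
tam_b = Cert("tam-b.example", "IntB")
```

With `anchors={"Root"}` both TAM certificates are accepted. With `anchors={"IntA"}` only `tam_a` is, so a mis-issuing sibling intermediate no longer affects the device.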

Section 6.2.1

If an Untrusted Application is summarily deleted, how do you avoid leaking the TA?

Similar to the removal of a buggy or malicious TA, it is up to the device to have some scheme to contact the TAM, or be contacted by it, to initiate removal of TAs that are no longer needed.

Section 7

TEEP is format-agnostic for attestations, but what about message-sequence-agnostic? Can it tunnel arbitrary challenge-response sequences?

TEEP has some limitations in supporting arbitrary challenge-response sequences. It generally follows the sequence recommended by RATS, and a single-pass attestation and verification flow is assumed. A challenge may be supported by sending it in a request from a TAM to the device, which then incorporates the challenge when generating its attestation evidence.

Section 9.3

We have already seen examples of attacks on the public Internet with billions of compromised devices being used to mount DDoS attacks.

Citation please. Also, are you sure it has reached into billions?

Changed to "a large number".

Section 9:

Nothing here seems to discuss attacks on the TEE's properties, and the post-compromise implications of those attacks. For example, if all instances of a TA share a secret key, used for decrypting the Personalization Data, then a single successful attack on a TEE is sufficient to decrypt all Personalization Data (previous and future). Given the prevalence of such attacks (especially via hardware fault injection), it seems likely to be worth mentioning.

See replies to the same ask in the Issue #255

[1] https://arstechnica.com/information-technology/2015/03/google-warns-of-unauthorized-tls-certificates-trusted-by-almost-all-oses/

dthaler commented 2 years ago

Issue #255 with Roman's email contains several comments that Ben provided. Some comments in this issue are not in #255.

Section 1: The use of the term "applications" carries an implication of a client-side device with installable software, but TEEP seems to extend also to server software sharing a kernel, hypervisors sharing a mainboard, etc. A term like "software" would be more neutral.

This would be a major change in the doc. We will need to review for a consensus. @dthaler @hannestschofenig

The protocol spec uses the term "Trusted Component".

"An application component ... is referred to as a Trusted Application": This is confusing. A component, explicitly not an entire "application", is referred to as an "application". "Trusted Component" would be more consistent.

The doc has defined a TA to mean possibly an "application component" in the Terminology Section as follows:

"- Trusted Application (TA): An application (or, in some implementations, an application component) that runs in a TEE"

The change would be a significant change across the doc, including APIs to be consistent when "Trusted Application" is replaced with "Trusted Component" everywhere. Can we continue to use the above definition to cover both cases?

The protocol spec already uses the term "Trusted Component".

Also, "trusted" seems to be the wrong adjective here, as it is the environment, and not the software, that carries an elevated level of trust. "Isolated" might be a better descriptor.

If this is common terminology for the field, a citation would be good.

Not sure that a name "Isolated Component" is more natural than "Trusted Component".

I believe it's the right adjective. By trusting the TAM, you also trust (to not be malicious) the components the TAM installs.

I would appreciate some discussion of whether the Device Owner needs to trust the Trusted Application, i.e. interaction between enclaves and sandboxes.

A Device Owner may not always have control over the TAs that go onto its device, regardless of the owner's trust. When the Device Administrator is different from the owner, the Administrator may often be the one who decides. Secondly, trust in a TA is delegated to the TAMs that may manage the TAs for installation to devices.

@dthaler @hannestschofenig what do you think?

See my previous response above. I agree with Ming's statement that the trust in a TA is delegated to TAMs that manage the TAs (or Trusted Components) in the TEE. So indirectly yes the Device Administrator trusts the Trusted Components, by virtue of trusting the TAM to decide which ones are trusted and appropriate.

Section 3.1 "trusted user interface" ... can you cite an example of a mobile device with a trusted peripheral that is not accessible to the REE OS? This seems theoretical.

Need help from co-authors here too.

TEEP is not just for mobile phones, but also POS devices like chip-and-pin readers, and that facility is present in some of those, not theoretical. I'd change

"A trusted user interface (UI) may be used in a mobile device" to "A trusted user interface (UI) may be used in a payment device"

and maybe update the text in the preceding paragraphs to just use "mobile" device as an example, or say "mobile device or point-of-sale device".

Section 3.3 Similarly, are there any examples of IoT devices that prevent the REE OS from operating certain actuators?

To follow up.

Yes. The GlobalPlatform architecture discusses this so there are GP compliant devices that do. I demoed one (just a demo not a product, but it actually did so) myself at Hannover-Messe (the largest industrial fair) a few years back so I am personally familiar with the concept in practice.

and must support integrity protection of the Personalization Data.

Lower-case "must" without explanation. Why, and is this a normative requirement?

We chose to not use MUST in the doc to differentiate "normative" from "informal" requirements. To discuss this. @dthaler @hannestschofenig

Some parts of the IETF have a negative reaction to using normative MUST type language in an informational architecture document, so this uses normal English rather than 2119 language per such feedback.

mingpeiwk commented 2 years ago

and maybe update the text in the preceding paragraphs to just use "mobile" device as an example, or say "mobile device or point-of-sale device".

Good suggestion @dthaler. Updated in the doc.

mingpeiwk commented 2 years ago

For a TAM to be successful, it must have its public key or certificate installed in a device's Trust Anchor Store. This needs discussion of threat model. What damage can a hostile TAM do? What does the device administrator need to know for adding a trust anchor to be safe?

This has been addressed in "Section 9.5 Compromised TAM" in a version after the review was posted.

See the similar issue #241 and the revised Section 9.5 in #261. I realized that "a hostile TAM" is a different threat from a compromised TAM, while the latter intends to become a hostile TAM.

We can expand Section 9.5 to include "hostile TAM" detection and mitigations. In addition to the current text, I plan to add:

"There are also threats of hostile or abusive TAMs where a TAM turns to act out of expectation of device administrators, for example, pushing out TAs that contain some data collection or use users' device resources for distributed jobs for a TAM. The mitigation methods for a compromised TAM case above can also apply to these threats.

A Device Administrator may need to be notified when such an incident takes place. An Untrusted Application or some software component in the REE may monitor detection output from the device's TEEs, and notify the Device Administrator for additional remediations."

mingpeiwk commented 2 years ago

Add some further elaboration for this comment:

TEEP is format-agnostic for attestations, but what about message-sequence-agnostic? Can it tunnel arbitrary challenge-response sequences?

The TEEP protocol supports challenge-response input, using a token in QueryRequest, for EAT creation in the TEEP Agent. A TAM may issue more than one such exchange to obtain multiple pieces of attestation evidence or results until its attestation validation policy is met (with a Verifier assisting each individual validation). This is flexible enough to support almost arbitrary challenge-response sequences.
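
A rough Python sketch of that repeated exchange follows. It rests on several assumptions: the function names, the claims payload, and the "N accepted rounds" policy are hypothetical, and an HMAC over a shared key stands in for a signed EAT purely to keep the example stdlib-only. It is not the TEEP message format.

```python
import hashlib
import hmac
import secrets

# Stand-in for the device's attestation key. Real evidence would be an EAT
# signed by the TEE; HMAC is used here only to keep the sketch self-contained.
DEVICE_KEY = b"hypothetical-attestation-key"


def agent_make_evidence(token: bytes, claims: bytes) -> bytes:
    """TEEP Agent side: bind the TAM's token into the evidence."""
    return hmac.new(DEVICE_KEY, token + claims, hashlib.sha256).digest()


def verifier_check(token: bytes, claims: bytes, evidence: bytes) -> bool:
    """Verifier side: accept only evidence computed over this exact token."""
    expected = hmac.new(DEVICE_KEY, token + claims, hashlib.sha256).digest()
    return hmac.compare_digest(expected, evidence)


def tam_attest(rounds_required: int, max_rounds: int = 5) -> int:
    """TAM side: repeat the exchange until the (hypothetical) policy is met.

    Returns the number of rounds that were needed.
    """
    accepted = 0
    for round_no in range(1, max_rounds + 1):
        token = secrets.token_bytes(16)      # fresh challenge per request
        claims = b"claims-round-%d" % round_no  # placeholder claims
        evidence = agent_make_evidence(token, claims)
        if verifier_check(token, claims, evidence):
            accepted += 1
        if accepted >= rounds_required:
            return round_no
    raise RuntimeError("attestation policy not met")
```

Because each round uses a fresh token, evidence captured in one round does not verify against the token of another, which is the replay protection a challenge is meant to provide.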

mingpeiwk commented 2 years ago

Ben Schwartz's additional comments and Brendon Moran's input.

Brendon Moran:

Section 3.1

There are a few examples in industry. The PCI-SIG has announced some relevant work: https://pcisig.com/specifications

Apple’s Platform Security Guide has a few relevant sections on pages 17-21. For us, I suspect these two excerpts are most relevant: https://help.apple.com/pdf/security/en_US/apple-platform-security-guide.pdf

Between the PCI-SIG work and Apple’s platform security guide we have an example of a specification for trusted I/O and an example of a mobile device manufacturer implementing trusted I/O. This is not theoretical; it’s in production.

Section 3.3: examples of IoT devices that prevent the REE OS from operating certain actuators?

While I can’t provide specific examples of where this was done, I can provide specific examples of where it should have been done. Medical devices with network or radio connectivity (therefore IoT medical devices) should place all sensors and actuators under the control of a TEE.

Ben Schwartz:

I would suggest adding informative references for the examples that exist, and clarifying that the actuator example is hypothetical if there is no worked example.

Note that Dave gave an example above:

The GlobalPlatform architecture discusses this so there are GP compliant devices that do. I demoed one (just a demo not a product, but it actually did so) myself at Hannover-Messe (the largest industrial fair) a few years back so I am personally familiar with the concept in practice.

Doc update: we add a reference to GPTEE.

mingpeiwk commented 2 years ago

Adding Ben's confirmation of several fixes: https://mailarchive.ietf.org/arch/msg/teep/0L_KMigRB0bTr1Wi3RpHfwEkAtA/. This covers the replies above about use of the term "Trusted Application" and the examples for constrained IoT devices and Trusted UI.