RFC: Add support for KSS (key separation and sharing)

gramineproject / gramine

A library OS for Linux multi-process applications, with Intel SGX support

GNU Lesser General Public License v3.0

RFC: Add support for KSS (key separation and sharing) #800

Open DL8 opened 2 years ago

DL8 commented 2 years ago

KSS (key separation and sharing) provides additional fields to SECS and EGETKEY, which allows more fine-grained control over remote attestation and key derivation. Some examples where it might be useful:

The same enclave binary may be executed in the context of two different processes (they might even belong to different users). They will have the same MRSIGNER and MRENCLAVE, so without KSS they will derive the same keys, which might lead to unintended data disclosure
A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)
An application may consist of several enclaves ("microservices"), which may want to share keys (e.g. for shared protected storage)

Several places where changes may be required (not a complete list and open for discussion):

The loader should support enumeration (CPUID) and setting new SECS attributes (CONFIGID, CONFIGSVN, ISVEXTPRODID, ISVFAMILYID)
A mechanism to define KSS requirement should be added to the manifest (one option: must enable, may enable, must disable)
EGETKEY wrapper should support new key policy attributes and new fields in key request structure (e.g. CONFIGSVN)
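
To make the first two items concrete, here is a minimal sketch of how a loader could combine the manifest requirement with what the CPU reports. It assumes that KSS is attribute bit 7 and that EAX of CPUID leaf 0x12, sub-leaf 1 reports which of the low 32 ATTRIBUTES bits may be set; the `KssPolicy` names and the `resolve_kss` helper are hypothetical and not part of Gramine.

```python
from enum import Enum

KSS_ATTRIBUTE_BIT = 1 << 7  # SECS.ATTRIBUTES.KSS (bit 7)


class KssPolicy(Enum):
    """Hypothetical manifest values for the 'KSS requirement' option."""
    MUST_ENABLE = "must_enable"
    MAY_ENABLE = "may_enable"
    MUST_DISABLE = "must_disable"


def cpu_supports_kss(cpuid_12_1_eax: int) -> bool:
    """cpuid_12_1_eax is EAX from CPUID leaf 0x12, sub-leaf 1 (assumed input)."""
    return bool(cpuid_12_1_eax & KSS_ATTRIBUTE_BIT)


def resolve_kss(policy: KssPolicy, cpuid_12_1_eax: int) -> bool:
    """Return True if the enclave should be created with the KSS bit set."""
    supported = cpu_supports_kss(cpuid_12_1_eax)
    if policy is KssPolicy.MUST_DISABLE:
        return False
    if policy is KssPolicy.MUST_ENABLE:
        if not supported:
            raise RuntimeError("manifest requires KSS, but the CPU does not report it")
        return True
    # MAY_ENABLE: use KSS only when the platform offers it
    return supported
```

With "may enable", the same manifest works on both kinds of platforms, at the cost that the resulting enclave attributes differ between them.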

mkow commented 2 years ago

The same enclave binary may be executed in the context of two different processes (they might even belong to different users). They will have the same MRSIGNER and MRENCLAVE, so without KSS they will derive the same keys, which might lead to unintended data disclosure

I don't think this is the case in Gramine model:

The entrypoint path and contents are measured in almost all cases (the only exception is when it's provided via encrypted files, but then see below).
The true "identity" of the enclave will always be its MRENCLAVE in Gramine model. If we change the identity on execve() from "Gramine ver XYZ+binary QWER" to "Gramine ver XYZ+binary ASDF" then it may seem that the current identity is really "Gramine ver XYZ+binary ASDF", while in fact QWER could make persistent changes inside the enclave and still be in control. This is because Gramine doesn't provide in-enclave sandboxing (this is technically infeasible on SGX).

A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)

This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?

An application may consist of several enclaves ("microservices"), which may want to share keys (e.g. for shared protected storage)

But they already can, via our remote attestation APIs? This seems unrelated to KSS, it's just about attestation?

kailun-qin commented 2 years ago

The same enclave binary may be executed in the context of two different processes (they might even belong to different users). They will have the same MRSIGNER and MRENCLAVE, so without KSS they will derive the same keys, which might lead to unintended data disclosure

A well-written enclave (app) shall not disclose any confidential data. How to guarantee that an enclave (app) is correct (without violating any security/safety guarantees) is an orthogonal and hard problem. So in this case, even if they are the very same enclave, they should be able to unseal the data, but this cannot lead to data disclosure as long as they are well-written/correct.

The true "identity" of the enclave will always be its MRENCLAVE in Gramine model. If we change the identity on execve() from "Gramine ver XYZ+binary QWER" to "Gramine ver XYZ+binary ASDF" then it may seem that the current identity is really "Gramine ver XYZ+binary ASDF", while in fact QWER could make persistent changes inside the enclave and still be in control. This is because Gramine doesn't provide in-enclave sandboxing (this is technically infeasible on SGX).

Yes, but shouldn't the identity be "Gramine ver XYZ+binary QWER+binary ASDF"? The design rationale of KSS here (specifically CONFIGID in this case) is to indicate/reflect what additional post-EINIT content may be accepted, rather than to remove "binary QWER" from the TCB.

A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)

This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?

As far as I recall, it targets dynamic loading of code/data after enclave initialization. One of the envisioned typical scenarios is language runtimes, where the runtime is built, distributed and deployed as a single enclave binary while allowing end users to attach the config and the modules it loads more easily. It's kind of for separation and composition. Together w/ CONFIGSVN, it also makes security updates easier.

An application may consist of several enclaves ("microservices"), which may want to share keys (e.g. for shared protected storage)

But they already can, via our remote attestation APIs? This seems unrelated to KSS, it's just about attestation?

I guess @DL8's intention here was to talk about the use case of ISVFAMILYID and ISVEXTPRODID (together w/ KEYREQUEST and KEYPOLICY) for sealing key derivation. This allows more flexible sealing key sharing schemes (thus for shared protected storage) on a local platform.

lejunzhu commented 2 years ago

A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)

This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?

I can think of a few:

When the user gets a pre-built docker image, instead of building his own app.
The user may want to keep MRENCLAVE to a well known value, in order to collaborate with other Gramine apps via remote attestation.
Another case is when the config includes MRENCLAVE values of more than one app, so it can only be generated after building the app.

boryspoplawski commented 2 years ago

When the user gets a pre-built docker image, instead of building his own app. The user may want to keep MRENCLAVE to a well known value, in order to collaborate with other Gramine apps via remote attestation.

How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?

Another case is when the config includes MRENCLAVE values of more than one app, so it can only be generated after building the app.

But in Gramine MRENCLAVE reflects not only the app, but also Gramine itself and manifest (and hence initial filesystem state).

dimakuv commented 2 years ago

How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?

I guess the logic will be something like this:

In Gramine manifest, we specify a list of binaries/scripts that can be run using fs.mounts. This restricts which apps can run.
With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.
Even more flexible is if we add a pseudo-file /dev/attestation/configid that the apps can write to. Then a modified Python interpreter, whenever it loads the main Python script, writes to configid. Then this reflects the "current running Python script" in Gramine.

So I think Gramine manifest already has the means to restrict the apps safe to run. And with KSS, we have an interesting new property of "SGX attestation reflects the currently running binary/script".

boryspoplawski commented 2 years ago

With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.

What's the point? You can read /proc/pid/cmdline or something inside Gramine to get the current running app. Also you probably know already which app is currently running, so I really don't see a point.

So I think Gramine manifest already has the means to restrict the apps safe to run. And with KSS, we have an interesting new property of "SGX attestation reflects the currently running binary/script".

But what's the point? This cannot ever be a security feature - all the code (apps) running must trust each other anyway.

dimakuv commented 2 years ago

You can read /proc/pid/cmdline or something inside Gramine to get the current running app. Also you probably know already which app is currently running, so I really don't see a point.

I'm not talking about the Gramine enclave itself. But rather about the remote parties and what information they can learn about the Gramine enclave based on the SGX Report received.

boryspoplawski commented 2 years ago

But the code handling remote attestation could just handle this - it knows when / in what app it runs

dimakuv commented 2 years ago

But the code handling remote attestation could just handle this - it knows when / in what app it runs

What are we talking about? Sorry, I am confused by "could just handle this" -- handle what exactly? The SGX Report is an Intel SGX Hardware defined struct, it's not a software construct. You cannot just add some fields to it as you wish.

boryspoplawski commented 2 years ago

What are we talking about?

You said the purpose of this feature is to know the "currently running binary" in Gramine. My point is that this is not a security feature (as explained before) and if you know all inputs that make the final MRENCLAVE (which you probably have to, if you trust it), then you already know what program is running inside Gramine.

You cannot just add some fields to it as you wish.

You can have any data appended (possibly encrypted) and embed hash of it in user data.

dimakuv commented 2 years ago

You can have any data appended (possibly encrypted) and embed hash of it in user data.

But how are you gonna check it? We currently use sgx_report.user_data for the hash of the public key of the ephemeral keypair generated inside of the SGX enclave (for RA-TLS purposes). If we put a hash of the "currently running binary" in this mix, the remote party will have a really hard time figuring out with which binary (out of a set of e.g. 100 possible binaries) it talks -- it will need to iterate through all 100 binaries, mix them with a public key of RA-TLS cert, calculate the hash and compare. Tedious.

dimakuv commented 2 years ago

My point is that this is not a security feature (as explained before)...

I guess I agree with this. It doesn't really add more security. It just makes attestation (more specifically, verification by the remote user) easier and more friendly, as it gives you a couple more fields -- in addition to sgx_report.user_data -- that can be updated and put inside the SGX report.

boryspoplawski commented 2 years ago

But how are you gonna check it? We currently use sgx_report.user_data for the hash of the public key of the ephemeral keypair generated inside of the SGX enclave (for RA-TLS purposes). If we put a hash of the "currently running binary" in this mix, the remote party will have a really hard time figuring out with which binary (out of a set of e.g. 100 possible binaries) it talks -- it will need to iterate through all 100 binaries, mix them with a public key of RA-TLS cert, calculate the hash and compare. Tedious.

As I said, you can append any data and send it together with the report. The data can be anything, including the binary name - you just verify that the hash matches.
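
A minimal sketch of this software-defined alternative, assuming a hypothetical encoding in which the 64-byte report data is a zero-padded SHA-256 over the length-prefixed RA-TLS public key followed by the appended data. This is an illustration only and not the format Gramine's RA-TLS actually uses; `make_report_data` and `verify_report_data` are made-up names.

```python
import hashlib
import hmac


def make_report_data(ratls_pubkey: bytes, extra: bytes) -> bytes:
    """Enclave side: bind the RA-TLS key and the appended data into 64 bytes."""
    digest = hashlib.sha256(
        len(ratls_pubkey).to_bytes(4, "little") + ratls_pubkey + extra
    ).digest()
    return digest.ljust(64, b"\0")


def verify_report_data(report_data: bytes, ratls_pubkey: bytes, extra: bytes) -> bool:
    """Verifier side: 'extra' travels next to the report, in the clear."""
    return hmac.compare_digest(report_data, make_report_data(ratls_pubkey, extra))
```

Because the binary name is sent alongside the report, the verifier recomputes a single hash and does not need to iterate over all candidate binaries.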

It just makes attestation (more specifically, verification by the remote user) easier and more friendly, as it gives you a couple more fields -- in addition to sgx_report.user_data -- that can be updated and put inside the SGX report.

But how does it make anything better? I just fail to see any value in this feature (as described in this thread, maybe there are other usecases).

dimakuv commented 2 years ago

As I said, you can append any data and send it together with the report. The data can be anything, including the binary name - you just verify that the hash matches.

But this would require some software-defined convention/standard. E.g., Microsoft OpenEnclave and Gramine won't be able to recognize each other's "appended data" unless they adhere to this standard. Surely this can be done, but having a special hardware-defined field like CONFIGID seems like an easier route.

One other thing that we forgot in this discussion is that CONFIGID is also important for SGX sealing feature -- it can be thrown in the key-derivation mix. So this is another important reason to introduce this new field.

lejunzhu commented 2 years ago

The user may want to keep MRENCLAVE to a well known value, in order to collaborate with other Gramine apps via remote attestation.

How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?

Here's an example of what I mean. Let's say a researcher wrote an application to do machine learning using multiple independent data sources. The app is distributed as a docker image.

Each contributor of the data will run an instance of the app.
The app will listen on a port. When it gets the partially trained parameter from another instance, it will run the algorithm using its locally stored data and the incoming parameter. The communication is protected with RA TLS, and only peers whose MRENCLAVE value is the same as its own are accepted.
There is another port through which the contributor can upload the data into the instance to be stored locally. This is also protected by RA TLS, so the uploader can verify it, and the uploader will authenticate himself using an ordinary public key (like SSH does).
The locally stored data is encrypted with a sealing key derived from MRENCLAVE.

Now everything seems fine, except when two contributors run the app on the same physical machine, they will get the same sealing key, and someone who has access to the machine can copy their files around and generate a new data source mixing their data. This could potentially cause some problem in the algorithm.

Of course, without CONFIGID, there are also a few ways to avoid this. For example, the contributor can download the image, sign the application with a newly generated key, upload the new image and run it. The app can now use MRSIGNER+MRENCLAVE to generate the sealing key. But with CONFIGID, this is simpler. We can use CONFIGID = hash of the SSH public key, and make the trusted code verify this at startup. Then the sealing key will be different, while the image can be the same.

DL8 commented 2 years ago

How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?

Following the whole discussion above, my conclusion so far is that CONFIGID value cannot be part of the manifest, because we would expect the same binary to support different CONFIGIDs without affecting MRENCLAVE. I would expect some "is KSS required" field in the manifest though. Other fields may be applicable if we want to reserve part of the CONFIGID for Gramine use. I don't see a valid use case for that, so I'll leave it open for discussion.

In general, it is up to enclave developer to define what is a valid/safe CONFIGID value.

In Gramine manifest, we specify a list of binaries/scripts that can be run using fs.mounts. This restricts which apps can run.

With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.

Even more flexible is if we add a pseudo-file /dev/attestation/configid that the apps can write to. Then a modified Python interpreter, whenever it loads the main Python script, writes to configid. Then this reflects the "current running Python script" in Gramine.

CONFIGID cannot be modified after EINIT, so the pseudo-file should be read-only. If I understand your scenario correctly, there is the Python interpreter and several scripts from which one may be executed. In this case, I would expect the loader to initialize CONFIGID with some enum indicating which one of the scripts should be executed. The enclave will read this value and execute the corresponding script.

In general, it's up to the enclave developer to choose how to handle CONFIGID. As I see it, there are two options:

Verify CONFIGID consistency
Use CONFIGID value to determine execution behavior

Note that the two approaches can be combined and it is up to the enclave developer to decide which CONFIGIDs are accepted and how to handle them.

One more interesting use case:

We can use CONFIGID = hash of the SSH public key, and make the trusted code verify this at startup.

This is a very good example, so let's refine it. Assume that the entity running in the enclave has some keypair and the public key is its identity. If the hash of the public key is provided in the CONFIGID, we get the following for free:

In addition to enclave integrity, remote attestation will also attest the identity of the actual entity running
Enclave will be able to derive keys based on its identity, practically preventing secrets from being shared with different entities

To complete the picture, since CONFIGID is controlled entirely by the OS, the enclave must validate the key hash consistency. I would expect a flow similar to this:

1. OS creates the enclave with the expected public key hash in CONFIGID
2. OS invokes enclave initialization function, with a wrapped keypair as one of its parameters
3. Enclave unwraps the key and checks consistency with the hash in CONFIGID
4. If there is a mismatch, enclave initialization fails
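
The consistency check in this flow could look roughly like the sketch below. It assumes SHA-512 as the hash, chosen only because its 64-byte digest fills CONFIGID exactly; key unwrapping is left out, and `configid_for_pubkey` and `enclave_init` are hypothetical names.

```python
import hashlib
import hmac

CONFIGID_SIZE = 64  # CONFIGID is a 64-byte field


def configid_for_pubkey(pubkey: bytes) -> bytes:
    """Loader side: value to place in CONFIGID (assumed: SHA-512 of the key)."""
    return hashlib.sha512(pubkey).digest()


def enclave_init(config_id: bytes, unwrapped_pubkey: bytes) -> None:
    """Enclave side: fail initialization unless the key matches CONFIGID."""
    if len(config_id) != CONFIGID_SIZE:
        raise ValueError("CONFIGID must be 64 bytes")
    expected = configid_for_pubkey(unwrapped_pubkey)
    if not hmac.compare_digest(config_id, expected):
        raise PermissionError("CONFIGID does not match the provided key")
```

The enclave only reads CONFIGID here; the untrusted side chose it, which is why the check has to happen before any key derived with CONFIGID is used.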

We can further extend this approach and use some bits of CONFIGID to indicate which crypto algorithm is used (e.g. RSA vs. EC). If this approach is implemented, I would expect the enclave to use this value in step 3 to determine how to parse the key.

Some more opens I have in mind:

How will CONFIGID be initialized? It will require changes in Gramine loader, but the question is what is the right way to do it
How is CONFIGSVN managed? I am not sure whether it's entirely up to the enclave developer or Gramine needs to add some support

boryspoplawski commented 2 years ago

One other thing that we forgot in this discussion is that CONFIGID is also important for SGX sealing feature -- it can be thrown in the key-derivation mix. So this is another important reason to introduce this new field.

Ok, this is the real "added value" of this feature, I see now.

Update: on the other hand, since verifying this value still needs to be done by the enclave itself, e.g. that it's indeed a hash of the provided public key, it's not that much different from for example prepending sealed data with the expected key and rejecting non-matching inputs.

dimakuv commented 2 years ago

Thanks @DL8 for the great overview!

Let me summarize the important hardware-enforced properties of the new CONFIGID field:

CONFIGID is a new measurement field that can contain arbitrary data (total 64 bytes). The main purpose of this field is to signal "which specific configuration is used".
CONFIGID is included in the SGX Report, thus it can be analyzed by the remote party during SGX local/remote attestation.
CONFIGID is included in the key-derivation mix for the SGX sealing feature (EGETKEY instruction). I.e., SGX will produce different encryption keys for different CONFIGIDs.
CONFIGID cannot be modified after EINIT. The untrusted runtime assigns arbitrary data to this field. The enclave can only read this field. The enclave may: (1) verify CONFIGID data against some known data, (2) use CONFIGID unconditionally to e.g. choose some in-enclave configuration.

Same applies for CONFIGSVN (though it's a number, not a 64B stream of bytes).

Now what Gramine should do with CONFIGID is still unclear to me:

Gramine can expose CONFIGID in a read-only file /dev/attestation/configid, and delegate the actual CONFIGID-specific logic to the app on top.
Gramine should add a new mode that allows using CONFIGID for key derivation during EGETKEY. Similar to _sgx_mrenclave in FS mounts. But unclear if this is enough flexibility?
Gramine should use CONFIGID in some very specific way? But no idea in which way exactly.

DL8 commented 2 years ago

Same applies for CONFIGSVN (though it's a number, not a 64B stream of bytes).

CONFIGSVN has the same semantics as ISVSVN. If the enclave requests a key that includes CONFIGID:

A different key is derived per CONFIGSVN
The requested CONFIGSVN is checked. If it's greater than the value with which the enclave was launched, EGETKEY fails
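
A toy model of these two rules, with HMAC-SHA256 standing in for the hardware key derivation. This is only an assumption for illustration: real EGETKEY mixes in more inputs and uses a device-unique secret that software never sees. The `Enclave` class and `toy_egetkey` are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Enclave:
    mrenclave: bytes
    config_id: bytes
    config_svn: int  # value the enclave was launched with


def toy_egetkey(device_secret: bytes, enclave: Enclave, requested_config_svn: int) -> bytes:
    """Derive a 128-bit key; reject requests for a newer CONFIGSVN."""
    if requested_config_svn > enclave.config_svn:
        raise PermissionError("requested CONFIGSVN is newer than the enclave's")
    material = (
        enclave.mrenclave
        + enclave.config_id
        + requested_config_svn.to_bytes(2, "little")
    )
    return hmac.new(device_secret, material, hashlib.sha256).digest()[:16]
```

In this model an enclave launched with CONFIGSVN 2 can still request the key for CONFIGSVN 1 (for example to migrate old sealed data), while an enclave launched with CONFIGSVN 1 cannot obtain the key for CONFIGSVN 2.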

Gramine can expose CONFIGID in a read-only file /dev/attestation/configid, and delegate the actual CONFIGID-specific logic to the app on top.

Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT. Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?

Gramine should add a new mode that allows using CONFIGID for key derivation during EGETKEY. Similar to _sgx_mrenclave in FS mounts. But unclear if this is enough flexibility?

I'm not sure: without KSS, the only key policies are MRSIGNER and MRENCLAVE. With KSS, the following policies are added:

NOISVPRODID - exclude ISVPRODID
CONFIGID - include CONFIGID
ISVFAMILYID - include ISVFAMILYID
ISVEXTPRODID - include ISVEXTPRODID

That potentially gives us 16 options per _sgx_mrenclave and _sgx_mrsigner key. At the moment I don't have enough data and concrete use cases, so I don't know if it's OK to limit the key policy beyond MRSIGNER/MRENCLAVE.
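
The count of 16 can be checked by enumerating the combinations. The sketch below assumes the KEYPOLICY bit values used by the Intel SGX SDK headers (MRENCLAVE 0x01, MRSIGNER 0x02, NOISVPRODID 0x04, CONFIGID 0x08, ISVFAMILYID 0x10, ISVEXTPRODID 0x20); `kss_policy_variants` is a hypothetical helper.

```python
from enum import IntFlag
from itertools import combinations


class KeyPolicy(IntFlag):
    MRENCLAVE = 0x01
    MRSIGNER = 0x02
    NOISVPRODID = 0x04
    CONFIGID = 0x08
    ISVFAMILYID = 0x10
    ISVEXTPRODID = 0x20


KSS_BITS = (
    KeyPolicy.NOISVPRODID,
    KeyPolicy.CONFIGID,
    KeyPolicy.ISVFAMILYID,
    KeyPolicy.ISVEXTPRODID,
)


def kss_policy_variants(base: KeyPolicy) -> list:
    """All policies that combine 'base' with any subset of the four KSS bits."""
    variants = []
    for count in range(len(KSS_BITS) + 1):
        for combo in combinations(KSS_BITS, count):
            policy = base
            for bit in combo:
                policy |= bit
            variants.append(policy)
    return variants
```

Calling `kss_policy_variants(KeyPolicy.MRENCLAVE)` yields the 16 candidates for an `_sgx_mrenclave`-style key, which shows why exposing every combination as a separate named key type would not scale well.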

dimakuv commented 2 years ago

A different key is derived per CONFIGSVN

@DL8 Did you make a typo here? Did you mean "...derived per CONFIGID"?

Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT. Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?

No, none of these fields are exposed to the enclave (more specifically, to the application that runs on top of Gramine that runs inside the enclave). Nobody yet asked us to add such features.

DL8 commented 2 years ago

A different key is derived per CONFIGSVN

@DL8 Did you make a typo here? Did you mean "...derived per CONFIGID"?

It's not a typo: this statement is about key derivation behavior for a given CONFIGID. The second statement is important: just like ISVSVN and CPUSVNs, an enclave can't ask for keys with newer security versions than what it runs with.

Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT. Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?

No, none of these fields are exposed to the enclave (more specifically, to the application that runs on top of Gramine that runs inside the enclave). Nobody yet asked us to add such features.

So I don't see a reason to add dedicated APIs for that. Technically, these fields are exposed via /dev/attestation/report, which I believe is good enough.
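
For reference, a sketch of how an application could pull the KSS-related fields out of the raw report bytes. The offsets assume the 432-byte SGX report layout as declared in the Intel SGX SDK headers, and `parse_kss_fields` is a hypothetical helper, not a Gramine API.

```python
REPORT_SIZE = 432            # 384-byte body + 32-byte key ID + 16-byte MAC
KSS_ATTRIBUTE_BIT = 1 << 7   # ATTRIBUTES.KSS


def parse_kss_fields(report: bytes) -> dict:
    """Extract the KSS-related fields from a raw SGX report (assumed layout)."""
    if len(report) != REPORT_SIZE:
        raise ValueError("unexpected report size")
    flags = int.from_bytes(report[48:56], "little")  # ATTRIBUTES.FLAGS
    return {
        "kss_enabled": bool(flags & KSS_ATTRIBUTE_BIT),
        "isv_ext_prod_id": report[32:48],
        "config_id": report[192:256],
        "isv_prod_id": int.from_bytes(report[256:258], "little"),
        "isv_svn": int.from_bytes(report[258:260], "little"),
        "config_svn": int.from_bytes(report[260:262], "little"),
        "isv_family_id": report[304:320],
    }
```

Inside Gramine, the input would be the bytes read from /dev/attestation/report; the exact sequence of pseudo-file writes needed before that read is not covered here.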

dimakuv commented 2 years ago

Ok, so now we all agree that the main benefit of CONFIGID is for local file sealing -- so that only the instance of the enclave that was started with a specific CONFIGID may unseal the previously sealed file. (This security property relies on the correct implementation of the verification of CONFIGID by the SGX enclave itself, e.g., during the Gramine trusted-PAL initialization.)

@DL8 I still don't understand what is the benefit of your "seal the private key with CONFIGID-specific EGETKEY". So the use case is sealing a key (e.g. an enclave-generated long-session private key) with CONFIGID = hash(public-key). Using such sealing guarantees that only the instance of the enclave that was started with the specific public key can unseal and use the private key.

But what's the benefit in this? Say two instances of the same enclave (with the same MRSIGNER and MRENCLAVE) are started and are given this CONFIGID = hash(public-key). Nothing prevents the malicious OS from giving the same CONFIGID to two instances. Now each of the enclave instances will unseal the file and verify that the unsealed private key indeed corresponds to the hash found in CONFIGID. So both enclave instances are happy and can use this private key. So what did we achieve?

DL8 commented 2 years ago

@DL8 I still don't understand what is the benefit of your "seal the private key with CONFIGID-specific EGETKEY". So the use case is sealing a key (e.g. an enclave-generated long-session private key) with CONFIGID = hash(public-key). Using such sealing guarantees that only the instance of the enclave that was started with the specific public key can unseal and use the private key.

But what's the benefit in this? Say two instances of the same enclave (with the same MRSIGNER and MRENCLAVE) are started and are given this CONFIGID = hash(public-key). Nothing prevents the malicious OS from giving the same CONFIGID to two instances. Now each of the enclave instances will unseal the file and verify that the unsealed private key indeed corresponds to the hash found in CONFIGID. So both enclave instances are happy and can use this private key. So what did we achieve?

It may be worth a separate discussion, because this scenario is not specific to KSS. In general, running the same enclave with the same configuration more than once simultaneously is not necessarily a bug, but it depends on the exact use case. It might be a concern if, for example, the enclave stores blobs out-of-enclave (in files or in memory). In that case, malicious OS may swap valid blobs that were created by A with blobs created by B.

To be fair, I think it would be more appropriate to discuss the benefits of KSS compared to the current situation:

The exact configuration (in this example the identity of the user) of the enclave is attested with no additional efforts (CONFIGID is part of the target info in EREPORT, so MAC verification will fail if the configuration is inconsistent)
When appropriate, derived keys can be affected by the configuration. In our case, this allows us, for example, to protect files with different keys per user
When appropriate, combined with other fields (e.g. ISVFAMILYID), two different enclaves with the same CONFIGID may be able to share keys (e.g. protected files per user that are shared between the two enclaves)

dimakuv commented 2 years ago

We're starting to go in circles... We still have a group of people (myself, @mkow, @boryspoplawski, @BFuhry) that doesn't see the necessity in KSS (other than convenience).

The exact configuration (in this example the identity of the user) of the enclave is attested with no additional efforts (CONFIGID is part of the target info in EREPORT, so MAC verification will fail if the configuration is inconsistent)

Sure, this is useful for convenience. But it is not strictly necessary -- the same guarantees can be achieved by a software-defined protocol, as outlined by @boryspoplawski above. So it's not a real "added value" of KSS.

When appropriate, derived keys can be affected by the configuration. In our case, this allows us, for example, to protect files with different keys per user

I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.

DL8 commented 2 years ago

I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.

Let's assume that the enclave checks for CONFIGID consistency (in this case, that the user key's hash matches the hash in CONFIGID). In this case, host can't forge CONFIGID value to read files of an arbitrary user without having their key.

Another added value is that different sealing keys are derived for each user. Therefore, if there is a vulnerability in the enclave that leaks the sealing key in use:

Without KSS, the same key is derived for all users. Therefore, all files are at risk
With KSS, a different key is derived per CONFIGID (== user), and therefore only the same user's files are at risk. To leak all files, attacker would have to leak the sealing key for each user separately, which requires more work and user keys may not be available on the system (or their usage may be limited by other means)

lejunzhu commented 2 years ago

I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.

Here's a very simple example: let's say a CSP is hosting a file upload/download service backed by SGX. I can upload my own client certificate and create an instance of it.

Every time the enclave starts, the enclave code checks hash(allowed client cert) == CONFIGID. The allowed cert is a plain text file anyone can change.
Every time I use the service (e.g. with RA TLS), I check CONFIGID == hash(my client cert).
Everything I upload will be encrypted with the sealing key derived from MRENCLAVE+CONFIGID.

And that's it. Malicious host can change the CONFIGID to any value, but the file I have uploaded can't be decrypted unless CONFIGID == hash(my client cert). If the CONFIGID is set as such, the enclave will only accept connections from me, because only I have the corresponding private key. If there are two enclaves both having my CONFIGID, they will behave the same, and my files remain secret.
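
The three steps can be modeled in a few lines. This is a toy sketch under stated assumptions: SHA-512 of the certificate is used as CONFIGID, and HMAC-SHA256 stands in for the hardware sealing-key derivation from MRENCLAVE+CONFIGID. All function names are hypothetical.

```python
import hashlib
import hmac


def expected_configid(client_cert: bytes) -> bytes:
    """Assumed convention: CONFIGID = SHA-512(client certificate)."""
    return hashlib.sha512(client_cert).digest()


def enclave_startup_ok(config_id: bytes, allowed_cert: bytes) -> bool:
    """Step 1: the enclave checks the (untrusted) cert file against CONFIGID."""
    return hmac.compare_digest(config_id, expected_configid(allowed_cert))


def client_accepts(attested_config_id: bytes, my_cert: bytes) -> bool:
    """Step 2: the user checks the CONFIGID found in the attested report."""
    return hmac.compare_digest(attested_config_id, expected_configid(my_cert))


def toy_sealing_key(device_secret: bytes, mrenclave: bytes, config_id: bytes) -> bytes:
    """Step 3: stand-in for a sealing key derived from MRENCLAVE+CONFIGID."""
    return hmac.new(device_secret, mrenclave + config_id, hashlib.sha256).digest()[:16]
```

If the host launches the enclave with another user's certificate and CONFIGID, startup succeeds but the derived sealing key differs, so the first user's files stay unreadable; if it keeps the first user's CONFIGID and swaps only the certificate file, the startup check fails.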

dimakuv commented 2 years ago

@lejunzhu Thank you!

I finally understand the usefulness of KSS. The missing ingredient for me was Step 2 above -- the remote user verifies CONFIGID to be a known golden reference value, and refuses to talk with the remote enclave if the CONFIGID is not expected.

So KSS is useful to keep secrets sealed per user (by throwing CONFIGID into the EGETKEY derivation material), and the user must verify the correctness of CONFIGID during SGX remote attestation.

DL8 commented 2 years ago

I think at this point the easy part is to add support for new build-time attributes of the enclave (ISVEXTPRODID and ISVFAMILYID). For the implementation, I suggest adding the following attributes to the manifest:

sgx.kss = [true|false] (default: false)
sgx.isvextprodid = "[16-byte hex value]"  (default: "0x00000000000000000000000000000000")
sgx.isvfamily    = "[16-byte hex value]"  (default: "0x00000000000000000000000000000000")

SGX signing tool will be modified as follows:

If sgx.kss = true:
1. Add the new fields to the SIGSTRUCT
2. Set ATTRIBUTES.FIELDS.KSS = 1 (bit 7)
Else, use legacy flow
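
A rough sketch of that signing-tool logic, using the manifest keys proposed above. The `parse_id16` and `apply_kss` helpers are hypothetical, and the byte order in which the hex string maps to the SIGSTRUCT fields is an open detail that this sketch does not settle.

```python
KSS_ATTRIBUTE_BIT = 1 << 7
ZERO_ID = "0x" + "00" * 16  # proposed default for both 16-byte IDs


def parse_id16(value: str) -> bytes:
    """Parse a '0x'-prefixed (or bare) hex string into exactly 16 bytes."""
    text = value[2:] if value.lower().startswith("0x") else value
    raw = bytes.fromhex(text)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte hex value")
    return raw


def apply_kss(manifest_sgx: dict, attribute_flags: int) -> dict:
    """Return the attribute flags and KSS IDs to place into SIGSTRUCT."""
    if not manifest_sgx.get("kss", False):
        # Legacy flow: flags untouched, KSS fields stay zero
        return {
            "attribute_flags": attribute_flags,
            "isvextprodid": bytes(16),
            "isvfamilyid": bytes(16),
        }
    return {
        "attribute_flags": attribute_flags | KSS_ATTRIBUTE_BIT,
        "isvextprodid": parse_id16(manifest_sgx.get("isvextprodid", ZERO_ID)),
        "isvfamilyid": parse_id16(manifest_sgx.get("isvfamily", ZERO_ID)),
    }
```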

SGX loader may have to change to initialize the enclave with KSS if needed.

One open question about this change is how to fix RA-TLS verification callback. Current signature is as follows:

typedef int (*verify_measurements_cb_t)(const char* mrenclave, const char* mrsigner,
                                        const char* isv_prod_id, const char* isv_svn);

To support KSS properly, 4 additional arguments will have to be added: ISVEXTPRODID, ISVFAMILYID, CONFIGID and CONFIGSVN. In order to be backwards compatible, a new callback for KSS verification with the additional arguments will have to be added and invoked if KSS bit is set in the report's attributes. My concern with this approach is with maintainability and future compatibility: what if additional fields are added? Will an additional callback have to be defined, having its own setter and additional logic in Gramine to choose the correct one? I can see two possible mitigations for this concern:

Pass the report as is instead of selected fields (might be slightly more cumbersome for the user, but reduces the likelihood that more arguments will be needed)
Add a struct with additional arguments to the callback, such that new fields will be appended to it as needed. In this case developer will have to handle struct versioning (in case fields were added, but the app was built with an older version that doesn't have them), but this leaves us with a single callback definition even if new fields are needed
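
The second option can be illustrated with a size-prefixed struct, modeled here with Python's ctypes as a stand-in for the C definitions. The struct names, field order and the `struct_size` convention are assumptions for illustration and not an existing RA-TLS interface.

```python
import ctypes
from typing import Optional


class MeasurementsV1(ctypes.Structure):
    """Fields that the current callback already receives."""
    _fields_ = [
        ("struct_size", ctypes.c_uint32),  # set by the caller to sizeof(its struct)
        ("mrenclave", ctypes.c_uint8 * 32),
        ("mrsigner", ctypes.c_uint8 * 32),
        ("isv_prod_id", ctypes.c_uint16),
        ("isv_svn", ctypes.c_uint16),
    ]


class MeasurementsV2(ctypes.Structure):
    """Same prefix as V1, with the KSS fields appended at the end."""
    _fields_ = MeasurementsV1._fields_ + [
        ("isv_ext_prod_id", ctypes.c_uint8 * 16),
        ("isv_family_id", ctypes.c_uint8 * 16),
        ("config_id", ctypes.c_uint8 * 64),
        ("config_svn", ctypes.c_uint16),
    ]


def read_config_svn(raw: bytes) -> Optional[int]:
    """Callback-side helper: return CONFIGSVN, or None if the caller is older."""
    header = MeasurementsV1.from_buffer_copy(raw[: ctypes.sizeof(MeasurementsV1)])
    needed = MeasurementsV2.config_svn.offset + MeasurementsV2.config_svn.size
    if header.struct_size < needed or len(raw) < ctypes.sizeof(MeasurementsV2):
        return None
    return MeasurementsV2.from_buffer_copy(raw).config_svn
```

New fields are only ever appended, so a callback built against a newer definition can detect an older caller from `struct_size` and fall back, which keeps a single callback signature.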