Open DL8 opened 2 years ago
The same enclave binary may be executed in the context of two different processes (they might even belong to different users). They will have the same MRSIGNER and MRENCLAVE, so without KSS they will derive the same keys, which might lead to unintended data disclosure
I don't think this is the case in Gramine model:
- The true "identity" of the enclave will always be its MRENCLAVE in Gramine model. If we change the identity on execve() from "Gramine ver XYZ+binary QWER" to "Gramine ver XYZ+binary ASDF" then it may seem that the current identity is really "Gramine ver XYZ+binary ASDF", while in fact QWER could make persistent changes inside the enclave and still be in control. This is because Gramine doesn't provide in-enclave sandboxing (this is technically infeasible on SGX).
A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)
This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?
An application may consist of several enclaves ("microservices"), which may want to share keys (e.g. for shared protected storage)
But they already can, via our remote attestation APIs? This seems unrelated to KSS, it's just about attestation?
The same enclave binary may be executed in the context of two different processes (they might even belong to different users). They will have the same MRSIGNER and MRENCLAVE, so without KSS they will derive the same keys, which might lead to unintended data disclosure
A well-written enclave (app) shall not disclose any confidential data. Well, how to guarantee an enclave (app) is correct (w/o violating any security/safety guarantees) is an orthogonal and hard problem. So for this case, even if they're the very same enclaves, they should be able to unseal, but this cannot lead to data disclosure as long as they're well-written/correct.
- The true "identity" of the enclave will always be its MRENCLAVE in Gramine model. If we change the identity on execve() from "Gramine ver XYZ+binary QWER" to "Gramine ver XYZ+binary ASDF" then it may seem that the current identity is really "Gramine ver XYZ+binary ASDF", while in fact QWER could make persistent changes inside the enclave and still be in control. This is because Gramine doesn't provide in-enclave sandboxing (this is technically infeasible on SGX).
Yes, but shouldn't the identity be "Gramine ver XYZ+binary QWER+binary ASDF"? The design rationale here of KSS (specifically CONFIGID in this case) is to indicate/reflect what post-EINIT additional content may be accepted rather than removing "binary QWER" out of TCB.
A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)
This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?
AFAIRecall, it's targeting dynamic loading of code/data after enclave initialization. One of the envisioned typical scenarios is for language runtimes where the runtime is built, distributed and deployed as a single enclave binary while allowing end users to attach the config and modules it loads more easily. It's kind of for separation and composition. Together w/ CONFIGSVN it's easier for security updates.
An application may consist of several enclaves ("microservices"), which may want to share keys (e.g. for shared protected storage)
But they already can, via our remote attestation APIs? This seems unrelated to KSS, it's just about attestation?
I guess @DL8's intention here was to talk about the use case of ISVFAMILYID and ISVEXTPRODID (together w/ KEYREQUEST and KEYPOLICY) for sealing key derivation. This allows more flexible sealing key sharing schemes (thus for shared protected storage) on a local platform.
A remote party may want to attest not only which binary is run, but also which specific configuration is used (CONFIGID and CONFIGSVN)
This sounds more relevant to Gramine, but why wouldn't the config be placed in the manifest? Could you describe some real-world use case for this?
I can think of a few:
When the user gets a pre-built docker image, instead of building his own app. The user may want to keep MRENCLAVE to a well known value, in order to collaborate with other Gramine apps via remote attestation.
How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?
Another case is when the config includes MRENCLAVE values of more than one app, so it can only be generated after building the app.
But in Gramine MRENCLAVE reflects not only the app, but also Gramine itself and manifest (and hence initial filesystem state).
How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?
I guess the logic will be something like this:
- In Gramine manifest, we specify a list of binaries/scripts that can be run using fs.mounts. This restricts which apps can run.
- With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.
- Even more flexible is if we add a pseudo-file /dev/attestation/configid that the apps can write to. Then a modified Python interpreter, whenever it loads the main Python script, writes to configid. Then this reflects the "current running Python script" in Gramine.
So I think Gramine manifest already has the means to restrict the apps safe to run. And with KSS, we have an interesting new property of "SGX attestation reflects the currently running binary/script".
With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.
What's the point? You can read /proc/pid/cmdline or something inside Gramine to get the current running app. Also you probably know already which app is currently running, so I really don't see a point.
So I think Gramine manifest already has the means to restrict the apps safe to run. And with KSS, we have an interesting new property of "SGX attestation reflects the currently running binary/script".
But what's the point? This cannot ever be a security feature - all the code (apps) running must trust each other anyway.
You can read /proc/pid/cmdline or something inside Gramine to get the current running app. Also you probably know already which app is currently running, so I really don't see a point.
I'm not talking about the Gramine enclave itself. But rather about the remote parties and what information they can learn about the Gramine enclave based on the SGX Report received.
But the code handling remote attestation could just handle this - it knows when / in what app it runs
But the code handling remote attestation could just handle this - it knows when / in what app it runs
What are we talking about? Sorry, I am confused by "could just handle this" -- handle what exactly? The SGX Report is an Intel SGX Hardware defined struct, it's not a software construct. You cannot just add some fields to it as you wish.
What are we talking about?
You said the purpose of this feature is to know the "currently running binary" in Gramine. My point is that this is not a security feature (as explained before) and if you know all inputs that make the final MRENCLAVE (which you probably have to, if you trust it), then you already know what program is running inside Gramine.
You cannot just add some fields to it as you wish.
You can have any data appended (possibly encrypted) and embed hash of it in user data.
You can have any data appended (possibly encrypted) and embed hash of it in user data.
But how are you gonna check it? We currently use sgx_report.user_data for the hash of the public key of the ephemeral keypair generated inside of the SGX enclave (for RA-TLS purposes). If we put a hash of the "currently running binary" in this mix, the remote party will have a really hard time figuring out with which binary (out of a set of e.g. 100 possible binaries) it talks -- it will need to iterate through all 100 binaries, mix them with a public key of RA-TLS cert, calculate the hash and compare. Tedious.
My point is that this is not a security feature (as explained before)...
I guess I agree with this. It doesn't really add more security. It just makes attestation (more specifically, verification by the remote user) easier and more friendly, as it gives you a couple more fields -- in addition to sgx_report.user_data -- that can be updated and put inside the SGX report.
But how are you gonna check it? We currently use sgx_report.user_data for the hash of the public key of the ephemeral keypair generated inside of the SGX enclave (for RA-TLS purposes). If we put a hash of the "currently running binary" in this mix, the remote party will have a really hard time figuring out with which binary (out of a set of e.g. 100 possible binaries) it talks -- it will need to iterate through all 100 binaries, mix them with a public key of RA-TLS cert, calculate the hash and compare. Tedious.
As I said, you can append any data and send it together with the report. The data can be anything, including the binary name - you just verify that the hash matches.
It just makes attestation (more specifically, verification by the remote user) easier and more friendly, as it gives you a couple more fields -- in addition to sgx_report.user_data -- that can be updated and put inside the SGX report.
But how does it make anything better? I just fail to see any value in this feature (as described in this thread, maybe there are other usecases).
As I said, you can append any data and send it together with the report. The data can be anything, including the binary name - you just verify that the hash matches.
But this would require some software-defined convention/standard. E.g., Microsoft OpenEnclave and Gramine won't be able to recognize each other's "appended data" unless they adhere to this standard. Surely this can be done, but having a special hardware-defined field like CONFIGID seems like an easier route.
One other thing that we forgot in this discussion is that CONFIGID is also important for SGX sealing feature -- it can be thrown in the key-derivation mix. So this is another important reason to introduce this new field.
The user may want to keep MRENCLAVE to a well known value, in order to collaborate with other Gramine apps via remote attestation.
How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?
Here's an example of what I mean. Let's say a researcher wrote an application to do machine learning using multiple independent data sources. The app is distributed as a docker image.
Now everything seems fine, except when two contributors run the app on the same physical machine, they will get the same sealing key, and someone who has access to the machine can copy their files around and generate a new data source mixing their data. This could potentially cause some problem in the algorithm.
Of course, without CONFIGID, there are also a few ways to avoid this. For example, the contributor can download the image, sign the application with a newly generated key, upload the new image and run it. The app can now use MRSIGNER+MRENCLAVE to generate the sealing key. But, with CONFIGID, this is simpler. We can use CONFIGID = hash of the SSH public key, and make the trusted code verify this at startup. Then the sealing key will be different, while the image can be the same.
How would we decide which apps are ok/safe to run, if they are not reflected in manifest (hence MRENCLAVE)?
Following the whole discussion above, my conclusion so far is that CONFIGID value cannot be part of the manifest, because we would expect the same binary to support different CONFIGIDs without affecting MRENCLAVE. I would expect some "is KSS required" field in the manifest though. Other fields may be applicable if we want to reserve part of the CONFIGID for Gramine use. I don't see a valid use case for that, so I'll leave it open for discussion.
In general, it is up to enclave developer to define what is a valid/safe CONFIGID value.
- In Gramine manifest, we specify a list of binaries/scripts that can be run using fs.mounts. This restricts which apps can run.
- With the added KSS support, Gramine sets CONFIGID and CONFIGSVN (and maybe other new SGX fields) to some kind of secure hash of the loaded binary. This reflects the "currently running binary" in Gramine; it is reflected to remote enclaves/users in the CONFIGID and CONFIGSVN fields of the SGX Report.
- Even more flexible is if we add a pseudo-file /dev/attestation/configid that the apps can write to. Then a modified Python interpreter, whenever it loads the main Python script, writes to configid. Then this reflects the "current running Python script" in Gramine.
CONFIGID cannot be modified after EINIT, so the pseudo-file should be read-only. If I understand your scenario correctly, there is the Python interpreter and several scripts from which one may be executed. In this case, I would expect the loader to initialize CONFIGID with some enum indicating which one of the scripts should be executed. The enclave will read this value and execute the corresponding script.
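To make the "enum in CONFIGID" idea concrete, here is a minimal sketch of what the in-enclave side could look like. This is not Gramine code: the script table, the `select_script` helper, and the encoding (selector in byte 0, all other bytes zero) are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONFIGID_SIZE 64 /* CONFIGID is a 64-byte field */

/* Hypothetical list of scripts this enclave is willing to run. */
static const char* const g_scripts[] = { "train.py", "infer.py", "export.py" };
#define NUM_SCRIPTS (sizeof(g_scripts) / sizeof(g_scripts[0]))

/* Returns the index into g_scripts selected by CONFIGID, or -1 if the value
 * is not one the enclave accepts.
 * Assumed encoding: byte 0 is the selector, bytes 1..63 must be zero. */
int select_script(const uint8_t configid[CONFIGID_SIZE]) {
    assert(configid != NULL);
    for (size_t i = 1; i < CONFIGID_SIZE; i++) {
        if (configid[i] != 0)
            return -1; /* unknown encoding: refuse rather than guess */
    }
    if (configid[0] >= NUM_SCRIPTS)
        return -1; /* selector outside the accepted set */
    return (int)configid[0];
}
```

The enclave would read the launch-time CONFIGID (for example from its own report), call something like `select_script()`, and refuse to start on `-1`. Since the host picks the value, rejecting everything outside the known set is the important part.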
In general, it's up to the enclave developer to choose how to handle CONFIGID. As I see it, there are two options:
Note that the two approaches can be combined and it is up to the enclave developer to decide which CONFIGIDs are accepted and how to handle them.
One more interesting use case:
We can use CONFIGID = hash of the SSH public key, and make the trusted code verify this at startup.
This is a very good example, so let's refine it. Assume that the entity running in the enclave has some keypair and the public key is its identity. If the hash of the public key is provided in the CONFIGID, we get the following for free:
To complete the picture, since CONFIGID is controlled entirely by the OS, the enclave must validate the key hash consistency. I would expect a flow similar to this:
We can further extend this approach and use some bits of CONFIGID to indicate which crypto algorithm is used (e.g. RSA vs. EC). If this approach is implemented, I would expect the enclave to use this value in step 3 to determine how to parse the key.
Some more opens I have in mind:
One other thing that we forgot in this discussion is that CONFIGID is also important for SGX sealing feature -- it can be thrown in the key-derivation mix. So this is another important reason to introduce this new field.
Ok, this is the real "added value" of this feature, I see now.
Update: on the other hand, since verifying this value still needs to be done by the enclave itself, e.g. that it's indeed a hash of the provided public key, it's not that much different from for example prepending sealed data with the expected key and rejecting non-matching inputs.
Thanks @DL8 for the great overview!
Let me summarize the important hardware-enforced properties of the new CONFIGID field:
... EGETKEY instruction). I.e., SGX will produce different encryption keys for different CONFIGIDs.
Same applies for CONFIGSVN (though it's a number, not a 64B stream of bytes).
Now what Gramine should do with CONFIGID is still unclear to me:
- Gramine can expose CONFIGID in a read-only file /dev/attestation/configid, and delegate the actual CONFIGID-specific logic to the app on top.
- Gramine should add a new mode to allow to use CONFIGID for key derivation during EGETKEY. Similar to _sgx_mrenclave in FS mounts. But unclear if this is enough flexibility?
Same applies for CONFIGSVN (though it's a number, not a 64B stream of bytes).
CONFIGSVN has the same semantics as ISVSVN. If the enclave requests a key that includes CONFIGID:
- A different key is derived per CONFIGSVN
- CONFIGSVN is checked. If it's greater than the value with which the enclave was launched, EGETKEY fails
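The CONFIGSVN rule above can be written down as a tiny model. This is only an illustration of the described behavior, not how the hardware or Gramine implements it: EGETKEY is a CPU instruction, and the struct and function names below are invented for the sketch. The 16-bit width of CONFIGSVN is my recollection of the field size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the launch-time state that matters here. */
struct enclave_state {
    uint16_t config_svn; /* CONFIGSVN the enclave was launched with */
};

/* Simplified model of a key request whose policy includes CONFIGID. */
struct key_request {
    uint16_t config_svn; /* CONFIGSVN the enclave asks the key for */
};

/* Mirrors the rule from the comment above: a request for a CONFIGSVN greater
 * than the launched one fails. Any value up to the launched one is allowed,
 * and the requested value feeds the derivation, so each CONFIGSVN yields a
 * different key. */
bool key_request_allowed(const struct enclave_state* enclave,
                         const struct key_request* req) {
    assert(enclave != NULL && req != NULL);
    return req->config_svn <= enclave->config_svn;
}
```

This is the same shape as the ISVSVN check: a newer configuration can still ask for keys of older configuration versions (to migrate sealed data), but not the other way round.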
- Gramine can expose CONFIGID in a read-only file /dev/attestation/configid, and delegate the actual CONFIGID-specific logic to the app on top.
Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT.
Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?
- Gramine should add a new mode to allow to use CONFIGID for key derivation during EGETKEY. Similar to _sgx_mrenclave in FS mounts. But unclear if this is enough flexibility?
I'm not sure: without KSS, the only key policies are MRSIGNER and MRENCLAVE. With KSS, the following policies are added:
- NOISVPRODID - exclude ISVPRODID
- CONFIGID - include CONFIGID
- ISVFAMILYID - include ISVFAMILYID
- ISVEXTPRODID - include ISVEXTPRODID
That potentially gives us 16 options per _sgx_mrenclave and _sgx_mrsigner key. At the moment I don't have enough data and concrete use cases, so I don't know if it's OK to limit the key policy beyond MRSIGNER/MRENCLAVE.
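For reference, the "16 options" figure is just the four KSS policy bits being independently on or off. A small sketch that enumerates them follows. The bit values are as I recall them from the Intel SGX SDK headers and should be double-checked against the SDM; the `count_kss_variants` helper and the enum names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* KEYPOLICY bits (values from memory of the Intel SGX SDK headers;
 * verify against the SDM before relying on them). */
enum {
    KEYPOLICY_MRENCLAVE    = 0x0001,
    KEYPOLICY_MRSIGNER     = 0x0002,
    KEYPOLICY_NOISVPRODID  = 0x0004,
    KEYPOLICY_CONFIGID     = 0x0008,
    KEYPOLICY_ISVFAMILYID  = 0x0010,
    KEYPOLICY_ISVEXTPRODID = 0x0020,
};

#define KEYPOLICY_KSS_MASK (KEYPOLICY_NOISVPRODID | KEYPOLICY_CONFIGID | \
                            KEYPOLICY_ISVFAMILYID | KEYPOLICY_ISVEXTPRODID)

/* Counts the policies that use exactly one base bit (MRENCLAVE or MRSIGNER)
 * and differ only in which of the four KSS bits are set. */
unsigned count_kss_variants(uint16_t base) {
    assert(base == KEYPOLICY_MRENCLAVE || base == KEYPOLICY_MRSIGNER);
    unsigned count = 0;
    for (uint16_t policy = 0; policy <= 0x003F; policy++) {
        /* keep policies whose non-KSS part is exactly the base bit */
        if ((policy & ~KEYPOLICY_KSS_MASK) == base)
            count++;
    }
    return count;
}
```

Both `count_kss_variants(KEYPOLICY_MRENCLAVE)` and `count_kss_variants(KEYPOLICY_MRSIGNER)` come out as 16, which matches the number quoted above. Whether Gramine should expose all of them as separate key names is exactly the open question.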
- A different key is derived per CONFIGSVN
@DL8 Did you make a typo here? Did you mean "...derived per CONFIGID"?
Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT. Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?
No, none of these fields are exposed to the enclave (more specifically, to the application that runs on top of Gramine that runs inside the enclave). Nobody yet asked us to add such features.
- A different key is derived per CONFIGSVN
@DL8 Did you make a typo here? Did you mean "...derived per CONFIGID"?
It's not a typo: this statement is about key derivation behavior for a given CONFIGID. The second statement is important: just like ISVSVN and CPUSVNs, an enclave can't ask for keys with newer security versions than what it runs with.
Is ISVPRODID exposed to the enclave? If it is, perhaps putting CONFIGID next to it is more appropriate. In addition to that, two more fields are introduced with KSS: ISVEXTPRODID and ISVFAMILYID. These two fields are part of SIGSTRUCT. Also, what about SVN values (CPUSVN and ISVSVN)? Are they exposed to the enclave?
No, none of these fields are exposed to the enclave (more specifically, to the application that runs on top of Gramine that runs inside the enclave). Nobody yet asked us to add such features.
So I don't see a reason to add dedicated APIs for that. Technically, these fields are exposed via /dev/attestation/report, which I believe is good enough.
Ok, so now we all agree that the main benefit of CONFIGID is for local file sealing -- so that only the instance of the enclave that was started with a specific CONFIGID may unseal the previously sealed file. (This security property relies on the correct implementation of the verification of CONFIGID by the SGX enclave itself, e.g., during the Gramine trusted-PAL initialization.)
@DL8 I still don't understand what is the benefit of your "seal the private key with CONFIGID-specific EGETKEY". So the use case is sealing a key (e.g. an enclave-generated long-session private key) with CONFIGID = hash(public-key). Using such sealing guarantees that only the instance of the enclave that was started with the specific public key can unseal and use the private key.
But what's the benefit in this? Say two instances of the same enclave (with the same MRSIGNER and MRENCLAVE) are started and are given this CONFIGID = hash(public-key). Nothing prevents the malicious OS from giving the same CONFIGID to two instances. Now each of the enclave instances will unseal the file and verify that the unsealed private key indeed corresponds to the hash found in CONFIGID. So both enclave instances are happy and can use this private key. So what did we achieve?
@DL8 I still don't understand what is the benefit of your "seal the private key with CONFIGID-specific EGETKEY". So the use case is sealing a key (e.g. an enclave-generated long-session private key) with CONFIGID = hash(public-key). Using such sealing guarantees that only the instance of the enclave that was started with the specific public key can unseal and use the private key.
But what's the benefit in this? Say two instances of the same enclave (with the same MRSIGNER and MRENCLAVE) are started and are given this CONFIGID = hash(public-key). Nothing prevents the malicious OS from giving the same CONFIGID to two instances. Now each of the enclave instances will unseal the file and verify that the unsealed private key indeed corresponds to the hash found in CONFIGID. So both enclave instances are happy and can use this private key. So what did we achieve?
It may be worth a separate discussion, because this scenario is not specific to KSS. In general, running the same enclave with the same configuration more than once simultaneously is not necessarily a bug, but it depends on the exact use case. It might be a concern if, for example, the enclave stores blobs out-of-enclave (in files or in memory). In that case, malicious OS may swap valid blobs that were created by A with blobs created by B.
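To be fair, I think it would be more appropriate to discuss the benefits of KSS compared to the current situation:
- The exact configuration (in this example the identity of the user) of the enclave is attested with no additional efforts (CONFIGID is part of the target info in EREPORT, so MAC verification will fail if the configuration is inconsistent)
- When appropriate, derived keys can be affected by the configuration. In our case, this allows us, for example, to protect files with different keys per user
- ... ISVFAMILYID), two different enclaves of the same CONFIGID may be able to share keys (e.g. protected files per users that are shared between the two enclaves)
We're starting to go in circles... We still have a group of people (myself, @mkow, @boryspoplawski, @BFuhry) that doesn't see the necessity in KSS (other than convenience).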
- The exact configuration (in this example the identity of the user) of the enclave is attested with no additional efforts (CONFIGID is part of the target info in EREPORT, so MAC verification will fail if the configuration is inconsistent)
Sure, this is useful for convenience. But it is not strictly necessary -- the same guarantees can be achieved by a software-defined protocol, as outlined by @boryspoplawski above. So it's not a real "added value" of KSS.
- When appropriate, derived keys can be affected by the configuration. In our case, this allows us, for example, to protect files with different keys per user
I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.
I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.
Let's assume that the enclave checks for CONFIGID consistency (in this case, that the user key's hash matches the hash in CONFIGID). In this case, host can't forge CONFIGID value to read files of an arbitrary user without having their key.
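A rough sketch of such a consistency check follows, under a few assumptions that are mine and not from the thread: the user key is hashed to a 32-byte digest (e.g. SHA-256) by some crypto library outside this snippet, the digest occupies the first 32 bytes of the 64-byte CONFIGID, and the remaining bytes are zero. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CONFIGID_SIZE 64
#define DIGEST_SIZE   32 /* assumed: a 32-byte digest of the user's public key */

/* Returns true iff CONFIGID equals the digest followed by zero padding.
 * Differences are accumulated instead of returning on the first mismatch,
 * so the running time does not depend on where the values differ. */
bool configid_matches_key_digest(const uint8_t configid[CONFIGID_SIZE],
                                 const uint8_t digest[DIGEST_SIZE]) {
    assert(configid != NULL && digest != NULL);
    uint8_t diff = 0;
    for (size_t i = 0; i < DIGEST_SIZE; i++)
        diff |= (uint8_t)(configid[i] ^ digest[i]);
    for (size_t i = DIGEST_SIZE; i < CONFIGID_SIZE; i++)
        diff |= configid[i]; /* padding must be all zeros */
    return diff == 0;
}
```

The enclave would compute the digest of the key it was handed, run this check against the CONFIGID from its own report, and abort on mismatch before deriving any sealing key.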
Another added value is that different sealing keys are derived for each user. Therefore, if there is a vulnerability in the enclave that leaks the sealing key in use:
... CONFIGID (== user), and therefore only the same user's files are at risk. To leak all files, attacker would have to leak the sealing key for each user separately, which requires more work and user keys may not be available on the system (or their usage may be limited by other means)
I still fail to see how exactly the files-per-user are protected with KSS. The CONFIGID is set up by the possibly-malicious host, and it seems that the SGX enclave doesn't have a "golden reference value" to compare this possibly-malicious CONFIGID against anything user-bound.
Here's a very simple example: let's say a CSP is hosting a file upload/download service backed by SGX. I can upload my own client certificate and create an instance of it.
And that's it. Malicious host can change the CONFIGID to any value, but the file I have uploaded can't be decrypted unless CONFIGID == hash(my client cert). If the CONFIGID is set as such, the enclave will only accept connections from me, because only I have the corresponding private key. If there are two enclaves both having my CONFIGID, they will behave the same, and my files remain secret.
@lejunzhu Thank you!
I finally understand the usefulness of KSS. The missing ingredient for me was Step 2 above -- the remote user verifies CONFIGID to be a known golden reference value, and refuses to talk with the remote enclave if the CONFIGID is not expected.
So KSS is useful to keep secrets sealed per user (by throwing CONFIGID into the EGETKEY derivation material), and the user must verify the correctness of CONFIGID during SGX remote attestation.
I think at this point the easy part is to add support for new build-time attributes of the enclave (ISVEXTPRODID and ISVFAMILYID). For the implementation, I suggest adding the following attributes to the manifest:
sgx.kss = [true|false] (default: false)
sgx.isvextprodid = "[16-byte hex value]" (default: "0x00000000000000000000000000000000")
sgx.isvfamily = "[16-byte hex value]" (default: "0x00000000000000000000000000000000")
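As an illustration of what consuming the proposed "16-byte hex value" format could involve, here is a small parser sketch. It is not taken from Gramine's signing tool; the function name is hypothetical, and the byte order (first two hex digits become byte 0) is an assumption that the real implementation would have to pin down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KSS_ID_SIZE 16 /* ISVEXTPRODID and ISVFAMILYID are 16 bytes each */

/* Converts one hex digit to its value, or returns -1 for any other char. */
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parses "0x" followed by exactly 32 hex digits into out[16].
 * Returns 0 on success, -1 on malformed input (out may then be
 * partially written). */
int parse_kss_id(const char* str, uint8_t out[KSS_ID_SIZE]) {
    assert(str != NULL && out != NULL);
    if (str[0] != '0' || (str[1] != 'x' && str[1] != 'X'))
        return -1;
    str += 2;
    for (size_t i = 0; i < KSS_ID_SIZE; i++) {
        int hi = hex_nibble(str[2 * i]);
        if (hi < 0) return -1; /* also catches a too-short string */
        int lo = hex_nibble(str[2 * i + 1]);
        if (lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    /* reject trailing characters, i.e. a too-long value */
    return str[2 * KSS_ID_SIZE] == '\0' ? 0 : -1;
}
```

Rejecting short or long values outright (instead of zero-padding) seems safer here, since a silently padded ID would change the derived keys.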
SGX signing tool will be modified as follows:
sgx.kss = true:
SGX loader may have to change to initialize the enclave with KSS if needed.
One open question about this change is how to fix RA-TLS verification callback. Current signature is as follows:
typedef int (*verify_measurements_cb_t)(const char* mrenclave, const char* mrsigner,
const char* isv_prod_id, const char* isv_svn);
To support KSS properly, 4 additional arguments will have to be added: ISVEXTPRODID, ISVFAMILYID, CONFIGID and CONFIGSVN. In order to be backwards compatible, a new callback for KSS verification with the additional arguments will have to be added and invoked if KSS bit is set in the report's attributes.
My concern with this approach is with maintainability and future compatibility: what if additional fields are added? Will an additional callback have to be defined, having its own setter and additional logic in Gramine to choose the correct one?
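One possible shape that avoids adding a new callback per new field is to pass a single struct. This is only a sketch of that general idea, not an existing RA-TLS API and not necessarily what was proposed in this thread: the struct, the `_ex` callback type, the `size` versioning field, and the example verifier are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: all measurements travel in one struct, so later fields can
 * be appended without changing the callback signature. Fields that are not
 * available (e.g. KSS disabled) are NULL. */
struct ra_tls_measurements {
    size_t size; /* sizeof(struct) as filled in by the caller */
    const char* mrenclave;
    const char* mrsigner;
    const char* isv_prod_id;
    const char* isv_svn;
    /* KSS fields */
    const char* isv_ext_prod_id;
    const char* isv_family_id;
    const char* config_id;
    const char* config_svn;
};

typedef int (*verify_measurements_ex_cb_t)(const struct ra_tls_measurements* m);

/* Placeholder golden value for the example (64-byte CONFIGID). */
static const unsigned char g_expected_config_id[64] = { 0x01 };

/* Example verifier: accepts only the expected CONFIGID. */
int verify_config_id_cb(const struct ra_tls_measurements* m) {
    assert(m != NULL);
    /* a caller built against an older, shorter struct has no config_id */
    if (m->size < sizeof(*m) || m->config_id == NULL)
        return -1;
    return memcmp(m->config_id, g_expected_config_id,
                  sizeof(g_expected_config_id)) == 0 ? 0 : -1;
}
```

The existing four-argument `verify_measurements_cb_t` could stay as is for backwards compatibility, with the struct-based variant used only when the application registers it.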
I can see two possible mitigations for this concern:
KSS (key separation and sharing) provides additional fields to SECS and EGETKEY, which allows more fine-grained control over remote attestation and key derivation. Some examples where it might be useful:
Several places where changes may be required (not a complete list and open for discussion):