Open ajay-fuji opened 6 months ago
Hi Team,
Thanks for this work and for making it open source.
I have a question regarding support for Intel's TDX, AMD's SEV-SNP, and ARM Realm CC in BlindAI. Is there any plan in place to include these new technologies, since SGX is already deprecated in the 11th and 12th generations of Intel's CPUs?
Hi Ajay,
We're glad you like this project. We do not support Intel TDX or ARM Realm CC yet. We do, however, have a project called BlindLlama that uses a minimal OS we developed to securely deploy AI (like a virtual enclave). It uses a TPM to measure and establish trust.
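To make the measurement part concrete, here is a minimal sketch of the TPM PCR "extend" operation that measured boot builds on; the component names are hypothetical stand-ins, not BlindLlama's actual measurement chain:

```python
import hashlib

def pcr_extend(pcr_value: bytes, measurement: bytes) -> bytes:
    """A PCR is never overwritten, only extended: the new value is the
    hash of the old value concatenated with the new measurement,
    forming a tamper-evident hash chain."""
    return hashlib.sha256(pcr_value + measurement).digest()

# A SHA-256 PCR bank starts at boot with the register all zeros.
pcr = bytes(32)

# Each boot stage measures (hashes) the next one and extends the PCR.
# These component names are hypothetical stand-ins for real stages.
for component in [b"firmware", b"bootloader", b"os-image"]:
    pcr = pcr_extend(pcr, hashlib.sha256(component).digest())

# A verifier recomputes this chain from known-good measurements and
# compares it with the PCR value reported in a signed TPM quote.
print(pcr.hex())
```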
We're working on integrating AMD SEV-SNP support into our confidential-AI solution soon.
Hi Ajay,
@ShannonSD covered most of the key points. I'll add a couple more details:
You can find the BlindLlama whitepaper here: BlindLlama Whitepaper. The whitepaper is a good starting point if you want to know more (it does not cover the plan for AMD SEV-SNP support, though). Also, despite the name, the BlindLlama approach can be used to serve not just LLMs but any ML model.
Additionally, regarding BlindLlama, our upcoming AMD SEV-SNP support will include support for Nvidia's Hopper Confidential Computing mode (available on the Nvidia H100). We are also interested in supporting Intel TDX once we have added AMD SEV-SNP support. We are not considering ARM Realm CC at this point.
Thanks for your response. I have a few more questions. Why two projects, BlindAI and BlindLlama, since both serve relatively the same purpose? Is there any future plan to consolidate the two into one? Is TPM supported by all CPUs by default? And what is the future plan for BlindAI in terms of SGX? Will it prevail or be replaced by TDX support?
Hi @clauverjat, thank you for providing details. Do you have any specific reasons for not considering ARM CCA?
Just for reference, as per the Medium article on confidential computing, ARM CCA supports secure DMA, and it is also open source.
Hi Ajay,
Why two projects, BlindAI and BlindLlama, since both serve relatively the same purpose?
That’s a good question. Indeed, both solutions perform inference in trusted execution environments. However, they originated from different constraints and demands. To understand why we decided to start a new project named BlindLlama, one needs to understand what BlindAI is about.

BlindAI at its core is an inference server tailored specifically for SGX. Because SGX is a process-based TEE, it required us (developers) to target its environment specifically, since the trust boundary does not include an OS. The fact that an OS is not included in the TCB is good in terms of security, since it shrinks the TCB by a lot, but it is also what made Intel SGX adoption so challenging. Due to this constraint, BlindAI (Core) comes with some limitations: our inference engine only supports models in ONNX format (see the export sketch below), and not all ONNX operators are supported. Also, in terms of performance, because an SGX enclave cannot interact with GPUs, we are limited to CPU compute. This might not have been a big deal a few years ago (and depending on your use case it might still be okay), but these days, with the trend in ML being towards bigger and bigger models, not being able to use a GPU is a serious drawback.

Contrast that with confidential VMs (CVMs). A CVM enables a lift-and-shift approach where a regular application can be deployed in a confidential VM with minimal to no changes. As such, most of the work to use a CVM is in implementing the required changes in the guest OS. Because of this, it makes more sense to include an existing, well-known inference server instead of designing one from scratch like we did for SGX. But then why limit our users to the inference server that we chose for them? Why not let them use the one they are used to? Why be limited to ONNX models?
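Since the SGX inference engine only consumes ONNX models, here is a minimal sketch of what getting a model into that format looks like. It assumes PyTorch is installed; the tiny model, shapes, and file name are illustrative, not anything BlindAI-specific:

```python
# Export a small PyTorch model to ONNX, the format an SGX-targeted
# inference server consumes.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

dummy_input = torch.randn(1, 4)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
)
# Not every ONNX operator is implemented by an enclave's inference
# engine, so an exported graph may still need operator-support checks.
```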
That’s why we launched a new project: BlindLlama. For this project, the goal was to be able to do LLM inference, and we focused on doing just that. That being said, technically nothing prevents us from taking our existing BlindAI server for SGX, modifying it a little, and using it in a CVM to expose an interface compatible with BlindAI, but we think there is not much value in that.
Is there any future plan to consolidate the two into one?
No, we don’t plan on consolidating the two projects.
Is TPM supported by all CPUs by default?
Originally, a TPM was a separate chip whose purpose is to act as a root of trust for the system. Most modern servers come with a TPM chip, but TPMs now come in several forms. For instance, AMD CPUs provide a firmware TPM (fTPM), which integrates the TPM functionality directly within the CPU. In virtualized environments, the TPM can be provided by the hypervisor (just as the hypervisor provides other virtual devices); in that case it is referred to as a vTPM. BlindLlama supports Azure Trusted Launch VMs, so the TPM in use is a vTPM (see the sketch below for what reading its PCRs can look like). But the TPM approach can be employed in a large number of environments.
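To make the vTPM part concrete, here is a minimal sketch of reading PCR values on a Linux guest. It assumes a kernel that exposes the PCR sysfs interface, and the "expected" digests are hypothetical placeholders; a real verifier would check a signed TPM quote rather than trust local sysfs reads:

```python
# Minimal sketch: reading PCR values from a (v)TPM on a Linux guest.
# Assumes a kernel that exposes the PCR sysfs interface under
# /sys/class/tpm/tpm0/pcr-sha256/.
from pathlib import Path

PCR_DIR = Path("/sys/class/tpm/tpm0/pcr-sha256")

EXPECTED = {
    0: "<golden-firmware-digest>",    # hypothetical reference value
    4: "<golden-bootloader-digest>",  # hypothetical reference value
}

for index, expected in EXPECTED.items():
    actual = (PCR_DIR / str(index)).read_text().strip().lower()
    status = "OK" if actual == expected else "MISMATCH"
    print(f"PCR[{index}] = {actual} -> {status}")
```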
What is the future plan for BlindAI in terms of SGX? Will it prevail or be replaced by TDX support?
First, I want to point out a common misconception. You stated in your first message that "SGX is already deprecated in the 11th and 12th generations of Intel's CPUs". This is misleading. When Intel introduced SGX, they included it in both their desktop and server CPU lines. What Intel did was remove SGX from their desktop CPU line (Intel Core), but they will continue to include Intel SGX in their server CPU line (Intel Xeon). Actually, TDX (which is only being included in server CPUs) relies on SGX for its operation, so TDX implies SGX.
So Intel TDX should very much be viewed as a similar but complementary technology to SGX rather than a replacement. TDX provides isolation at the virtual machine level, whereas SGX offers application-level enclaves. Whether you want one or the other depends on the specifics of your use case; a quick way to see what a given machine exposes is sketched below.
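As a quick illustration (not part of BlindAI or BlindLlama), here is a minimal sketch that checks which TEE-related CPU flags a Linux machine advertises; the flag names are as exposed by recent Linux kernels:

```python
# "sgx" appears on SGX-capable hosts and "tdx_guest" inside a TDX
# confidential VM. This only inspects CPU flags; it is not attestation.
def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("SGX available:      ", "sgx" in flags)
print("Inside a TDX guest: ", "tdx_guest" in flags)
```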
Now, regarding BlindAI, we plan our work based on what our clients want. Recently BlindAI has not received a lot of interest, so we paused development. Of course, if a client had a good use case for it and wanted us to support it, we would resume work on it. For now, though, we are focusing on BlindLlama and CVM technology support. I expect most people to go for the CVM/TPM approach since it is more flexible, but that does not mean there is no room for BlindAI (and Intel SGX). BlindAI will be a better choice if you have an application that demands higher security guarantees.
Hi Darshan,
Thanks for the Medium article, it is interesting.
Do you have any specific reasons for not considering ARM CCA?
I don't think there is anything wrong with ARM CCA, if that's what you are asking. We just need to focus our efforts. Our clients have expressed interest in AMD SEV-SNP and Intel TDX, but they haven't expressed interest in ARM CCA yet. I think that is likely due to the market share of ARM in the server CPU market: most server CPUs are still Intel or AMD (though usage of ARM on the server side is growing). Also (but this is related), my team has not yet tried ARM CCA, whereas we've started work on AMD SEV-SNP support and have experimented with Intel TDX. Before adding ARM CCA to the roadmap, we would need to experiment with it.
That being said, I really like the fact that ARM CCA is open source (this is great in terms of transparency), so we'd be interested in exploring the technology at some point.
Thank you @clauverjat for your detailed updates and plans. We will explore BlindLlama and continue the discussion.