ultravioletrs / cocos

Cocos AI - Confidential Computing System for AI
https://ultraviolet.rs/cocos.html
Apache License 2.0

Feature: Reconstruct our Buildroot image to follow Google Hardened Image #264

Open drasko opened 1 month ago

drasko commented 1 month ago

Is your feature request related to a problem? Please describe.

No

Describe the feature you are requesting, as well as the possible use case(s) for it.

We should reserve RAM for execution only, as it is limited. All other artifacts (Linux files, the algorithm, the Agent and other binaries, downloaded Docker images, PyTorch libraries, etc.) should live on disk.

Google proposes the following: https://cloud.google.com/docs/security/confidential-space

  1. Root FS (probably ext4): OS artifacts on disk, including the Agent. This partition should be immutable (read-only); no one should ever change the Agent or OS files.
  2. Mutable (read-write) disk partition. It must be encrypted, as it will contain downloaded Docker images or algorithm binaries, and potentially datasets. Results can also be written there.
  3. tmpfs: execution must happen here, so that we guarantee it takes place in RAM.
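
The three-way split above could be checked from inside the VM by reading the mount table. The following is a minimal sketch, assuming the standard Linux `/proc/mounts` format. The mount points `/mnt/data` and `/run/exec` are hypothetical names chosen for illustration, and the `/dev/mapper/` prefix is only a rough heuristic for "this is a dm-crypt volume" (LVM devices appear there too); none of these are defined by Cocos.

```python
from typing import Dict, List, NamedTuple


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def parse_mounts(text: str) -> Dict[str, Mount]:
    """Parse /proc/mounts-style text into a dict keyed by mount point."""
    mounts: Dict[str, Mount] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, opts = fields[:4]
        # A later entry for the same mount point replaces the earlier one.
        mounts[mount_point] = Mount(device, mount_point, fs_type, opts.split(","))
    return mounts


def check_layout(text: str,
                 data_dir: str = "/mnt/data",   # hypothetical mutable partition
                 exec_dir: str = "/run/exec"    # hypothetical tmpfs work dir
                 ) -> List[str]:
    """Return a list of problems; an empty list means the layout matches."""
    mounts = parse_mounts(text)
    problems: List[str] = []

    root = mounts.get("/")
    if root is None or "ro" not in root.options:
        problems.append("root filesystem is not mounted read-only")

    data = mounts.get(data_dir)
    if data is None or "rw" not in data.options:
        problems.append(f"{data_dir} is not mounted read-write")
    elif not data.device.startswith("/dev/mapper/"):
        # Heuristic only: assumes the encrypted volume is a device-mapper target.
        problems.append(f"{data_dir} does not look like a device-mapper volume")

    work = mounts.get(exec_dir)
    if work is None or work.fs_type != "tmpfs":
        problems.append(f"{exec_dir} is not a tmpfs mount")

    return problems


SAMPLE_MOUNTS = """\
/dev/vda1 / ext4 ro,relatime 0 0
/dev/mapper/data /mnt/data ext4 rw,relatime 0 0
tmpfs /run/exec tmpfs rw,nosuid,nodev 0 0
"""

if __name__ == "__main__":
    print(check_layout(SAMPLE_MOUNTS))  # prints []
```

On a real system the input would be the contents of `/proc/mounts`. A check like this only confirms how the partitions are mounted; it says nothing about how the encryption key for the mutable partition is provisioned.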

Additional research is needed, but these changes make sense: they optimize RAM usage, further protect the Agent and other artifacts (immutable partition), protect the downloaded algorithm and the result (when the Docker image is kept out of RAM as an optimization, it needs to be written to the encrypted mutable partition), and ensure execution in tmpfs without swap.

Indicate the importance of this feature to you.

Must-have

Anything else?

No response

drasko commented 1 month ago

@danko-miladinovic @SammyOina @rodneyosodo can you please verify this:

[image]

If this is true, and it also holds for Intel TDX, then we do not have to disable swap. Moreover, we should enable swap and use it in order to preserve RAM.
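
Whether swap is actually configured inside the guest can be read from `/proc/meminfo`. The sketch below is a minimal illustration, assuming the usual `SwapTotal:` / `SwapFree:` lines reported in kB; the function name `swap_status` is made up for this example. It only reports whether swap is present and how much is in use; it does not verify that swapped pages are encrypted.

```python
from typing import Dict, Union


def swap_status(meminfo_text: str) -> Dict[str, Union[bool, int]]:
    """Extract swap totals (in kB) from /proc/meminfo-style text."""
    values = {"SwapTotal": 0, "SwapFree": 0}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and key in values and parts:
            values[key] = int(parts[0])  # first token is the size in kB
    total = values["SwapTotal"]
    used = total - values["SwapFree"]
    return {"enabled": total > 0, "total_kb": total, "used_kb": used}


SAMPLE_MEMINFO = """\
MemTotal:        8146508 kB
MemFree:         1203344 kB
SwapTotal:       2097148 kB
SwapFree:        2000000 kB
"""

if __name__ == "__main__":
    print(swap_status(SAMPLE_MEMINFO))
```

On a live system the input would be `open("/proc/meminfo").read()`.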

danko-miladinovic commented 1 month ago

Yes. According to the paper on SEV-SNP formal security analysis, pages are encrypted when they are swapped to disk.

dborovcanin commented 1 month ago

@drasko What are the benefits of this approach? We need to put most of it into RAM to make it run anyway, and adding disk support is quite a significant change that requires disk encryption. @danko-miladinovic I don't think we are talking about process pages, but about everything else that belongs to the system rather than to the process.

drasko commented 1 month ago

There are many benefits. Swap is one example. Another is that the Linux system can live on a read-only, unencrypted partition (no need to put everything in RAM). Also, large data files can be processed one at a time.
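
As an illustration of the last point, files kept on disk can be handled one at a time in fixed-size chunks, so that peak RAM use is bounded by the chunk size rather than by the dataset size. This is a minimal sketch only: the function name `digest_files`, the 1 MiB chunk size, and the use of SHA-256 as the per-file work are all assumptions made for the example, not part of Cocos.

```python
import hashlib
from pathlib import Path
from typing import Iterator, Tuple

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def digest_files(directory: str,
                 chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[str, str]]:
    """Yield (file name, SHA-256 hex digest) for each regular file, in name order.

    Only one chunk of one file is held in memory at any time.
    """
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories and special files
        hasher = hashlib.sha256()
        with path.open("rb") as handle:
            # iter() with a sentinel keeps calling read() until it returns b"".
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                hasher.update(chunk)
        yield path.name, hasher.hexdigest()
```

Because `digest_files` is a generator, a caller can consume one result, release it, and only then move on to the next file.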