ut-osa / ryoan

Ryoan: A distributed sandbox for untrusted computation on secret data

Data-processing services are widely available on the Internet. Individual users can conveniently access them for tasks including image editing (e.g., Pixlr), tax preparation (e.g., TurboTax), data analytics (e.g., SAS OnDemand) and even personal health analysis (e.g., 23andMe). However, user inputs to such services, such as tax documents and health data, are often sensitive, which creates a dilemma for the user. To leverage the convenience and expertise of these services, the user must disclose sensitive data. A user who wants to keep their data secret must either give up the services or hope that they can be trusted, that is, that the service software will not leak data.

Ryoan is a distributed sandbox that forces data-processing services to keep user data secret, without trusting the service's software stack. Ryoan provides a sandbox to confine individual data-processing modules and prevent them from leaking data. It then uses trusted hardware to allow a remote user to verify the integrity of individual sandbox instances and protect their execution.


A key enabling technology for Ryoan is hardware enclave-protected execution (e.g., Intel's Software Guard Extensions (SGX)), a primitive that uses trusted hardware to protect a user-level computation from potentially malicious privileged software.

Ryoan's security goal is simple: prevent leakage of secret data. However, confining services over which the user has no control is challenging without a centralized trusted platform. We make the following contributions:


Ryoan confines a directed acyclic graph of communicating modules. Each module is a piece of application logic that processes user data while managing its own secrets.
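To make the "directed acyclic graph of communicating modules" concrete, here is a minimal sketch in Python. It is not Ryoan's pipeline specification format: the dictionary layout, the module names, and the helper `delivery_order` are all hypothetical and exist only for illustration. The sketch checks that a set of modules really forms a DAG and returns one order in which messages could flow through it.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+


def delivery_order(successors):
    """Return the modules in an order where every sender precedes its receivers.

    `successors` maps a (hypothetical) module name to the modules it sends to.
    Raises CycleError if the modules do not form a directed acyclic graph.
    """
    sorter = TopologicalSorter()
    for module, receivers in successors.items():
        sorter.add(module)  # register modules that have no edges at all
        for receiver in receivers:
            # A receiver can only run after the module that feeds it.
            sorter.add(receiver, module)
    return list(sorter.static_order())


# Hypothetical three-module pipeline (names are illustrative only).
pipeline = {
    "prepare": ["classify"],
    "classify": ["filter"],
    "filter": [],
}
order = delivery_order(pipeline)
print(order)
```

A graph with a cycle, such as `{"a": ["b"], "b": ["a"]}`, is rejected with `CycleError`, which is the property the word "acyclic" asks for.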

A single instance of the Ryoan sandbox:

[Figure: a single Ryoan instance]

Ryoan uses a system of labels to track the data stakeholders of messages as they travel from instance to instance. Messages are encrypted so that only other Ryoan instances can decrypt them. Stakeholders are only allowed to remove their own labels, and output messages keep all un-removed labels from the input, making it possible to delegate computation to modules outside of their control. Ryoan will only send completely unlabeled messages to users, so a provider can output a labeled message that contains provider secrets with the assurance that the response must pass through another of their modules before it can be communicated outside of the distributed sandbox.

An example where 23andMe delegates work to Amazon, then filters the results, making sure they are clean of 23andMe's secrets before sending the final response to the user:

[Figure: example pipeline in which 23andMe delegates work to Amazon]
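The label rules and the delegation example above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not Ryoan's implementation: labels are modeled as plain strings naming a stakeholder, encryption is ignored, and the functions `output_labels` and `may_leave_sandbox` are hypothetical names introduced here.

```python
from typing import FrozenSet, Iterable


def output_labels(input_labels: Iterable[str], module_owner: str,
                  add_own: bool = False, remove_own: bool = False) -> FrozenSet[str]:
    """Labels carried by a module's output message.

    The output keeps every label from the input. The module's owner may add
    its own label (the output now contains the owner's secrets) or remove its
    own label. There is deliberately no way to remove anyone else's label.
    """
    labels = set(input_labels)
    if add_own:
        labels.add(module_owner)
    if remove_own:
        labels.discard(module_owner)
    return frozenset(labels)


def may_leave_sandbox(labels: Iterable[str]) -> bool:
    """Only completely unlabeled messages may be sent to the user."""
    return len(set(labels)) == 0


# 23andMe's first module adds provider secrets, so it labels its output.
after_prepare = output_labels(frozenset(), "23andMe", add_own=True)
# Amazon's module asks to remove a label, but it can only touch its own.
after_amazon = output_labels(after_prepare, "Amazon", remove_own=True)
# 23andMe's filter module scrubs its secrets and removes its own label.
after_filter = output_labels(after_amazon, "23andMe", remove_own=True)

print(may_leave_sandbox(after_amazon), may_leave_sandbox(after_filter))
```

In this model Amazon's output still carries the 23andMe label, so it cannot reach the user until a 23andMe module has processed it, which mirrors the assurance described above.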

Ryoan uses many techniques to confine applications while remaining programmable. For instance:

Please take a look at our publications for more details about the design:

Ryoan is based on Google's Native Client. Native Client (NaCl) is a software sandbox that allows Ryoan to confine untrusted code. While NaCl is usually connected to the Chrome browser, it has been modified here to run as a standalone process.

Ryoan is designed to run in SGX, but this prototype does not. Key features of Ryoan, such as checkpoint restore, depend on SGX version 2 capabilities, which have yet to be released. To support execution in SGX, Ryoan links NaCl to a modified version of eglibc, which has been augmented here with marshalling code for all system calls (take a look at eglibc/sgx_syscall_interpos).

Project Structure

Prerequisites

Configure

Run J=${Jobs} ./bootstrap.sh in the root directory. This unpacks ryoan_env.tar.xz and recursively configures the other directories. It also builds eglibc, which is required to configure other parts of Ryoan. The environment variable J is passed directly to the make call for eglibc.

Build

Each piece of Ryoan has a Makefile that will do the right thing after the project is configured. Run the make commands in the following order, because some pieces expect to find headers generated by others.

  1. cd native_client && make J=${n_jobs} (make calls scons here, so -j will do nothing).
  2. cd naclports && make -j${n_jobs}
  3. cd apps && make -j${n_jobs}
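The build order above can be captured in a small driver. The following Python sketch is not part of the repository: `build_commands` and `run_build` are hypothetical helpers, and the sketch assumes that each `cd` in the steps is relative to the repository root. It defaults to a dry run that only prints what would be executed.

```python
import subprocess
from pathlib import Path


def build_commands(n_jobs):
    """Return (directory, argv) pairs in the order the steps must run."""
    jobs = str(n_jobs)
    return [
        # native_client: make calls scons, so parallelism is passed as J=<n>.
        ("native_client", ["make", f"J={jobs}"]),
        ("naclports", ["make", f"-j{jobs}"]),
        ("apps", ["make", f"-j{jobs}"]),
    ]


def run_build(root, n_jobs, dry_run=True):
    """Print each step and, unless dry_run is set, run it.

    check=True stops at the first failing step, since later pieces expect
    headers generated by earlier ones.
    """
    for directory, argv in build_commands(n_jobs):
        cwd = Path(root) / directory
        print(f"(cd {cwd} && {' '.join(argv)})")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run from the repository root with 4 jobs; nothing is executed.
run_build(".", 4)
```

Passing `dry_run=False` runs the three steps for real, after `./bootstrap.sh` has configured the project.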

Run

Running Ryoan requires a client program, a server program, and a pipeline specification that details which modules should be loaded. These can all be found in /apps. We provide scripts named run_*_benchmark.sh that demonstrate how pipelines should be run and interacted with.