container ENV variables passing and parsing

YangLiang3 commented 1 year ago

For complicated workloads, the tenant sets specific ENV variables in the config file. For the workload to run successfully in the LibOS, these ENV variables must be passed through. The shim and enclave-agent will work together to parse and combine the environment variables and eventually pass them to the app enclave.

qzheng527 commented 1 year ago

From the LibOS perspective (Occlum, and perhaps Gramine as well), the information below needs to be parsed first (perhaps some component is required here?) and then passed to the application agent (by LA channel?) for workloads to run successfully:

Workload absolute entry point path. For example, /usr/bin/tensorflow_model_server for TensorFlow Serving.
Environment values defined in config files, such as k8s YAML files.
Environment values predefined in the workload container's Dockerfile.
Extra link paths, perhaps defined by LD_LIBRARY_PATH.

mkbhanda commented 1 year ago

Environment variables can be viewed and set by any root process (or one with sudo access) on the bare-metal or VM node in the Kubernetes cluster. A variable could even be changed between the container launch request and the moment it is read to create the application enclave, or after launch. This is a security concern when we are trying to keep the CSP outside of the trusted compute base. Further, this pattern defeats approaches where all config information is kept in a git repo to eliminate human errors by administrators/devops. Using config maps at least means the config could be signed, and possibly even encrypted, by the end user, and it leaves a record of what was used.

See https://gramine.readthedocs.io/en/stable/manifest-syntax.html

With Gramine, to use command-line arguments, the application manifest provider needs to set one of the following:

loader.insecure__use_cmdline_argv = true
loader.argv = ["arg0", "arg1", "arg2", ...]
loader.argv_src_file = "file:file_with_serialized_argv" (an encrypted file is not yet supported)

mkbhanda commented 1 year ago

Oct 20, 2022 enclave-cc meeting: Occlum allows environment variables to be passed in with no restrictions on name or number, and it cannot guarantee that they are what they should be.

jiazhang0 commented 1 year ago

The traffic between kubectl and Pod is still untrusted because k8s infra still doesn't provide peer-to-peer trustworthy protection for CoCo.

In Kata-CC, the construction and consumption of ENV always happen in the TEE Pod. This is what Enclave-CC can currently inherit from Kata-CC. For example, the OCI spec (config.json) for the app container could be constructed by enclave-agent, and the ENV, along with the FUSE key, can be transferred via the LA channel established between the app enclave and the agent enclave. The resulting config.json will be used by the runc hosting the app container, but it is not trusted. The LibOS running inside the app enclave only accepts the ENV retrieved from the established LA channel.

mythi commented 1 year ago

LibOS running inside app enclave only accepts the ENV retrieved from the established LA channel.

is this referring to ENV variables defined in the image?

jiazhang0 commented 1 year ago

LibOS running inside app enclave only accepts the ENV retrieved from the established LA channel.

is this referring to ENV variables defined in the image?

At this point, the ENV variables defined by pod yaml and the ones defined in dockerfile/image are all combined together already.
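As a rough illustration of the combining step described above, here is a minimal sketch. The thread does not state a precedence rule; this sketch assumes the usual Kubernetes behavior, where a value set in the pod YAML overrides the same key from the image. The function name `merge_env` and the sample values are hypothetical.

```python
def merge_env(image_env, pod_env):
    """Merge two lists of "KEY=VALUE" strings.

    Assumption (not stated in the thread): pod YAML entries override
    image/Dockerfile entries that share the same key.
    """
    merged = {}
    for entry in list(image_env) + list(pod_env):
        key, sep, value = entry.partition("=")
        if not sep or not key:
            continue  # skip malformed entries that have no "KEY=" part
        merged[key] = value  # later (pod) entries replace earlier ones
    return [f"{key}={value}" for key, value in merged.items()]


# Hypothetical sample data.
image_env = ["PATH=/usr/local/bin:/usr/bin", "REDIS_VERSION=7.0"]
pod_env = ["REDIS_VERSION=7.2", "LOG_LEVEL=debug"]
merged = merge_env(image_env, pod_env)
```

Keys keep the position of their first appearance, so image-defined variables stay in front and pod-only variables are appended.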

mythi commented 1 year ago

This is a good issue to follow: https://github.com/confidential-containers/confidential-containers/issues/124

mkbhanda commented 1 year ago

In the enclave-cc meeting on Nov 3, 2022, the main use case mentioned was actions such as enabling/disabling the image signature check, for debug/prototyping efforts. Would it make sense to support a debug flag to kubectl to pass through environment variables?

jiazhang0 commented 1 year ago

This is a good issue to follow: confidential-containers/confidential-containers#124

This issue addresses the integrity of signature verification flag using measured and attested kernel command. However, it is not a general approach.

In the long term, the CoCo and K8s ecosystem needs to provide a general solution for securing ENV variables so that both enclave-cc and kata-cc can reuse ENV variables to control configurable settings. In the short term, the ENV protection problem is triggered only by the question of how to securely and dynamically configure the signature verification flag. So we have several enclave-cc-specific options:

make it available (and all other ENV variables) only under debug mode: https://github.com/confidential-containers/enclave-cc/issues/46#issuecomment-1301650649
handle ENV variables in agent-enclave: https://github.com/confidential-containers/enclave-cc/issues/46#issuecomment-1293058401
consider the approach of "Split host/tenant APIs" proposed by IBM

haosanzi commented 1 year ago

After discussing with hairong, Huaiqing, and haokun: as a short-term goal, maybe we can use the following method to support complex applications such as Redis.

User information (env, entrypoint, cwd, etc.) configured by users in the K8s YAML can be passed to the enclave agent by the shim. (This is not safe; in the long term, the CoCo and K8s ecosystem needs to provide a general solution for securing ENV variables for both enclave-cc and kata-cc.)

The specific implementation is as follows:

Add a SyncConfig ttrpc interface between the shim and the agent (step 6.1). Its function is for the shim to send the OCI spec of the app enclave to the agent enclave.
1) The agent enclave merges the env obtained by image-rs with the received OCI spec.
2) The agent enclave expands the entrypoint into an absolute path (Occlum and Gramine need absolute paths).
3) The agent enclave sends env, entrypoint, FUSE key, upper dir, lower dir, and other information to the Occlum LibOS init or the Gramine payload receiver through the LA channel.
Occlum LibOS init or the Gramine payload receiver mounts the FUSE file system and runs the app container.
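The entrypoint expansion in the steps above could look roughly like the following sketch. It is not the enclave-agent implementation; it assumes the agent can see the unpacked image rootfs as a directory and resolves a bare command name by walking PATH inside that rootfs. `resolve_entrypoint`, the default PATH, and the temporary rootfs are all hypothetical.

```python
import os
import tempfile

# Assumed fallback when the merged env carries no PATH entry.
DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"


def resolve_entrypoint(rootfs, entrypoint, env, cwd="/"):
    """Return the absolute in-container path for an entrypoint."""
    if entrypoint.startswith("/"):
        return entrypoint  # already absolute
    if "/" in entrypoint:
        # A relative path such as "./app" is resolved against cwd.
        return os.path.normpath(os.path.join(cwd, entrypoint))
    env_map = dict(entry.partition("=")[::2] for entry in env)
    for directory in env_map.get("PATH", DEFAULT_PATH).split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, entrypoint)  # path inside the container
        on_disk = os.path.join(rootfs, candidate.lstrip("/"))  # path in the unpacked image
        if os.path.isfile(on_disk) and os.access(on_disk, os.X_OK):
            return candidate
    raise FileNotFoundError(f"{entrypoint} not found on PATH inside {rootfs}")


# Demo with a throwaway rootfs containing /bin/hello_world.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "bin"))
    exe = os.path.join(rootfs, "bin", "hello_world")
    with open(exe, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(exe, 0o755)
    resolved = resolve_entrypoint(rootfs, "hello_world", ["PATH=/usr/bin:/bin"])
```

The sketch ignores symlinks that point outside the rootfs; a real implementation would need to handle those carefully.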

mythi commented 1 year ago

User information (env, entrypont, cwd etc) configured by users in K8S YAML can be passed to enclave agent by shim. (This is not safe,

Is there a reason this needs to go through enclave agent? Could the shim just pass the final config.json prepared by containerd through a hostfs mount to the boot instance.

hairongchen commented 1 year ago

efforts needed:

[ ] shim change
[ ] agent change
[ ] LA support
[ ] occlum boot instance change

hairongchen commented 1 year ago

User information (env, entrypont, cwd etc) configured by users in K8S YAML can be passed to enclave agent by shim. (This is not safe,

Is there a reason this needs to go through enclave agent? Could the shim just pass the final config.json prepared by containerd through a hostfs mount to the boot instance.

One idea: if a secure channel is established in the k8s control plane, then the information can flow directly into the agent with this design. Another idea: even if the k8s control plane is not secured, we can always ask the agent to check the integrity of the value against a future RVPS implementation (https://github.com/confidential-containers/confidential-containers/issues/122) before sending it to the app enclave.
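To make the second idea concrete, here is a minimal sketch of an integrity check the agent could run before forwarding env to the app enclave. RVPS is not implemented in this thread and its API is not shown, so the reference value here is just a SHA-256 digest passed in as a parameter; `env_digest` and `env_matches_reference` are hypothetical names.

```python
import hashlib
import hmac
import json


def env_digest(env):
    """Digest of an env list, independent of the order of entries."""
    # JSON encoding avoids ambiguity if a value contains a newline.
    canonical = json.dumps(sorted(env), separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def env_matches_reference(env, reference_digest):
    """Compare against a reference digest (assumed to come from a trusted source)."""
    return hmac.compare_digest(env_digest(env), reference_digest)


# Hypothetical reference value computed by the tenant ahead of time.
reference = env_digest(["REDIS_PORT=6379", "LOG_LEVEL=info"])
ok = env_matches_reference(["LOG_LEVEL=info", "REDIS_PORT=6379"], reference)
tampered = env_matches_reference(["LOG_LEVEL=debug", "REDIS_PORT=6379"], reference)
```

A digest only detects modification; it does nothing for confidentiality of the values while they pass through the untrusted control plane.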

mythi commented 1 year ago

before we start any implementation work I think we should explore the option of not going via enclave-agent.

hairongchen commented 1 year ago

There's also a proposal with a wider scope discussed at https://github.com/confidential-containers/confidential-containers/issues/126. After studying the issue, I have the following comments:

The proposal is a promising way to solve the issue of passing end-user data/config into the enclave via untrusted components (e.g. k8s/containerd/shim).
The end-user data/config should be validated by the agent before use, hence it should reach the agent before further processing.
The current env passing design is compatible with the proposal's design and can evolve toward the proposal's future implementation.

let me know your thoughts @haosanzi @mythi

haosanzi commented 1 year ago

User information (env, entrypont, cwd etc) configured by users in K8S YAML can be passed to enclave agent by shim. (This is not safe,

Is there a reason this needs to go through enclave agent? Could the shim just pass the final config.json prepared by containerd through a hostfs mount to the boot instance.

Hi Mikko. The reasons we go through the enclave agent are as follows:

Entrypoint: the user can set a relative-path entrypoint in the YAML file, such as (helloworld), but Occlum and Gramine only understand absolute paths (/bin/hello_world). So we need a module to convert relative paths into absolute paths. This module must know the content of the application image. At present, only the enclave-agent inside the enclave can do that; the shim on the host cannot.
FUSE key: the shim has no ability to obtain it.
Env: the shim can obtain all the env information, which is much more than what enclave-agent obtains by pulling the image. But the shim can only put the env information in config.json, which is useless for applications started by Occlum, because Occlum requires env to be predefined in Occlum.json; otherwise those env cannot be passed into the LibOS (for security). The shim does not control the Occlum.json of the application enclave. After discussing with the Occlum team, having enclave-agent pass the env to the Occlum init LibOS through LA may be the solution.
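The filtering behavior described in the env point (variables not predefined in Occlum.json are skipped) can be pictured with the sketch below. This is not Occlum's code; `allowed_names` stands in for whatever set of names the LibOS derives from Occlum.json, and `filter_env` is a hypothetical helper.

```python
def filter_env(requested, allowed_names):
    """Split "KEY=VALUE" entries into accepted and skipped lists.

    An entry is accepted only if its name is in allowed_names, which is an
    assumed stand-in for the names predefined in Occlum.json.
    """
    accepted, skipped = [], []
    for entry in requested:
        name, sep, _value = entry.partition("=")
        if sep and name in allowed_names:
            accepted.append(entry)
        else:
            skipped.append(entry)
    return accepted, skipped


# Hypothetical values: only REDIS_PORT was predefined.
accepted, skipped = filter_env(
    ["REDIS_PORT=6379", "LD_PRELOAD=/tmp/evil.so", "malformed"],
    {"REDIS_PORT"},
)
```

This is why putting env only into config.json is not enough: anything outside the predefined set never reaches the application.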

Thank you very much!

mythi commented 1 year ago

hi,mikko. The reason why we go through enclave agent as followed:

"runtime-boot" (or init) has the access to the same info as enclave-agent (config.json essentially) so I was wondering if you found a reason why that would not work.

haosanzi commented 1 year ago

hi, mikko.

shim in enclave-cc generates the config.json of the app enclave. config.json contains all the env of the app enclave. Runc/runc can pass the env to the occlum pal layer, but the occlum pal runs on the host in the untrusted zone.

In occlum libos, it is actually passed in through the following interface. In libos, for safety, it will further check whether these environment variables are pre-defined in the Occlum.json file, if not, skip it.

In runtime boot, init needs to obtain information ( env, entrypoint, fuse key) before mounting the encrypted fuse file system.

How does occlum init get this env information? I can think of two options:

Option 1: the shim passes it directly to init (e.g. via a socket or file pass-through).
Option 2: the shim passes it to enclave-agent, and enclave-agent then passes it to init.

We chose option 2 because enclave-agent needs to send the FUSE key, entrypoint, and other information to init through LA anyway, and this channel can be reused to transfer the envs.

If we choose option 1:

Socket: the shim and init need to establish a socket channel, which is more complicated to implement.
File pass-through: occlum init gets the env by reading a file. This can work, but Occlum does not recommend it.

From the previous discussion, rootfs_key, rootfs_entry, ... are passed from enclave-agent to init through LA. (Let me know if we can agree on this?) By extension, enclave-agent can pass all the parameters needed to start the app enclave, including env, to init over LA. This implementation is more flexible and matches the current approach better.

Do you have any other good ways? thanks very much!

mythi commented 1 year ago

files pass-through way: occlum init gets env by reading the file. This way can work, but occlum does not recommend this method.

is "occlum init" same as our runtime-boot?

haosanzi commented 1 year ago

files pass-through way: occlum init gets env by reading the file. This way can work, but occlum does not recommend this method.

is "occlum init" same as our runtime-boot?

Yes. Sorry for my unclear statement, "occlum init" refers to the init process of occlum in the runtime-boot phase.

In runtime-boot, an init process is started first, and then the actual application process is started, such as the app enclave container in enclave-cc.

Occlum's default init process just checks the integrity of the RootFS where the actual application is located, then mounts the RootFS and starts the actual application. For runtime-boot, the init process implementation needs to be modified: when init starts, it uses "rootfs_key" to decrypt the RootFS and load it. (Occlum recommends obtaining rootfs_key, env, entrypoint, and other information through LA at this stage.) Finally, the app enclave application starts.

Thank you!

qzheng527 commented 1 year ago

The runtime env is already supported. The user can pass the struct user_rootfs_config to the LibOS to mount an encrypted application image.

For details, please refer to https://github.com/qzheng527/occlum/tree/enclave-cc/demos/runtime_boot. The remaining part is how to get all the information (user_rootfs_config) (through LA?).

mythi commented 1 year ago

@haosanzi

(occlum recommends obtaining rootfs_key, env, entrypoint and other information through LA at this stage)

@qzheng527

The left part is how to get all the information (user_rootfs_config ) . (through LA?)

We won't have the LA functionality anytime soon, and the SyncConfig protocol is also missing, so I was wondering whether we could start with something simple and make config.json available to init to parse and fill in user_rootfs_config.
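For the simple approach suggested here, a sketch of what init-side parsing might look like is shown below. It reads the standard OCI runtime spec fields `process.args`, `process.env`, and `process.cwd` from config.json. The returned dictionary is a hypothetical intermediate form, not the actual layout of user_rootfs_config, and as noted earlier in the thread the config.json content is untrusted.

```python
import json


def launch_info_from_oci_config(config_text):
    """Extract entrypoint, args, env, and cwd from an OCI config.json string."""
    config = json.loads(config_text)
    process = config.get("process", {})
    args = process.get("args", [])
    return {
        "entrypoint": args[0] if args else None,  # may still be a relative name
        "args": args[1:],
        "env": process.get("env", []),
        "cwd": process.get("cwd", "/"),
    }


# Hypothetical config.json fragment for a Redis container.
sample = json.dumps({
    "ociVersion": "1.0.2",
    "process": {
        "args": ["redis-server", "--port", "6379"],
        "env": ["PATH=/usr/local/bin:/usr/bin", "REDIS_VERSION=7.0"],
        "cwd": "/data",
    },
})
info = launch_info_from_oci_config(sample)
```

The entrypoint taken from `args[0]` can be a bare command name, so it would still need to be expanded to an absolute path before being handed to the LibOS.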

qzheng527 commented 1 year ago

I agree. With the env information, we can try booting a complicated container image, such as Redis.

hairongchen commented 1 year ago

updates after community meeting:

after LA is ready, review this design
targeting V0.4

mythi commented 1 year ago

@qzheng527 a question about:

* **rootfs_entry**
The entry point of the rootfs. In this case, it is `"/bin"`.

how do I use this value? Can I set it to "/" to cover paths like /usr/bin too or what's the purpose of this setting?

confidential-containers / enclave-cc

container ENV variables passing and parsing #46