Closed huoqifeng closed 5 days ago
@mkulke @stevenhorsman @liudalibj @bpradipt This is still in draft, I need some time to verify the code. The idea is to generate the agent-config.toml similar to cdh.toml, as described in https://github.com/confidential-containers/cloud-api-adaptor/issues/1852
Looks good in general. However, we were discussing in Slack that we probably don't need to template the agent-config file anymore once we have a config file for AA, since all the fields but `AAKBCParams` (which we wouldn't need anymore) are static and can be set at build time. What do you think?
@mkulke because `AAKBCParams` is a dynamic endpoint, which might change per cluster. That means we'd need a PeerPod VM image per cluster if we bake it into the PeerPod VM image. Shall we make `AAKBCParams` configurable in the configmap?
`AAKBCParams` is already a parameter in the configmap (and a CLI param for the CAA binary), so we don't need to do anything here.

My reasoning was:
* If we have a CDH and an AA config file, we don't need `AAKBCParams` in the agent config anymore, because it's not used.
* If we don't have an `AAKBCParams` in the agent config file, it is completely static and the same in each deployment.
* If the file is static, we don't need a cloud-config directive for it but can include it at build time.
@mkulke that is reasonable to me; I like this approach. How do the CDH and AA use `AAKBCParams` when it is set in the k8s configmap? Are some changes happening in guest-components?
We already do that for `cdh.toml`. For AA, at the moment we use `agent-config.toml`, which was a workaround for when there was no AA config file yet. Now AA supports a config file, and we also want to use it because the `${kbc}::${uri}` schema of `AAKBCParams` is not sufficient anymore: we also need to include a KBS certificate in the config.
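For illustration, such an AA config file might look roughly like this (a sketch only; the shape of `token_configs` is taken from the provisioning discussion in this thread, while the `cert` field name and the URLs are assumptions):

```toml
[token_configs]

[token_configs.coco_as]
url = ''

[token_configs.kbs]
url = 'https://my-kbs.example:8080'
# assumed field: the KBS certificate that the ${kbc}::${uri} schema cannot carry
cert = """
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
"""
```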
OK, my understanding is we need to generate an `attestation-agent.toml` rather than a `kata-agent.toml` (named `agent-config.toml` today) on the cloud-api-adaptor side.
yup, exactly
note: there is a nuisance when using an aa.toml file for CAA currently: an AA config file is only valid if it has a `token_config` entry, which only makes sense for `cc_kbc` at the moment. For `offline_fs_kbc` you cannot specify `--config`, which means we need some machinery to do that:

/usr/bin/attestation-agent ${AA_CONFIG_FILE+--config "$AA_CONFIG_FILE"}
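This `${VAR+...}` parameter expansion only emits the `--config` flag when the variable is set. A minimal sketch of the machinery (the command line is just echoed here, not executed; `build_cmdline` is a hypothetical helper, and note the expansion must stay unquoted so the flag and the path split into separate arguments):

```shell
#!/bin/sh
# build_cmdline assembles a hypothetical AA invocation; the ${VAR+word}
# expansion yields nothing when AA_CONFIG_FILE is unset, and the two
# arguments --config <path> when it is set.
build_cmdline() {
  # shellcheck disable=SC2086
  echo attestation-agent ${AA_CONFIG_FILE+--config "$AA_CONFIG_FILE"}
}

unset AA_CONFIG_FILE
build_cmdline    # prints: attestation-agent

AA_CONFIG_FILE=/run/peerpod/aa.toml
build_cmdline    # prints: attestation-agent --config /run/peerpod/aa.toml
```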
@mkulke I found we still need to set `cc_kbc::http://IP:Port` in the file /etc/agent-config.toml because confidential-data-hub is still using it. If `aa_kbc_params = "offline_fs_kbc::null"` is set in /etc/agent-config.toml, cdh reports an error like:
Jun 20 07:52:24 podvm-nginx-nydus-54c6f8c6cc-nm6td-e21b40ee confidential-data-hub[2825]: [2024-06-20T07:52:24Z ERROR ttrpc_cdh::ttrpc_server] [ttRPC CDH] GetResource :
Jun 20 07:52:24 podvm-nginx-nydus-54c6f8c6cc-nm6td-e21b40ee confidential-data-hub[2825]: get resource failed
Jun 20 07:52:24 podvm-nginx-nydus-54c6f8c6cc-nm6td-e21b40ee confidential-data-hub[2825]: Caused by: Kbs client error: offline-fs-kbc: resource not found one/two/key
While it's OK if the file /etc/agent-config.toml does not exist at all. Is there anything we have not completed yet in the confidential-data-hub configuration?
Sounds like CDH lost a function: it should never use /etc/agent-config if a config file is provided via `-c`, like `/usr/local/bin/confidential-data-hub -c /run/confidential-containers/cdh.toml`.
Does CDH use the config file? In CAA we've been using the config file for CDH for a while now, and it shouldn't rely on agent-config.toml anymore. This is the respective logic in CDH.
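For reference, a minimal `cdh.toml` along those lines might look like this (a sketch only; the `socket` and `[kbc]` field names are assumptions based on my reading of the CDH config code, not verified against the parser):

```toml
socket = "unix:///run/confidential-containers/cdh.sock"

[kbc]
name = "cc_kbc"
url = "http://1.2.3.4:8080"
```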
where do you see that CDH is falling back to kata-agent-config?
I did not find the code yet; it's confusing because I saw CDH logs like this, and the function does not work if /etc/agent-config.toml exists.
[2024-06-20T09:08:13Z INFO ttrpc_cdh] Use configuration file /run/confidential-containers/cdh.toml
[2024-06-20T09:08:13Z DEBUG attestation_agent::config::aa_kbc_params] get aa_kbc_params from file
[2024-06-20T09:08:13Z DEBUG attestation_agent::config::aa_kbc_params] reading agent config from /etc/agent-config.toml
[2024-06-20T09:08:13Z DEBUG attestation_agent::config::aa_kbc_params] get aa_kbc_params from file
[2024-06-20T09:08:13Z DEBUG attestation_agent::config::aa_kbc_params] reading agent config from /etc/agent-config.toml
[2024-06-20T09:08:13Z INFO ttrpc_cdh] [ttRPC] Confidential Data Hub starts to listen to request: unix:///run/confidential-containers/cdh.sock
Sounds like https://github.com/confidential-containers/guest-components/blob/main/confidential-data-hub/hub/src/bin/config/mod.rs#L130 https://github.com/confidential-containers/guest-components/blob/main/attestation-agent/attestation-agent/src/config/aa_kbc_params.rs#L57
Thanks for searching, I didn't spot it myself. I suspect this function is dead code, but I'm not 100% sure.

This fn sets the AA_KBC_PARAMS + KBS_CERT envs as a side effect and does nothing else:

pub fn set_configuration_envs(&self) {...}
I suspected that image-rs was using it for encrypted images, but it doesn't seem to be used in guest-components:
/dev/guest-components (main)$ git grep AA_KBC_PARAMS
attestation-agent/attestation-agent/src/config/aa_kbc_params.rs: if let Ok(params) = env::var("AA_KBC_PARAMS") {
confidential-data-hub/README.md:* **AA_KBC_PARAMS** environment variable
confidential-data-hub/hub/src/bin/config/mod.rs: "AA_KBC_PARAMS",
Note: the `AA_KBC_PARAMS` env does seem to be required in the constructor of CDH's kbs kms plugin.
Yeah, I think it comes from the CDH config if agent-config.toml does not provide it: https://github.com/confidential-containers/guest-components/blob/main/confidential-data-hub/hub/src/bin/config/mod.rs#L133-L136. That's why I dropped it in this commit: https://github.com/confidential-containers/cloud-api-adaptor/pull/1868/commits/fc0a04ef34bf93cee1fcd06a5625cbe684174967. @mkulke
I tried it on s390x by building the image as below:
cd src/cloud-api-adaptor
ARCHES=linux/s390x make image
cd podvm-mkosi
make fedora-binaries-builder
ATTESTER=se-attester make binaries
make image-debug
# kubectl exec -it nginx-nydus-54c6f8c6cc-xzxzg -- /bin/sh
#
#
#
# curl http://127.0.0.1:8006/cdh/resource/one/two/key
some random characters....
Sat Jun 15 03:06:01 UTC 2024
# exit
@mkulke @stevenhorsman I think this PR can be reviewed now as I have tried it. Could you please have a look?
Oh, I think CDH's kbs kms plugin is indeed a case not covered. We need a fix on the CDH side. I created an issue in CDH to track this: https://github.com/confidential-containers/guest-components/issues/593. It sounds like it's not a big problem so far, as long as we don't use KBS/KMS separately.
We should run `export HOST_KEYS_DIR=<the_hkd_folder>` before running `make image`. Or, before running `make image-debug`, we should update the Makefile to call `../hack/build-s390x-se-image.sh` for image-debug, so that we can get the SE-enabled podvm image.
process-user-data.service.d
My fault, `update-agent-config` is not used anymore because `agent-config.toml` is static, and `aa.toml`, `cdh.toml` can be handled by `provision-files`. Dropped a fix.
this is what we provision when we specify `--aa-kbc-params offline_fs_kbc::null`:

$ cat /run/peerpod/aa.toml
[token_configs]
[token_configs.coco_as]
url = ''
[token_configs.kbs]
url = 'null'

not sure if that makes sense; it probably doesn't, but it might also not hurt at the moment. I think what we could do is just provision that file if `aa-kbc-params` is `cc_kbc::*` from the CAA side, because the config file doesn't cover anything else yet.
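The gating described above could be sketched in shell like this (`write_aa_config` is a hypothetical placeholder for whatever provisioning step CAA actually performs; only the prefix match on the kbc scheme is the point here):

```shell
#!/bin/sh
# Sketch: only provision aa.toml when aa-kbc-params selects cc_kbc.
write_aa_config() {
  echo "provisioning aa.toml for $1"
}

provision_aa() {
  case "$1" in
    cc_kbc::*) write_aa_config "$1" ;;
    *)         echo "skipping aa.toml for $1" ;;
  esac
}

provision_aa "cc_kbc::http://1.2.3.4:8080"   # prints: provisioning aa.toml for cc_kbc::http://1.2.3.4:8080
provision_aa "offline_fs_kbc::null"          # prints: skipping aa.toml for offline_fs_kbc::null
```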
Indeed, aa.toml. Not sure whether cdh.toml should follow the same pattern?
With the fix in https://github.com/confidential-containers/guest-components/pull/599, I think we can leave the code as is. I will add a log for s.aaKBCParams and pick the new guest-components commit df60725afe0ba452a25a740cf460c2855442c49a.
I tried to build a podvm ubuntu amd64 image from this PR:
quay.io/liudalibj/podvm-generic-ubuntu-amd64:61a7406743c9ad6915b7dfc1fd3217cb5e581235b8791d791aaf9f669255fd52
make podvm-builder
make podvm-binaries
make podvm-image
With the built podvm-image I can create a peerpod successfully. But when I try to use this podvm-image with the sample KBS, `cc_kbc::http://ip:8080`, the attestation-agent.service fails to start in the created peerpod VM. The related log from journalctl:
...
Jun 25 08:40:28 ubuntu systemd[1]: Installed transient /etc/machine-id file.
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found ordering cycle on attestation-agent.service/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found dependency on cloud-final.service/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found dependency on multi-user.target/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Job attestation-agent.service/start deleted to break ordering cycle starting with multi-user.target/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found ordering cycle on kata-agent.service/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found dependency on cloud-final.service/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Found dependency on multi-user.target/start
Jun 25 08:40:28 ubuntu systemd[1]: multi-user.target: Job kata-agent.service/start deleted to break ordering cycle starting with multi-user.target/start
Jun 25 08:40:28 ubuntu systemd[1]: Created slice system-modprobe.slice.
...
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd-timesyncd[561]: Network configuration changed, trying to establish connection.
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd[1]: multi-user.target: Found ordering cycle on kata-agent.service/start
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd[1]: multi-user.target: Found dependency on cloud-final.service/start
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd[1]: multi-user.target: Found dependency on multi-user.target/start
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd[1]: multi-user.target: Job kata-agent.service/start deleted to break ordering cycle starting with multi-user.target/start
Jun 25 08:40:33 podvm-nginx-nydus-54c6f8c6cc-dz74n-6f4c65e4 systemd[1]: agent-protocol-forwarder.service: Scheduled restart job, restart counter is at 1.
...
@huoqifeng it seems that there is a cyclic dependency between multi-user.target, attestation-agent.service, and kata-agent.service. Maybe we need to do a fix; the e2e test case TestLibvirtKbsKeyRelease is failing because of this.
Oh, I only tested mkosi on x86_64 (i.e. without cloud-config). So assuming you tested packer + cloud-config and that worked, we probably should look into the delta of packer vs mkosi + cloud-config.
Verified: replacing cloud-final.service with cloud-init.service fixes the cyclic dependency issue. @huoqifeng @mkulke FYI.

With this replacement fix, packer ubuntu amd64 peerpod + cloud-init + libvirt + sample kbc works fine.
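In unit-file terms, the replacement amounts to something like this hypothetical drop-in (a sketch; the exact unit and dependency names in the podvm image may differ):

```ini
# attestation-agent.service drop-in (sketch)
[Unit]
# cloud-final.service is itself ordered after multi-user.target, so ordering
# a multi-user.target service after it created the cycle; cloud-init.service
# completes early enough in boot to break that cycle.
After=cloud-init.service
```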
@mkulke not sure whether it's worth another try on your side with the latest fix (hopefully the last :-) ): https://github.com/confidential-containers/cloud-api-adaptor/pull/1868/commits/1fbc63646559834565be06db6c098f7270dcdf73.
Fixes: #1852
Renamed `agent` to `aa` to reflect that it's for the attestation agent config. The certs field will be added to the aa config in https://github.com/confidential-containers/cloud-api-adaptor/pull/1875 after we have tried it out.
Signed-off-by: Qi Feng Huo huoqif@cn.ibm.com