secretflow / kuscia

Kuscia (Kubernetes-based Secure Collaborative InfrA) is a K8s-based privacy-preserving computing task orchestration framework.
https://www.secretflow.org.cn/docs/kuscia/latest/zh-Hans
Apache License 2.0

Docker Kuscia installation problem #392

Open wangzeyu135798 opened 1 month ago

wangzeyu135798 commented 1 month ago

Issue Type

Api Usage

Search for existing issues similar to yours

Yes

Kuscia Version

kuscia 0.10

Link to Relevant Documentation

No response

Question Details

[root@cm-dssn-node1 secu]# docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "JD230607033" > autonomy_JD230607033.yaml 2>&1 || autonomy_JD230607033.yaml
bash: autonomy_JD230607033.yaml: command not found
[root@cm-dssn-node1 secu]# [root@cm-dssn-node1 secu]# cat autonomy_JD230607033.yaml
bash: [root@cm-dssn-node1: command not found
[root@cm-dssn-node1 secu]# Invalid config, err: invalid domain: must conform to a regular expression `^[a-z0-9]([a-z0-9.-]{0,61}[a-z0-9])?$`
I want to install Docker Kuscia with the domainID JD230607033, but I get the error shown above. Why does this happen, and how can I fix it?

BrainWH commented 1 month ago

Hello, I have seen your error message. The domain ID fails the regular-expression check. The fix is to use a lowercase domain ID:
docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "jd230607033" > autonomy_JD230607033.yaml

wangzeyu135798 commented 1 month ago

Why does this regular-expression problem occur?

BrainWH commented 1 month ago

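Hello. The error occurs because the input is validated to make sure it follows a specific format. The regular expression currently used for domainID does not match uppercase letters. See the regular expression in the error message:
Invalid config, err: invalid domain: must conform to a regular expression ^[a-z0-9]([a-z0-9.-]{0,61}[a-z0-9])?$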
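
The check described above can be reproduced locally before calling kuscia init. The following is a minimal sketch, assuming a POSIX shell with grep and tr; the pattern is copied from the error message, while the helper names is_valid_domain_id and normalize_domain_id are hypothetical and not part of Kuscia.

```shell
#!/bin/sh
# Pattern copied verbatim from the Kuscia error message in this thread.
DOMAIN_ID_PATTERN='^[a-z0-9]([a-z0-9.-]{0,61}[a-z0-9])?$'

# is_valid_domain_id (hypothetical helper): exit status 0 if the ID matches.
# LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
is_valid_domain_id() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq "$DOMAIN_ID_PATTERN"
}

# normalize_domain_id (hypothetical helper): lowercase the ID.
normalize_domain_id() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

requested="JD230607033"
if is_valid_domain_id "$requested"; then
    domain_id="$requested"
else
    # Uppercase letters are the only problem in this ID, so lowercasing is enough.
    domain_id=$(normalize_domain_id "$requested")
fi
echo "$domain_id"
```

Lowercasing only fixes IDs whose sole problem is letter case; an ID that starts or ends with "-" or "." would still be rejected by the pattern.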

wangzeyu135798 commented 1 month ago

[root@cm-dssn-node1 secu]# ./kuscia.sh start -c autonomy_alice.yaml -p 11080 -k 11081
KUSCIA_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.10.0b0
SECRETFLOW_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.8.0b0
grep: /home/secu/autonomy_alice.yaml: No such file or directory
grep: /home/secu/autonomy_alice.yaml: No such file or directory
grep: /home/secu/autonomy_alice.yaml: No such file or directory
grep: /home/secu/autonomy_alice.yaml: No such file or directory
grep: /home/secu/autonomy_alice.yaml: No such file or directory
ROOT=/home/secu
DOMAIN_ID=
DOMAIN_HOST_PORT=11080
DOMAIN_HOST_INTERNAL_PORT=13081
DOMAIN_DATA_DIR=/home/secu/secu-kuscia--/data
DOMAIN_LOG_DIR=/home/secu/secu-kuscia--/logs
KUSCIA_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.10.0b0
KUSCIAAPI_HTTP_PORT=11081
KUSCIAAPI_GRPC_PORT=13083
Starting container secu-kuscia-- ...
secu-kuscia---containerd
domain_hostname=secu-kuscia---cm-dssn-node1
network=kuscia-exchange
invalid argument "-p" for "-m, --memory" flag: invalid size: '-p'
See 'docker run --help'.

I followed the tutorial, so why does -p cause an error?

wangzul commented 1 month ago

1. Your log shows that the autonomy_alice.yaml file is missing. You can generate it with the following two commands:
export KUSCIA_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.10.0b0
docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "<your-node-name>" > autonomy_alice.yaml 2>&1 || cat autonomy_alice.yaml
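
In the log, docker reports that "-p" was taken as the value of "-m, --memory", which is what happens when the value meant for -m is empty. One plausible reading (an assumption, not confirmed in this thread) is that the script could not read that value because the config file was missing. Also, because the init command redirects both stdout and stderr into the YAML file, a failed init leaves the error text inside that file. A minimal pre-flight sketch, assuming a POSIX shell; require_config is a hypothetical helper, not part of kuscia.sh:

```shell
#!/bin/sh
# require_config (hypothetical helper, not part of kuscia.sh):
# succeed only if the file exists, is non-empty, and does not contain
# the "Invalid config" error text seen earlier in this thread.
require_config() {
    config_path=$1
    if [ ! -s "$config_path" ]; then
        echo "config file missing or empty: $config_path" >&2
        return 1
    fi
    if grep -q 'Invalid config' "$config_path"; then
        echo "config file holds an error message, not a config: $config_path" >&2
        return 1
    fi
    echo "config looks usable: $config_path"
}
```

A possible use is: require_config autonomy_alice.yaml && ./kuscia.sh start -c autonomy_alice.yaml -p 11080 -k 11081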

wangzeyu135798 commented 1 month ago

Regarding the suggested fix:
docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "jd230607033" > autonomy_JD230607033.yaml

Running ./kuscia.sh start -c autonomy_JD230607033 -p 11080 -k 11081 reports an error:
KUSCIA_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:latest
SECRETFLOW_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.7.0b0
grep: /home/secu/autonomy_JD230607033: No such file or directory
grep: /home/secu/autonomy_JD230607033: No such file or directory
grep: /home/secu/autonomy_JD230607033: No such file or directory
grep: /home/secu/autonomy_JD230607033: No such file or directory
grep: /home/secu/autonomy_JD230607033: No such file or directory
k3s data already exists /root/kuscia/secu-kuscia--/k3s...
Whether to retain k3s data?(y/n): y
ROOT=/home/secu
DOMAIN_ID=
DOMAIN_HOST_PORT=11080
DOMAIN_HOST_INTERNAL_PORT=13081
DOMAIN_DATA_DIR=/home/secu/secu-kuscia--/data
DOMAIN_LOG_DIR=/home/secu/secu-kuscia--/logs
KUSCIA_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:latest
KUSCIAAPI_HTTP_PORT=11081
KUSCIAAPI_GRPC_PORT=13083
Starting container secu-kuscia-- ...
secu-kuscia---containerd
domain_hostname=secu-kuscia---cm-dssn-node1
network=kuscia-exchange
invalid argument "-p" for "-m, --memory" flag: invalid size: '-p'
See 'docker run --help'.

Why is /home/secu/autonomy_JD230607033 not generated? Is it because the domainId does not match the regular expression? At the moment only the file autonomy_JD230607033.yaml is generated.
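
The grep errors in the log refer to /home/secu/autonomy_JD230607033, while the file produced by kuscia init is autonomy_JD230607033.yaml, so the -c argument appears to be missing its .yaml extension. The sketch below is one way to catch that, assuming a POSIX shell; resolve_config is a hypothetical helper and not something kuscia.sh provides.

```shell
#!/bin/sh
# resolve_config (hypothetical helper): print the config path to pass to -c.
# Falls back to "<name>.yaml" when the bare name does not exist.
resolve_config() {
    name=$1
    if [ -f "$name" ]; then
        printf '%s\n' "$name"
    elif [ -f "$name.yaml" ]; then
        printf '%s\n' "$name.yaml"
    else
        echo "no config file named $name or $name.yaml" >&2
        return 1
    fi
}
```

A possible use is: cfg=$(resolve_config autonomy_JD230607033) && ./kuscia.sh start -c "$cfg" -p 11080 -k 11081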

BrainWH commented 1 month ago

You can start by reading the corresponding documentation: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/reference/apis/domain_cn#create-domain

wangzeyu135798 commented 1 month ago

When I set up authorization between the two nodes, the result always shows unchanged. Why is that? Is the communication status of False related to this?
[root@dssn-master-04 secu]# docker exec -it ${USER}-kuscia-autonomy-jd230620034 scripts/deploy/join_to_host.sh jd230620034 jd230607033 https://172.32.173.1:14080
clusterdomainroute.kuscia.secretflow/jd230620034-jd230607033 unchanged
[root@dssn-master-04 secu]# docker exec -it ${USER}-kuscia-autonomy-jd230620034 kubectl get cdr jd230620034-jd230607033
NAME                      SOURCE        DESTINATION   HOST           AUTHENTICATION   READY
jd230620034-jd230607033   jd230620034   jd230607033   172.32.173.1   Token            False

wangzeyu135798 commented 1 month ago

This is my second installation. I deleted all the related files and containers before reinstalling, so why does clusterdomainroute.kuscia.secretflow/jd230620034-jd230607033 report unchanged? Shouldn't it be created?

BrainWH commented 1 month ago

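1. Cached data may remain after the deletion. You can run uninstall.sh (the uninstall script is available at https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/getting_started/quickstart_cn#uninstall), or try a different domain ID.
2. Your log shows that the registered route is in an abnormal state (READY = False). Follow https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/Docker_deployment_kuscia/deploy_p2p_cn#id5 to check whether your address is correct. Note: in
docker exec -it ${USER}-kuscia-autonomy-jd230620034 scripts/deploy/join_to_host.sh jd230620034 jd230607033 {ip}
the ip:port must be that of the jd230607033 side.
3. The scripts/deploy/join_to_host.sh script returns one of the following results:
created: the command was executed for the first time
unchanged: the route already exists and the parameters passed (jd230620034 jd230607033 {ip}) match those already registered
configured: the route was modified successfully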
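
The three results and the READY column can be checked in a script. This is a minimal sketch, assuming a POSIX shell with awk and tr. Both helpers are hypothetical. The meanings of the three words come from the reply above, and treating "True" as the healthy READY value is an assumption, since this thread only shows "False".

```shell
#!/bin/sh
# describe_route_result (hypothetical helper): explain the last word of the
# line printed by join_to_host.sh, using the meanings given in this thread.
describe_route_result() {
    line=$(printf '%s' "$1" | tr -d '\r')   # drop carriage returns from "docker exec -it"
    status=${line##* }                       # keep the text after the last space
    case "$status" in
        created)    echo "route created (first execution)" ;;
        unchanged)  echo "route already exists with the same parameters" ;;
        configured) echo "existing route was modified" ;;
        *)          echo "unrecognized result: $status" ;;
    esac
}

# route_is_ready (hypothetical helper): read "kubectl get cdr <name>" output
# on stdin and succeed only if the last column of the first data row is "True".
route_is_ready() {
    tr -d '\r' | awk 'NR == 2 { ready = $NF } END { exit (ready == "True" ? 0 : 1) }'
}

describe_route_result "clusterdomainroute.kuscia.secretflow/jd230620034-jd230607033 unchanged"
```

Note that unchanged only says the route object already matches the arguments; it says nothing about connectivity, so the READY column still has to be checked separately.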

wangzeyu135798 commented 1 month ago

How can uninstall.sh be used to delete only a specific domainId?

BrainWH commented 4 weeks ago

uninstall.sh deletes all Kuscia containers and their mounted data.