FederatedAI / KubeFATE

Manage federated learning workload using cloud native technologies.
Apache License 2.0

fate-serving in KubeFATE fails to start its service #918

Closed ChrisHuo-04 closed 10 months ago

ChrisHuo-04 commented 10 months ago

Environment:

1. KubeFATE 1.7.2 (with the matching FATE-Serving version, 2.0.4)
2. Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:58:47Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4", GitTreeState:"clean", BuildDate:"2021-11-17T14:35:38Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
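A side note on the version output: the client is v1.23.5 and the server is v1.21.7, two minor versions apart, while the Kubernetes version skew policy supports kubectl within one minor version of the API server. Whether this has anything to do with the problems below is not known. The following is a minimal sketch for checking the skew; the helper names are made up for illustration and the version strings are shortened to the fields the sketch uses.

```python
import re
from typing import Tuple

# Version strings taken from the `kubectl version` output above,
# shortened to the fields this sketch actually reads.
CLIENT = 'version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5"}'
SERVER = 'version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7"}'


def parse_major_minor(info: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version.Info{...} string."""
    major = re.search(r'Major:"(\d+)"', info)
    # Some distributions report the minor version with a trailing "+".
    minor = re.search(r'Minor:"(\d+)\+?"', info)
    if major is None or minor is None:
        raise ValueError("could not find Major/Minor in: " + info)
    return int(major.group(1)), int(minor.group(1))


def minor_skew(client: str, server: str) -> int:
    """Return client minor minus server minor (major versions must match)."""
    c_major, c_minor = parse_major_minor(client)
    s_major, s_minor = parse_major_minor(server)
    if c_major != s_major:
        raise ValueError("major versions differ")
    return c_minor - s_minor


skew = minor_skew(CLIENT, SERVER)
print("minor version skew:", skew)
print("within one minor version:", abs(skew) <= 1)
```

If the skew turns out to matter, using a kubectl binary closer to the server version would rule it out.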

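Problem 1: After deploying to the k8s environment, offline training and prediction can be started normally in FATE from inside the python container, and both parties' models deploy normally. Running the model load operation, however, reports the following RPC error: [image: 1698636116526]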

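Problem 2: I tried copying the model archive files as a substitute for the load operation. For both parties' models in the FATE environment, I exported the model .zip file in the python container and copied it into the .fate directory of each party's serving-server container. I then tried to start the service by running sh service.sh restart, and the screen output was as follows: [image: 1698645769521]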
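For anyone trying to reproduce this, the copy-and-restart steps could be sketched roughly as below. This only builds and prints the kubectl command lines, it does not run them. The namespace, pod name, container name, local file name, and destination path are all hypothetical placeholders, and the sketch assumes `service.sh` sits in the container's working directory. Note that `kubectl cp` needs a `tar` binary inside the target container.

```python
import posixpath
import shlex
from typing import List


def build_copy_and_restart(namespace: str, pod: str, container: str,
                           model_zip: str, dest_dir: str) -> List[List[str]]:
    """Build the kubectl commands for copying a model archive and restarting."""
    # Spell out the full destination file path instead of relying on
    # kubectl to resolve a directory destination.
    dest_file = posixpath.join(dest_dir, posixpath.basename(model_zip))
    copy_cmd = ["kubectl", "cp", model_zip,
                f"{namespace}/{pod}:{dest_file}", "-c", container]
    # Assumption: service.sh is in the container's working directory.
    restart_cmd = ["kubectl", "exec", "-n", namespace, pod, "-c", container,
                   "--", "sh", "service.sh", "restart"]
    return [copy_cmd, restart_cmd]


# All values below are hypothetical placeholders, not taken from the cluster.
commands = build_copy_and_restart(
    namespace="my-serving-namespace",
    pod="my-serving-server-pod",
    container="serving-server",
    model_zip="exported_model.zip",
    dest_dir="/path/to/.fate",
)
for cmd in commands:
    print(shlex.join(cmd))
```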

Problem 3: I have no permission to edit conf/serving-server.properties inside serving-server, and chmod cannot change its permissions. [image: 1698645900071]
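One possible explanation, not confirmed here, is that the properties file sits on a read-only mount (Kubernetes ConfigMap volumes, for example, are mounted read-only), in which case chmod would fail no matter which user runs it. The sketch below checks for that by looking up the mount entry covering a path in `/proc/mounts`-style text. The sample mount table and the `/app/conf/...` path are invented for illustration, and the helper names are hypothetical.

```python
from typing import List, Optional

# Invented sample in /proc/mounts format: device, mount point, type, options.
SAMPLE_MOUNTS = """\
overlay / overlay rw,relatime 0 0
/dev/sda1 /app/conf/serving-server.properties ext4 ro,relatime 0 0
"""


def find_mount(path: str, mounts_text: str) -> Optional[List[str]]:
    """Return [mount_point, fs_type, options] for the longest mount point
    that contains `path`, or None if nothing matches."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point = fields[1]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = [mount_point, fields[2], fields[3]]
    return best


def is_read_only(options: str) -> bool:
    """True if the comma-separated mount options include 'ro'."""
    return "ro" in options.split(",")


# Hypothetical path; substitute the real location of the properties file.
entry = find_mount("/app/conf/serving-server.properties", SAMPLE_MOUNTS)
print(entry)
print("read-only mount:", entry is not None and is_read_only(entry[2]))
```

Inside the container, the same functions can be fed the real contents of `/proc/mounts`. If the file does come from a ConfigMap, the change would have to be made in the ConfigMap or the deployment configuration rather than in the running container.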

I would appreciate help troubleshooting this. Many thanks.