FederatedAI / FATE

An Industrial Grade Federated Learning Framework
Apache License 2.0

FATE standalone deployment: fate_flow_server fails to start when launched from the Docker image #4614

Closed SnakeTom closed 4 months ago

SnakeTom commented 1 year ago

bash bin/init.sh status;

[INFO] env dir: /data/projects/fate/env
[INFO] jdk dir: /data/projects/fate/env/jdk/jdk-8u345
[INFO] venv dir: /data/projects/fate/env/python/venv
[INFO] kernel: Linux
[INFO] linux system: CentOS Linux
[INFO] os: CentOS
[INFO] is root user
PROJECT_BASE: /data/projects/fate
PYTHONPATH: /data/projects/fate/fate/python:/data/projects/fate/fateflow/python
found service conf: /data/projects/fate/conf/service_conf.yaml
fate flow http port: 9380, grpc port: 9360

check process by http port and grpc port
service not running
JAVA_HOME=/data/projects/fate/env/jdk/jdk-8u345

Startup output:

check process by http port and grpc port
check process by http port and grpc port
check process by http port and grpc port
check process by http port and grpc port
check process by http port and grpc port
check process by http port and grpc port
check process by http port and grpc port
.....
service start failed, please check /data/projects/fate/fateflow/logs/error.log and /data/projects/fate/fateflow/logs/console.log
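The repeated "check process by http port and grpc port" lines suggest the script is polling the two ports reported above (9380 and 9360). As a rough, independent way to confirm whether anything is listening on them, a minimal sketch follows. The host `127.0.0.1` and the helper name `port_is_open` are assumptions for illustration and are not part of FATE.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    # Ports taken from the status output; the host is an assumption.
    for name, port in (("http", 9380), ("grpc", 9360)):
        state = "listening" if port_is_open("127.0.0.1", port) else "not listening"
        print(f"fate flow {name} port {port}: {state}")
```

If both ports report "not listening" while the start script keeps polling, the server process most likely exited during startup, which is consistent with the start failure reported above.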

Error log:

[root@d954786e6a6d fate]# cat /data/projects/fate/fateflow/logs/error.log
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/fate_flow_server.py", line 88, in <module>
    ComponentRegistry.load()
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 35, in load
    component_registry = cls.get_from_db(file_utils.load_json_conf_real_time(FATE_FLOW_DEFAULT_COMPONENT_REGISTRY_PATH))
  File "/data/projects/fate/env/python/venv/lib/python3.8/site-packages/peewee.py", line 394, in inner
    return fn(*args, **kwargs)
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 179, in get_from_db
    for component_alias in component_registry["components"][module.f_component_name]["alias"]:
KeyError: 'dataio'
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/fate_flow_server.py", line 88, in <module>
    ComponentRegistry.load()
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 35, in load
    component_registry = cls.get_from_db(file_utils.load_json_conf_real_time(FATE_FLOW_DEFAULT_COMPONENT_REGISTRY_PATH))
  File "/data/projects/fate/env/python/venv/lib/python3.8/site-packages/peewee.py", line 394, in inner
    return fn(*args, **kwargs)
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 179, in get_from_db
    for component_alias in component_registry["components"][module.f_component_name]["alias"]:
KeyError: 'dataio'
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/fate_flow_server.py", line 88, in <module>
    ComponentRegistry.load()
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 35, in load
    component_registry = cls.get_from_db(file_utils.load_json_conf_real_time(FATE_FLOW_DEFAULT_COMPONENT_REGISTRY_PATH))
  File "/data/projects/fate/env/python/venv/lib/python3.8/site-packages/peewee.py", line 394, in inner
    return fn(*args, **kwargs)
  File "/data/projects/fate/fateflow/python/fate_flow/db/component_registry.py", line 179, in get_from_db
    for component_alias in component_registry["components"][module.f_component_name]["alias"]:
KeyError: 'dataio'
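One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: `get_from_db` iterates over component names stored in the database and looks each one up in the registry loaded from the JSON file, so a name that exists in the database but not in the JSON raises `KeyError`. The sketch below reproduces that access pattern with hypothetical stand-in data. The dictionaries, the `datatransform` entry, and both helper functions are illustrative and are not FATE Flow code.

```python
# Hypothetical stand-ins: in FATE Flow the real data comes from the database
# and from the default component registry JSON file.
registry_from_json = {
    "components": {
        "datatransform": {"alias": ["datatransform"]},
    }
}
# Assumed database contents: one name has no counterpart in the JSON registry.
names_from_db = ["datatransform", "dataio"]


def collect_aliases(registry, db_names):
    """Mimic the failing lookup: index the registry by each name from the DB."""
    aliases = []
    for name in db_names:
        # Same access pattern as the failing line in the traceback
        for alias in registry["components"][name]["alias"]:
            aliases.append(alias)
    return aliases


def find_missing(registry, db_names):
    """List names present in the DB but absent from the JSON registry."""
    return [name for name in db_names if name not in registry["components"]]


if __name__ == "__main__":
    print("missing from registry:", find_missing(registry_from_json, names_from_db))
    try:
        collect_aliases(registry_from_json, names_from_db)
    except KeyError as exc:
        print("KeyError:", exc)
```

Under this reading, the error points to a mismatch between the registry file and the database contents shipped with the installation, not to a problem with the ports or the JDK.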

SnakeTom commented 1 year ago

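I uninstalled the earlier installation, which used the "image package" method, and reinstalled using the "public image service" method. It now starts normally.

Problem solved~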

mgqa34 commented 1 year ago


Does this mean the old and new images conflicted, or that one of the installation methods is itself broken?

SnakeTom commented 1 year ago


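My impression is that the installation method is the problem. This is a freshly provisioned server. I followed the "image package" (wget) method from the documentation, and startup failed with the error above. I then removed that image, reinstalled with docker pull, and it starts normally.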

mgqa34 commented 1 year ago


Is this the latest 1.10 release? We will look into it on our side as well.

SnakeTom commented 1 year ago


Yes, v1.10.0.

github-actions[bot] commented 4 months ago

This issue has been marked as stale because it has been open for 365 days with no activity. If this issue is still relevant or if there is new information, please feel free to update or reopen it.

github-actions[bot] commented 4 months ago

This issue was closed because it has been inactive for 1 day since being marked as stale. If this issue is still relevant or if there is new information, please feel free to update or reopen it.