Open sealofyou opened 4 weeks ago
ValueError: processor in session meta is not valid: <ErSessionMeta(id=202410240807409572450_eval_0_0_guest_10000, name=, status=KILLED, tag=, processors=[***, len=4], options=[{'python.venv': '/data/projects/fate/common/python/venv', 'eggroll.rollpair.inmemory_output': 'True', 'eggroll.session.processors.per.node': '4', 'python.path': '/data/projects/fate/fate/python:/data/projects/fate/fate/python:/data/projects/fate/fateflow/python:/data/projects/fate/eggroll/python', 'eggroll.session.deploy.mode': 'cluster'}]) at 0x7f6d5159e790>
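The cause of this error is unknown, and it can occur at any point. Where should I start looking for the problem?
"Any point" means the toy example as well as every node of a 3-party training job (reader, nn, eval). It does not always occur; sometimes training completes normally. I have tried modifying the eggroll configuration file and increasing memory, disk space, and the number of CPU cores.
Error: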
ValueError: processor in session meta is not valid: <ErSessionMeta(id=202410240807409572450_eval_0_0_guest_10000, name=, status=KILLED, tag=, processors=[***, len=4], options=[{'python.venv': '/data/projects/fate/common/python/venv', 'eggroll.rollpair.inmemory_output': 'True', 'eggroll.session.processors.per.node': '4', 'python.path': '/data/projects/fate/fate/python:/data/projects/fate/fate/python:/data/projects/fate/fateflow/python:/data/projects/fate/eggroll/python', 'eggroll.session.deploy.mode': 'cluster'}]) at 0x7f6d5159e790>
Environment: FATE 1.11.3, installed on both parties with AnsibleFATE.
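To narrow down which session was killed, the session ID, status, and options can be pulled out of the error message and then used to search the logs on each node. The snippet below is only a minimal sketch: `parse_session_meta` is a hypothetical helper, not part of FATE or eggroll, and it assumes the message has the same `ErSessionMeta(...)` layout as the error quoted above.

```python
import ast
import re


def parse_session_meta(message: str) -> dict:
    """Extract id, status and options from an ErSessionMeta repr string.

    Hypothetical helper: it relies only on the text layout seen in the
    error message above, not on any FATE or eggroll API.
    """
    id_match = re.search(r"ErSessionMeta\(id=([^,]+),", message)
    status_match = re.search(r"status=([A-Z_]+)", message)
    # The options are printed as a Python dict literal inside [ ... ].
    options_match = re.search(r"options=\[(\{.*\})\]", message, re.DOTALL)
    options = ast.literal_eval(options_match.group(1)) if options_match else {}
    return {
        "session_id": id_match.group(1) if id_match else None,
        "status": status_match.group(1) if status_match else None,
        "options": options,
    }


# Shortened copy of the message from this report (some options omitted).
sample = (
    "ValueError: processor in session meta is not valid: "
    "<ErSessionMeta(id=202410240807409572450_eval_0_0_guest_10000, name=, "
    "status=KILLED, tag=, processors=[***, len=4], "
    "options=[{'eggroll.rollpair.inmemory_output': 'True', "
    "'eggroll.session.processors.per.node': '4', "
    "'eggroll.session.deploy.mode': 'cluster'}]) at 0x7f6d5159e790>"
)

info = parse_session_meta(sample)
print(info["session_id"])  # 202410240807409572450_eval_0_0_guest_10000
print(info["status"])      # KILLED
print(info["options"]["eggroll.session.processors.per.node"])  # 4
```

The extracted session ID can then be used as a search string in the logs of each party to find what happened to that session before it reached the `KILLED` status.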