PaddlePaddle / Paddle

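PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle 『飞桨』: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)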
http://www.paddlepaddle.org/
Apache License 2.0
22.13k stars 5.55k forks

Distributed training in Fleet transpiler mode runs, but saving the model fails; please explain what this error means #19522

Closed githubutilities closed 4 years ago

githubutilities commented 5 years ago
An exception was thrown!
 Invoke operator save error.
Python Call stacks: 
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1780, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 225, in save_vars
    'file_path': os.path.join(save_dirname, new_var.name)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 466, in _save_distributed_persistables
    executor, main_program=main_program, dirname=dirname, vars=local_vars)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 517, in save_persistables
    executor, dirname=dirname, main_program=main_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/incubate/fleet/parameter_server/distribute_transpiler/__init__.py", line 197, in save_persistables
    io.save_persistables(executor, dirname, main_program, None)
  File "cluster_train.py", line 561, in main
    fleet.save_persistables(exe, './save_params')
  File "cluster_train.py", line 638, in <module>
    main(args)
C++ Call stacks: 
holder_ should not be null
Tensor not initialized yet when Tensor::type() is called. at [/paddle/paddle/fluid/framework/tensor.h:139]
PaddlePaddle Call Stacks: 
0       0x7fe8c5420985p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1       0x7fe8c5420ce2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2       0x7fe8c54215f5p paddle::framework::Tensor::type() const + 101
3       0x7fe8c67447edp paddle::framework::GetDataTypeOfVar(paddle::framework::Variable const*) + 141
4       0x7fe8c58d2d0dp paddle::operators::SaveOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const + 61
5       0x7fe8c67472d7p paddle::framework::OperatorWithKernel::ChooseKernel(paddle::framework::RuntimeContext const&, paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 231
6       0x7fe8c67489c5p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 1189
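7       0x7fe8c674904bp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
[INFO] 2019-08-29 02:27:12,061 [cluster_train.py:  565]:    Traceback (most recent call last):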
  File "cluster_train.py", line 561, in main
    fleet.save_persistables(exe, './save_params')
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/incubate/fleet/parameter_server/distribute_transpiler/__init__.py", line 197, in save_persistables
    io.save_persistables(executor, dirname, main_program, None)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 517, in save_persistables
    executor, dirname=dirname, main_program=main_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 466, in _save_distributed_persistables
    executor, main_program=main_program, dirname=dirname, vars=local_vars)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 241, in save_vars
    executor.run(save_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 615, in run
    six.reraise(*sys.exc_info())
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 611, in run
    use_program_cache=use_program_cache)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 653, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 745, in _run_program
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
EnforceNotMet: Invoke operator save error.
Python Call stacks: 
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1780, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 225, in save_vars
    'file_path': os.path.join(save_dirname, new_var.name)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 466, in _save_distributed_persistables
    executor, main_program=main_program, dirname=dirname, vars=local_vars)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 517, in save_persistables
    executor, dirname=dirname, main_program=main_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/incubate/fleet/parameter_server/distribute_transpiler/__init__.py", line 197, in save_persistables
    io.save_persistables(executor, dirname, main_program, None)
  File "cluster_train.py", line 561, in main
    fleet.save_persistables(exe, './save_params')
  File "cluster_train.py", line 638, in <module>
    main(args)
C++ Call stacks: 
holder_ should not be null
Tensor not initialized yet when Tensor::type() is called. at [/paddle/paddle/fluid/framework/tensor.h:139]
PaddlePaddle Call Stacks: 
0       0x7fe8c5420985p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1       0x7fe8c5420ce2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2       0x7fe8c54215f5p paddle::framework::Tensor::type() const + 101
3       0x7fe8c67447edp paddle::framework::GetDataTypeOfVar(paddle::framework::Variable const*) + 141
4       0x7fe8c58d2d0dp paddle::operators::SaveOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const + 61
5       0x7fe8c67472d7p paddle::framework::OperatorWithKernel::ChooseKernel(paddle::framework::RuntimeContext const&, paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 231
6       0x7fe8c67489c5p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 1189
7       0x7fe8c674904bp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
8       0x7fe8c6743e06p paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 246
9       0x7fe8c558bb5ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 350
10      0x7fe8c558ea34p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 132
11      0x7fe8c5413cc3p
12      0x7fe8c544da9cp
13      0x7fe96e2b3ddcp PyEval_EvalFrameEx + 19596
14      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
15      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
16      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
17      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
18      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
19      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
20      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
21      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
22      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
23      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
24      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
25      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
26      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
27      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
28      0x7fe96e2b397ep PyEval_EvalFrameEx + 18478
29      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
30      0x7fe96e2b5352p PyEval_EvalCode + 50
31      0x7fe96e2dff22p PyRun_FileExFlags + 146
32      0x7fe96e2e1459p PyRun_SimpleFileExFlags + 217
33      0x7fe96e2f6e9dp Py_Main + 3149
34      0x7fe96d4f8bd5p __libc_start_main + 245
35            0x4007a1p

[INFO] 2019-08-29 02:27:12,062 [cluster_train.py:  566]:    Invoke operator save error.

[INFO] 2019-08-29 02:27:12,075 [cluster_train.py:   72]:    remote_path /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8/
19/08/29 02:27:13 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:14 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:15 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:15,424 [cluster_train.py:   81]:    Trainer 0 is done.
19/08/29 02:27:16 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:16,496 [cluster_train.py:   81]:    Trainer 1 is done.

path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8/
upload_to_hdfs cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ -put DONE_0 /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8/
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_0
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_1
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_2
19/08/29 02:27:17 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:17,566 [cluster_train.py:   81]:    Trainer 2 is done.
19/08/29 02:27:18 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:18,641 [cluster_train.py:   81]:    Trainer 3 is done.
19/08/29 02:27:19 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:23 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:27 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:31 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:35 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:39 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:43 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
19/08/29 02:27:47 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:48,021 [cluster_train.py:   81]:    Trainer 4 is done.
19/08/29 02:27:49 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:49,089 [cluster_train.py:   81]:    Trainer 5 is done.
19/08/29 02:27:50 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:50,164 [cluster_train.py:   81]:    Trainer 6 is done.
19/08/29 02:27:51 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:51,225 [cluster_train.py:   81]:    Trainer 7 is done.
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_3
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_4
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_5
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_6
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_7
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_8
19/08/29 02:27:52 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:52,303 [cluster_train.py:   81]:    Trainer 8 is done.
19/08/29 02:27:53 INFO common.UpdateService: ZkstatusUpdater to yq01-global-hdfs.dmop.baidu.com:54310 started
[INFO] 2019-08-29 02:27:53,378 [cluster_train.py:   81]:    Trainer 9 is done.
[INFO] 2019-08-29 02:27:53,378 [cluster_train.py:   85]:    FINISH ALL
path_is_exists cmd = /home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/hadoop-client/hadoop//bin/hadoop fs -Dfs.default.name=hdfs://yq01-global-hdfs.dmop.baidu.com:54310 -Dhadoop.job.ugi=ccdb,S2QJPZ  -test -e /user/ccdb/working/nlp/ol/fengshikun01/gnn/feed_ad/model_cpu_5/job-0bb5d6699c725cc8//DONE_9
An exception was thrown!
 Invoke operator save error.
Python Call stacks: 
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1780, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 225, in save_vars
    'file_path': os.path.join(save_dirname, new_var.name)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 466, in _save_distributed_persistables
    executor, main_program=main_program, dirname=dirname, vars=local_vars)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 517, in save_persistables
    executor, dirname=dirname, main_program=main_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/incubate/fleet/parameter_server/distribute_transpiler/__init__.py", line 197, in save_persistables
    io.save_persistables(executor, dirname, main_program, None)
  File "cluster_train.py", line 571, in main
    fleet.save_persistables(exe, './save_params')
  File "cluster_train.py", line 638, in <module>
    main(args)
C++ Call stacks: 
holder_ should not be null
Tensor not initialized yet when Tensor::type() is called. at [/paddle/paddle/fluid/framework/tensor.h:139]
PaddlePaddle Call Stacks: 
0       0x7fe8c5420985p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1       0x7fe8c5420ce2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2       0x7fe8c54215f5p paddle::framework::Tensor::type() const + 101
3       0x7fe8c67447edp paddle::framework::GetDataTypeOfVar(paddle::framework::Variable const*) + 141
4       0x7fe8c58d2d0dp paddle::operators::SaveOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const + 61
5       0x7fe8c67472d7p paddle::framework::OperatorWithKernel::ChooseKernel(paddle::framework::RuntimeContext const&, paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 231
6       0x7fe8c67489c5p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 1189
7       0x7fe8c674904bp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
8       0x7fe8c6743e06p paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 246
9       0x7fe8c558bb5ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 350
10      0x7fe8c558ea34p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 132
11      0x7fe8c5413cc3p
12      0x7fe8c544da9cp
13      0x7fe96e2b3ddcp PyEval_EvalFrameEx + 19596
14      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
15      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
16      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
17      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
18      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
19      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
20      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
21      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
22      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
23      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
24      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
25      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
26      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
27      0x7fe96e2b34f1p PyEval_EvalFrameEx + 17313
28      0x7fe96e2b397ep PyEval_EvalFrameEx + 18478
29      0x7fe96e2b521dp PyEval_EvalCodeEx + 2061
30      0x7fe96e2b5352p PyEval_EvalCode + 50
31      0x7fe96e2dff22p PyRun_FileExFlags + 146
32      0x7fe96e2e1459p PyRun_SimpleFileExFlags + 217
33      0x7fe96e2f6e9dp Py_Main + 3149
34      0x7fe96d4f8bd5p __libc_start_main + 245
35            0x4007a1p

Traceback (most recent call last):
  File "cluster_train.py", line 638, in <module>
    main(args)
  File "cluster_train.py", line 571, in main
    fleet.save_persistables(exe, './save_params')
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/incubate/fleet/parameter_server/distribute_transpiler/__init__.py", line 197, in save_persistables
    io.save_persistables(executor, dirname, main_program, None)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 517, in save_persistables
    executor, dirname=dirname, main_program=main_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 466, in _save_distributed_persistables
    executor, main_program=main_program, dirname=dirname, vars=local_vars)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/io.py", line 241, in save_vars
    executor.run(save_program)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 615, in run
    six.reraise(*sys.exc_info())
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 611, in run
    use_program_cache=use_program_cache)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 653, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/disk1/normandy/maybach/app-user-20190828231217-113/workspace/python27-gcc482/lib/python2.7/site-packages/paddle/fluid/executor.py", line 745, in _run_program
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_noavx.EnforceNotMet: Invoke operator save error.
frankwhzhang commented 5 years ago

Could you post your code and the whl package version? Is this code you wrote yourself, or code provided officially?
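
For the version part of that request, here is a minimal sketch of one way to collect the details. It is not an official Paddle diagnostic tool: `describe_environment` is a hypothetical helper name, and the only thing it assumes about the installed `paddle` package is that it exposes a `__version__` attribute (it falls back to "unknown" if not).

```python
import platform
import sys


def describe_environment(package="paddle"):
    """Collect interpreter and package versions to paste into a bug report."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    try:
        # Import by name so the helper also works for other packages.
        module = __import__(package)
        info[package] = getattr(module, "__version__", "unknown")
    except ImportError:
        # The package is missing from this interpreter's site-packages.
        info[package] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

Running it with the same interpreter that launches `cluster_train.py` matters here, since the traceback shows a job-local `python27-gcc482` environment that may differ from the system Python.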

paddle-bot-old[bot] commented 4 years ago

Since you haven't replied for more than a year, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.