xysmlx opened this issue 3 years ago (status: Open)
Thanks for the report @xysmlx! I will look into it ASAP! (I'm a bot).
I hit the same problem in BERT training:
[INFO] 2021-03-16T04:29:51z src/nnfusion/engine/pass/graph/runtime_const_folding_pass.cpp 58 >> Found constant downstream node: 210, Op Type = GatherV2
[INFO] 2021-03-16T04:29:51z src/nnfusion/engine/pass/graph/runtime_const_folding_pass.cpp 71 Input of constant downstream node: 209, Op Type = Constant/Constant
[INFO] 2021-03-16T04:29:51z src/nnfusion/engine/pass/graph/runtime_const_folding_pass.cpp 83 With Constant Input Node: 209, Memory Length = 8
[INFO] 2021-03-16T04:29:51z src/nnfusion/engine/pass/graph/runtime_const_folding_pass.cpp 71 Input of constant downstream node: 208, Op Type = Constant/Constant
[INFO] 2021-03-16T04:29:51z src/nnfusion/engine/pass/graph/runtime_const_folding_pass.cpp 83 With Constant Input Node: 208, Memory Length = 16
[ERROR] 2021-03-16T04:29:51z src/nnfusion/util/errors.hpp 169 Check failed: 'inputs[i].size() == _size' at /home/lingm/projects0/nnfusion_mlx/src/nnfusion/engine/profiler/profiler.hpp:97:
(no explanation given)
terminate called after throwing an instance of 'nnfusion::errors::CheckError'
what(): Check failed: 'inputs[i].size() == _size' at /home/lingm/projects0/nnfusion_mlx/src/nnfusion/engine/profiler/profiler.hpp:97:
(no explanation given)
Aborted (core dumped)
To reproduce:
nnfusion bert_train_bs2.onnx -f onnx -fautodiff=true -ftraining_mode=true -ftraining_optimizer='{"optimizer": "SGD", "learning_rate": 0.0001}' -fblockfusion_level=0 -fkernel_fusion_level=0 -fconst_folding_backend=CUDA
bert_train_bs2.onnx is generated from src/python/example/bert.py
🐛 Bug
Enabling constant folding by setting -fconst_folding_backend=CUDA for a GNN model leads to a check failure in the constant folding pass.
The failing check is the assertion 'inputs[i].size() == _size' at src/nnfusion/engine/profiler/profiler.hpp:97.
To Reproduce
Steps to reproduce the behavior: