microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
MIT License

[BUG] gpt2-model cuda codegen failed. #445

Open LeiWang1999 opened 2 years ago

LeiWang1999 commented 2 years ago

🐛 Bug: GPT-2 model CUDA codegen fails with the nnfusion CUDA backend.

I hit an issue with the GPT-2 series of ONNX models at the ONNX model import stage.

/workspace/v-leiwang3/nnfusion/build/src/tools/nnfusion/nnfusion /workspace/v-leiwang3/benchmark/nnfusion_models/gpt2-small.float32.onnx -f onnx -p batch_size:64;seq_length:512 -fwarmup_step=5 -frun_step=100 -fkernel_tuning_steps=1000 -fantares_mode=1 -fantares_codegen_server=127.0.0.1:8880 -fir_based_fusion=1 -fkernel_fusion_level=3 -fblockfusion_level=0 -ftuning_blocklist=Convolution -fhost_entry=0 -fdefault_device=CUDA -firfusion_blocklist=
[WARNING] 2022-07-05T16:41:39z src/contrib/custom_op/custom_op.h 27 $NNFUSION_HOME was not set, use /root/.nnfusion.
[WARNING] 2022-07-05T16:41:39z src/contrib/custom_op/custom_op.h 27 $NNFUSION_HOME was not set, use /root/.nnfusion.

============================================================================
---- Processing '/workspace/v-leiwang3/benchmark/nnfusion_models/gpt2-small.float32.onnx'
============================================================================
[INFO] 2022-07-05T16:41:39z src/nnfusion/frontend/onnx_import/onnx.cpp 54   Optimizing ONNX Graph with External Tool (models/pytorch2onnx/ort_run_frozen.py)
[WARNING] 2022-07-05T16:41:39z src/nnfusion/common/util.cpp 47  $NNFUSION_HOME was not set, use /root/.nnfusion.
ONNX model check passed!
Importing ONNX model into ONNX Runtime...
2022-07-05 16:41:44.004888731 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. 'output' source:{64,512,100} target:{64,8,100}. Falling back to lenient merge.
Execution Providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
output
[-0.2370204   0.24946144 -0.94949317  0.4771166   1.1539005   0.30460253
 -1.1269218  -0.02994911  0.63004225 -0.01277094] ...(size= 3276800 end with -1.0244459 )
[INFO] 2022-07-05T16:41:47z src/nnfusion/frontend/onnx_import/onnx.cpp 40   Import ONNX Graph Size: [499687938]
[INFO] 2022-07-05T16:41:47z src/nnfusion/frontend/onnx_import/util/graph_convert.cpp 317    Converting Onnx Graph
[ERROR] 2022-07-05T16:41:47z src/nnfusion/util/errors.hpp 169   Failure at /workspace/v-leiwang3/nnfusion/src/nnfusion/frontend/onnx_import/util/util.hpp:57:
1166 unsupported data type: 9
terminate called after throwing an instance of 'nnfusion::errors::CheckError'
  what():  Failure at /workspace/v-leiwang3/nnfusion/src/nnfusion/frontend/onnx_import/util/util.hpp:57:
1166 unsupported data type: 9

For reference, the ONNX proto data type map:

enum TensorProto_DataType {
  TensorProto_DataType_UNDEFINED = 0,
  TensorProto_DataType_FLOAT = 1,
  TensorProto_DataType_UINT8 = 2,
  TensorProto_DataType_INT8 = 3,
  TensorProto_DataType_UINT16 = 4,
  TensorProto_DataType_INT16 = 5,
  TensorProto_DataType_INT32 = 6,
  TensorProto_DataType_INT64 = 7,
  TensorProto_DataType_STRING = 8,
  TensorProto_DataType_BOOL = 9,
  TensorProto_DataType_FLOAT16 = 10,
  TensorProto_DataType_DOUBLE = 11,
  TensorProto_DataType_UINT32 = 12,
  TensorProto_DataType_UINT64 = 13,
  TensorProto_DataType_COMPLEX64 = 14,
  TensorProto_DataType_COMPLEX128 = 15,
  TensorProto_DataType_BFLOAT16 = 16
};
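For quick lookup, the enum above can be mirrored in Python to decode the numeric code in the error message (a minimal sketch with the values copied from the C++ enum; with the real onnx package installed, the protobuf-generated `onnx.TensorProto.DataType.Name(9)` should give the same answer):

```python
# Mirror of ONNX's TensorProto_DataType enum (values copied from the enum above).
ONNX_DTYPE_NAMES = {
    0: "UNDEFINED", 1: "FLOAT", 2: "UINT8", 3: "INT8",
    4: "UINT16", 5: "INT16", 6: "INT32", 7: "INT64",
    8: "STRING", 9: "BOOL", 10: "FLOAT16", 11: "DOUBLE",
    12: "UINT32", 13: "UINT64", 14: "COMPLEX64", 15: "COMPLEX128",
    16: "BFLOAT16",
}

def decode_onnx_dtype(code: int) -> str:
    """Translate the numeric code from an 'unsupported data type: N' error."""
    return ONNX_DTYPE_NAMES.get(code, f"unknown({code})")

print(decode_onnx_dtype(9))  # BOOL
```

So the failure at util.hpp:57 means the importer encountered a BOOL tensor it has no conversion for.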

I traced the TensorProto_DataType_BOOL tensor to its source: the condition input of a Where operator:

[screenshot: the Where operator node in the ONNX graph]
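Without a visual graph viewer, the same scan can be sketched in plain Python. This is a stdlib-only sketch: the `Node` tuple and `tensor_types` dict are hypothetical stand-ins for what `onnx.load(...).graph` would provide, not the real onnx API.

```python
# Find Where nodes whose condition input is BOOL (type code 9), i.e. the
# tensor the nnfusion importer rejects. Stand-in structures mimic an ONNX graph.
from typing import NamedTuple

class Node(NamedTuple):
    op_type: str   # e.g. "Where"
    inputs: list   # input tensor names; for Where, the condition comes first

BOOL = 9  # TensorProto_DataType_BOOL from the enum above

def find_bool_where_conditions(nodes, tensor_types):
    """Return (node index, tensor name) for every Where condition typed BOOL."""
    hits = []
    for i, node in enumerate(nodes):
        if node.op_type == "Where" and node.inputs:
            cond = node.inputs[0]
            if tensor_types.get(cond) == BOOL:
                hits.append((i, cond))
    return hits

# Tiny example graph: one Where gated by a BOOL mask.
nodes = [
    Node("MatMul", ["x", "w"]),
    Node("Where", ["mask", "a", "b"]),
]
tensor_types = {"x": 1, "w": 1, "mask": BOOL, "a": 1, "b": 1}
print(find_bool_where_conditions(nodes, tensor_types))  # [(1, 'mask')]
```

With the real onnx package, `tensor_types` would be built from the graph's `value_info`, `input`, and `initializer` fields.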