microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
MIT License

[Fix/Feat] Correct the fp16 inference of resnet50.onnx #433

Closed. LeiWang1999 closed this 2 years ago.

LeiWang1999 commented 2 years ago

Refers to issue #431.

  1. Uncomment the fp16 Dot CUDA codegen code; with it enabled, fp16 inference works.
  2. Fix the incorrect data-reading process for fp16 ONNX models.
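For background on item 2: the description does not say what exactly was wrong in the read path, so the following is only an illustrative sketch of how fp16 tensor data is laid out in an ONNX model, not the code changed in this PR. In ONNX, FLOAT16 values are stored either as little-endian bytes in `raw_data`, or as uint16 bit patterns placed one per element in `int32_data`; reading either field as plain integers or as fp32 gives wrong weights. The helper names below are hypothetical and only NumPy is assumed.

```python
import numpy as np


def fp16_from_int32_data(int32_values):
    """Hypothetical helper: each entry holds one uint16 bit pattern.

    The bits are reinterpreted (not numerically converted) as float16.
    """
    bits = np.asarray(int32_values, dtype=np.uint16)
    return bits.view(np.float16)


def fp16_from_raw_data(raw_bytes):
    """Hypothetical helper: raw_data is a little-endian byte string."""
    return np.frombuffer(raw_bytes, dtype="<f2")


# IEEE 754 half-precision bit patterns: 0x3C00 = 1.0, 0xC000 = -2.0, 0x3800 = 0.5
from_ints = fp16_from_int32_data([0x3C00, 0xC000, 0x3800])

# The same three values as little-endian bytes, two bytes per element.
from_raw = fp16_from_raw_data(bytes([0x00, 0x3C, 0x00, 0xC0, 0x00, 0x38]))

print(from_ints)  # decoded float16 values
print(from_raw)
```

Both paths should decode to the same values; a mismatch between them is a quick sign that one field is being interpreted with the wrong element type.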

The resnet50-fp16.onnx test passed. The NNFusion output is: -3.080564e-01 7.984395e-02 -1.190038e+00 -1.483669e+00 -5.135902e-01 3.682717e-01 -2.163917e+00 -8.705018e-01 -1.881244e+00 -1.607677e-01 .. (size = 64000, ends with 2.435706e-01);

The onnxruntime output is: [-0.3066 0.0791 -1.19 -1.487 -0.5127 0.371 -2.168 -0.874 -1.883 -0.1605] ... (size = 64000, ends with 0.2446)
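The two outputs can be compared numerically rather than by eye. The sketch below uses only the eleven values quoted in this description (the first ten elements and the last one), not the full 64000-element tensors. The `1e-2` absolute tolerance is an assumption chosen as a loose bound for fp16 accumulation error, not a threshold defined by NNFusion or onnxruntime.

```python
import numpy as np

# First ten values plus the final value, copied from the NNFusion output above.
nnfusion_out = np.array([
    -3.080564e-01, 7.984395e-02, -1.190038e+00, -1.483669e+00,
    -5.135902e-01, 3.682717e-01, -2.163917e+00, -8.705018e-01,
    -1.881244e+00, -1.607677e-01, 2.435706e-01,
])

# The corresponding values copied from the onnxruntime output above.
ort_out = np.array([
    -0.3066, 0.0791, -1.19, -1.487, -0.5127,
    0.371, -2.168, -0.874, -1.883, -0.1605, 0.2446,
])

# Assumed tolerance for fp16 results; adjust to the accuracy you need.
ATOL = 1e-2

abs_diff = np.abs(nnfusion_out - ort_out)
max_abs_diff = float(abs_diff.max())
outputs_match = bool(np.allclose(nnfusion_out, ort_out, rtol=0.0, atol=ATOL))

print(f"max abs diff: {max_abs_diff:.6f}, match within {ATOL}: {outputs_match}")
```

On these sampled values the largest gap is a few thousandths, which is in line with the limited precision of fp16.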