Open LeiWang1999 opened 2 years ago
Basically, the constants were generated from two sources.
I think you can debug from these two directions.
Thanks @mzmssg, I found the problem: https://github.com/microsoft/nnfusion/blob/c3cd7155303e3af2f56c9f9bf0778656c372bc14/src/nnfusion/frontend/onnx_import/util/util.cpp#L98
The importer treated fp16 data as fp32. This is now fixed by PR #443.
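To illustrate the class of bug (this is not the NNFusion importer code, only a minimal sketch using Python's standard `struct` module), here is what happens when a buffer of fp16 values is decoded as if it held fp32 values. The helper name `decode` and the sample values are made up for the illustration.

```python
import struct

# Four fp16 values, packed little-endian the way a raw tensor buffer might hold them.
values = [1.0, 0.5, -2.0, 0.25]
raw = struct.pack("<4e", *values)  # "e" = IEEE 754 half precision, 2 bytes each


def decode(raw_bytes, fmt_char):
    """Decode a raw little-endian buffer as a flat list of fmt_char elements."""
    size = struct.calcsize("<" + fmt_char)
    count = len(raw_bytes) // size
    return list(struct.unpack("<%d%s" % (count, fmt_char), raw_bytes[:count * size]))


as_fp16 = decode(raw, "e")  # correct interpretation: 4 elements
as_fp32 = decode(raw, "f")  # wrong interpretation: 2 elements, unrelated values

print(as_fp16)
print(as_fp32)
```

Decoding with the wrong element width halves the element count and produces values unrelated to the originals, so any later shape or size check against the real tensor will also disagree.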
resnet50-fp16.onnx now passes and its output is correct. That said, I think a further improvement would be to define a set of NNFusion data types, for example:
::nnfusion::datatype::FLOAT_64
::nnfusion::datatype::FLOAT_32
::nnfusion::datatype::FLOAT_16
Currently they are double, float, and half_float::half, which I don't think is very programmer-friendly.
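The idea of named data types can be sketched as follows. This is a Python illustration only, not a proposal for the actual C++ API: the `DataType` enum and `unpack_tensor` helper are hypothetical, and only the three member names come from the list above. The point is that a named tag carries the element size with it, so a buffer cannot silently be decoded at the wrong width.

```python
import struct
from enum import Enum


class DataType(Enum):
    """Hypothetical named element types; each value is a struct format character."""
    FLOAT_64 = "d"
    FLOAT_32 = "f"
    FLOAT_16 = "e"

    @property
    def size(self):
        """Size of one element in bytes."""
        return struct.calcsize("<" + self.value)


def unpack_tensor(raw_bytes, dtype):
    """Decode a raw little-endian buffer using the element type named by dtype."""
    if len(raw_bytes) % dtype.size != 0:
        raise ValueError("buffer length is not a multiple of %s size" % dtype.name)
    count = len(raw_bytes) // dtype.size
    return list(struct.unpack("<%d%s" % (count, dtype.value), raw_bytes))


raw = struct.pack("<3e", 1.0, 2.0, 3.0)  # 6 bytes of fp16 data
print(unpack_tensor(raw, DataType.FLOAT_16))
```

Passing `DataType.FLOAT_32` for the same 6-byte buffer raises `ValueError` instead of returning garbage, which is the kind of early failure a dedicated type tag makes possible.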
🐛 Incorrect output for resnet50-float16
I generated CUDA code from a resnet50-float16.onnx description, but running main_test gives output that differs from onnxruntime's output. From what I can tell, this is because the values stored in the Constantxx.bin files are all zero. Take Constant_0_0.bin (the first parameter loaded in the CUDA code) as an example.
For fp32 CUDA code generation, the value of Constant_0_0.bin is:
For fp16 CUDA code generation, the value of Constant_0_0.bin is:
I traced the program, but it is hard for me to find the code that generates the values in the constant .bin files. (I can find the code that writes the file, which is in the CUDA codegen pass, but not the code that produces the constant's data.)
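A quick way to check a constant file like this is to read it as raw bytes and test whether every byte is zero. The sketch below assumes the .bin file is a headerless little-endian dump of the tensor elements, which is an assumption about the file layout and not something confirmed here. The helper name `summarize_constant` and the stand-in files are hypothetical.

```python
import os
import struct
import tempfile


def summarize_constant(path, fmt_char, preview=8):
    """Return (element_count, all_zero, first_values) for a raw little-endian dump."""
    with open(path, "rb") as f:
        raw = f.read()
    size = struct.calcsize("<" + fmt_char)
    count = len(raw) // size
    values = struct.unpack("<%d%s" % (count, fmt_char), raw[:count * size])
    all_zero = not any(raw)  # True when every byte is 0x00
    return count, all_zero, list(values[:preview])


# Demo on two stand-in files: one with real fp16 data, one zero-filled.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.bin")
    bad = os.path.join(tmp, "bad.bin")
    with open(good, "wb") as f:
        f.write(struct.pack("<4e", 0.5, -1.0, 2.0, 0.0))
    with open(bad, "wb") as f:
        f.write(bytes(8))  # 8 zero bytes
    good_summary = summarize_constant(good, "e")
    bad_summary = summarize_constant(bad, "e")

print(good_summary)
print(bad_summary)
```

For a real file, call `summarize_constant("Constant_0_0.bin", "e")` for the fp16 build and `summarize_constant("Constant_0_0.bin", "f")` for the fp32 build, then compare the two summaries.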
Any suggestions? How can I find where the constant data is generated? My guess is that it happens in one of the graph passes.