alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

I have a model that outputs NaN when run on the CPU, but produces normal output on OpenCL. #998

Closed TCBocean closed 3 years ago

TCBocean commented 4 years ago

I compiled the converter and the inference engine on Ubuntu, using the same build steps as the example on Yuque. I tried both the 1.0.1 and 1.0.0 releases from GitHub, and both hit the same problem with this model.

The model was converted from tflite (also, when I convert the pb model to MNN, reading the output values crashes after copying the output). I have shared the model on BaiduYun. Link: https://pan.baidu.com/s/1n2z6fEZnxZS7AbAfYI4xIQ Extraction code: wrkw

The model input is 256x256x3; there are three outputs, each of size 16x16x8x4.

The input code is roughly as follows:
auto inputTensor = new MNN::Tensor(input, MNN::Tensor::TENSORFLOW); // host tensor in NHWC layout
float* input_ptr = inputTensor->host<float>();
a_component->get_low_im(input_ptr); // fill the host buffer with the image data
input->copyFromHostTensor(inputTensor);

The output code is roughly as follows:

auto output1 = interpreter->getSessionOutput(session, "Identity");
auto output2 = interpreter->getSessionOutput(session, "Identity_1");
auto output3 = interpreter->getSessionOutput(session, "Identity_2");
// Copy each device-side output into a host tensor in NHWC layout.
auto copy_output1 = new MNN::Tensor(output1, MNN::Tensor::TENSORFLOW);
output1->copyToHostTensor(copy_output1);
auto copy_output2 = new MNN::Tensor(output2, MNN::Tensor::TENSORFLOW);
output2->copyToHostTensor(copy_output2);
auto copy_output3 = new MNN::Tensor(output3, MNN::Tensor::TENSORFLOW);
output3->copyToHostTensor(copy_output3);
LOGE("output1: %f", copy_output1->host<float>()[0]);
LOGE("output2: %f", copy_output2->host<float>()[0]);
LOGE("output3: %f", copy_output3->host<float>()[0]);

With the CPU backend, the three LOGE calls in the output code above all print NaN; with OpenCL, the printed values match expectations. Can anyone help me?

TCBocean commented 4 years ago

Link: https://pan.baidu.com/s/1UbGEdxrxiVnDZAtB06RAIg Extraction code: 1zr1. I have another model with a similar problem. I suspect there is a problem in the final branch fusion of the model, but I don't know how to track it down. Can anyone help me?

TCBocean commented 4 years ago

By the way, my Android inference library produced the correct OpenCL output after this modification: https://github.com/alibaba/MNN/issues/1003#issuecomment-663928745

123liluky commented 4 years ago

Does your converted MNN model have 3 outputs? I have a problem right now: my ONNX model has 3 outputs, but after converting to MNN only 1 output remains. Do you know how to fix this?

TCBocean commented 4 years ago

@123liluky Sorry, I haven't run into that.

lucasjinreal commented 1 year ago

Same question here.