Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Results differ on every inference with a fixed input #3363

Open dagongji10 opened 2 years ago

dagongji10 commented 2 years ago

The problem

  1. A tensorflow model was converted with the tf2onnx tool to db-sim.onnx.txt; inference on it gives correct results.
  2. Built ncnn and converted xxx.onnx with the onnx2ncnn tool to db-sim.param.txtxxx.bin; used ncnn2mem to strip the visible strings, producing db-sim.id.h.txt.
  3. Ran inference with ncnn in Android Studio; the code is below (input (640,640,3), output (640,640,1)):
    
    #include <android/asset_manager_jni.h>
    #include <android/bitmap.h>
    #include <android/log.h>
    #include <jni.h>
    #include <string>

    // ncnn
    #include "ncnn/arm64-v8a/include/ncnn/net.h"
    #include "ncnn/arm64-v8a/include/ncnn/benchmark.h"
    
    #include "db-sim.id.h"
    
    static ncnn::Net db;
    static ncnn::UnlockedPoolAllocator g_blob_pool_allocator;
    static ncnn::PoolAllocator g_workspace_pool_allocator;
    
    extern "C" JNIEXPORT jboolean JNICALL
    Java_com_example_db_1ncnn_1demo_DbNcnn_Init(JNIEnv *env, jobject thiz, jobject asset_manager) {
        ncnn::Option opt;
        opt.lightmode = true;
        opt.num_threads = 1;
        opt.blob_allocator = &g_blob_pool_allocator;
        opt.workspace_allocator = &g_workspace_pool_allocator;
    
        AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);
        db.opt = opt;
    
        // init param
        {
            int ret = db.load_param_bin(mgr, "db-sim.param.bin");
            if (ret != 0) {
                __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "load_param_bin failed");
                return JNI_FALSE;
            }
        }
    
        // init bin
        {
            int ret = db.load_model(mgr, "db-sim.bin");
            if (ret != 0) {
                __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "load_model failed");
                return JNI_FALSE;
            }
        }
    
        return JNI_TRUE;
    }
    
    extern "C" JNIEXPORT jboolean JNICALL
    Java_com_example_db_1ncnn_1demo_DbNcnn_Detect(JNIEnv *env, jobject thiz, jobject bitmap, jboolean use_gpu) {
        AndroidBitmapInfo info;
        AndroidBitmap_getInfo(env, bitmap, &info);
        int width = info.width;
        int height = info.height;
    
        // ncnn from bitmap
        ncnn::Mat in = ncnn::Mat::from_android_bitmap_resize(env, bitmap, ncnn::Mat::PIXEL_BGR, 640, 640);
        const float mean_vals[3] = {103.939f, 116.779f, 123.6f};
        const float norm_vals[3] = {1.0f, 1.0f, 1.0f};
        in.substract_mean_normalize(mean_vals, norm_vals);
    
        __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "in size: (%d, %d, %d), total: %d", in.c, in.h, in.w, (int)in.total());
        for (int c = 0; c < in.c; c++) {
            for (int i = 639; i < in.h; i++) {
                for (int j = 639; j < in.w; j++) {
                    float t = in.channel(c).row(i)[j];
                    __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "value: (%d, %d, %d): %f", c, i, j, t);
                }
            }
        }
    
        ncnn::Extractor ex = db.create_extractor();
        ex.set_light_mode(true);
        ex.input(db_sim_param_id::BLOB_dbnet_input, in);
        ncnn::Mat out;
        ex.extract(db_sim_param_id::BLOB_dbnet_output, out);
    
        __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "out size: (%d, %d, %d)", out.c, out.h, out.w);
        for (int c = 639; c < out.c; c++) {
            for (int i = 630; i < out.h; i++) {
                for (int j = 0; j < out.w; j++) {
                    float t = out.channel(c).row(i)[j];
                    __android_log_print(ANDROID_LOG_DEBUG, "DB_NCNN_JNI", "out value: (%d, %d, %d): %f", c, i, j, t);
                }
            }
        }
    
        db.clear();
        return JNI_TRUE;
    }


  4. The program runs correctly, but printing a portion of the output matrix after each inference shows the values differ on every run:
![1](https://user-images.githubusercontent.com/15357846/142407830-6732d73a-73d2-4ee2-9830-45e1188c1770.png)
![2](https://user-images.githubusercontent.com/15357846/142407872-86a87ac6-247a-4259-9bb8-97666bbc6ed1.png)
![3](https://user-images.githubusercontent.com/15357846/142407884-993df278-0bc4-4dd0-b331-ed6317b58902.png)

  5. I swapped in the model from [ncnn-android-mobilenetssd](https://github.com/nihui/ncnn-android-mobilenetssd) and printed part of its output matrix the same way; those values stay the same across runs (i.e. correct). The onnx model is correct, so the error should be in the ncnn model. Could you help me find what went wrong?

Thanks!
wwdok commented 2 years ago

@dagongji10 I seem to have hit the same problem as you: https://github.com/Tencent/ncnn/issues/3682. Did you make any progress afterwards?

dagongji10 commented 2 years ago

> @dagongji10 I seem to have hit the same problem as you: #3682. Did you make any progress afterwards?

I have solved this problem by changing the way I extract data from the output mat: I had mistaken the storage format of ncnn::Mat.

for (int i = 0; i < out.h; i++) {
    for (int j = 0; j < out.w; j++) {
        float t = out.channel(0).row(i)[j];
    }
}
nihui commented 3 months ago

In view of the various problems with onnx model conversion, it is recommended to use the latest pnnx tool to convert your model to ncnn:

    pip install pnnx
    pnnx model.onnx inputshape=[1,3,224,224]

Detailed reference documentation: https://github.com/pnnx/pnnx and https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx#how-to-use-pnnx