RangiLyu / nanodet

NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥
Apache License 2.0
5.7k stars 1.04k forks

ncnn推理时,出现Segmentation fault (core dumped)问题 #130

Open cqchenqianqc opened 3 years ago

cqchenqianqc commented 3 years ago

I traced the error to the following code:

```cpp
void NanoDet::decode_infer(ncnn::Mat& cls_pred, ncnn::Mat& dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>>& results)
{
    int feature_h = this->input_size / stride;
    int feature_w = this->input_size / stride;
    std::cout << "feature_h:" << feature_h << std::endl;
    std::cout << "cls_pred" << cls_pred << std::endl;
    std::cout << "decode_infer" << std::endl;

    //std::cout << "cls_pred" << cls_pred.row() << std::endl;
    //cv::Mat debug_heatmap = cv::Mat(feature_h, feature_w, CV_8UC3);
    for (int idx = 0; idx < feature_h * feature_w; idx++)
    {
        //std::cout << "w * h: " << feature_h * feature_w << std::endl;
        const float* scores = cls_pred.row(idx);
        int row = idx / feature_w;
        int col = idx % feature_w;
        float score = 0;
        int cur_label = 0;
        for (int label = 0; label < this->num_class; label++)
        {
            if (scores[label] > score)
            {
                score = scores[label];
                cur_label = label;
            }
        }
```
cqchenqianqc commented 3 years ago

Please help me!! Any suggestions would be appreciated.

RangiLyu commented 3 years ago

Are the number of classes and the input size in your nanodet.cpp consistent with the settings you used during training?

cqchenqianqc commented 3 years ago

> Are the number of classes and the input size in your nanodet.cpp consistent with the settings you used during training?

Yes, the number of classes and the input size in my nanodet.cpp are consistent with the training settings. I have now traced the crash to the following `if` statement inside `NanoDet::decode_infer` in nanodet.cpp. I print `score` on every iteration, but the failing `idx` is different on every run, and the error is always Segmentation fault (core dumped).

```cpp
const float* scores = cls_pred.row(idx);
if (scores[label] > score)
{
    score = scores[label];
    cur_label = label;
}
```

RangiLyu commented 3 years ago

This kind of error is usually an out-of-bounds access. Check whether the shape of the model's final output blob matches what you expect.

cqchenqianqc commented 3 years ago

OK, thank you!!

TD-wzw commented 3 years ago

I have the same problem as you. The author's downloadable ncnn weight file runs fine, but a weight file from my own training, or one I converted myself, does not.


TD-wzw commented 3 years ago

I have checked: the output shape of the file I converted myself is different from the one the author provides. Why is that?

hrishikeshps94 commented 3 years ago

I'm also having the same error. It works perfectly with the author's weight file when img_size is 320, but when img_size is changed the above-mentioned error occurs. For a different input shape, should we convert the weights to ncnn format with a specific img_size? Is it necessary to have a fixed input_size when the model is converted from pytorch-->onnx-->ncnn?

RangiLyu commented 3 years ago

> I'm also having the same error, it works perfectly with authors weight file when img_size is 320, but when the img_size is changed the above mentioned error occurs. For a different input shape should we convert the weight to ncnn format with specific img_size? Is it necessary to have fixed input_size when the model is converted from pytorch-->onnx-->ncnn?

The Interp layer's parameters in the ncnn model are set to fixed values when converting from onnx. If you want to use a dynamic input shape, you need to manually modify the .param file. You can refer to the nanodet model in the ncnn repo:
https://github.com/nihui/ncnn-assets/blob/master/models/nanodet_m.param
https://github.com/Tencent/ncnn/blob/master/examples/nanodet.cpp
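For a rough sketch of what that edit looks like (layer and blob names below are placeholders, and the parameter IDs — 0=resize_type, 1/2=height/width scale, 3/4=fixed output height/width — are my understanding of ncnn's Interp layer; verify against the linked nanodet_m.param), the converter emits a fixed-size form:

```
Interp  up_stride16  1 1  blob_in blob_out  0=2 3=20 4=20
```

and the dynamic form replaces the fixed output size with 2x scale factors so the layer follows the input:

```
Interp  up_stride16  1 1  blob_in blob_out  0=2 1=2.000000e+00 2=2.000000e+00
```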

Also, notice that the input shape must be divisible by 32.

cqchenqianqc commented 3 years ago

> I'm also having the same error, it works perfectly with authors weight file when img_size is 320, but when the img_size is changed the above mentioned error occurs. For a different input shape should we convert the weight to ncnn format with specific img_size? Is it necessary to have fixed input_size when the model is converted from pytorch-->onnx-->ncnn?

img_size is 608 in my experiment. `cls_pred.row(idx)` in `void NanoDet::decode_infer(ncnn::Mat cls_pred, ncnn::Mat dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>>& results)` may cause out-of-bounds errors. Iterating per channel:

```cpp
for (int q = 0; q < cls_pred.c; q++)
{
    const float* ptr = cls_pred.channel(q);
    for (int y = 0; y < cls_pred.h; y++)
    {
        std::cout << "cls_pred.c: " << cls_pred.c << " .h:" << cls_pred.h << " .w:" << cls_pred.w << "\n" << std::endl;
        for (int x = 0; x < cls_pred.w; x++)
        {
            printf("%f ", ptr[x]);
        }
        ptr += cls_pred.w;
        printf("\n");
    }
    printf("------------------------\n");
}
```

instead of `const float* scores = cls_pred.row(idx);` may solve the problem.

DuZzzs commented 3 years ago

In my case, it was because main.cpp sets the resize size to 320x320. Change it to your own model's size and it runs successfully: `resize_uniform(image, resized_img, cv::Size(input_w, input_h), effect_roi);`

robofisshy commented 3 years ago

In my case, I used config/nanodet-m.yml to train on the COCO dataset, with no modification to the model. But after the deploy pipeline (.pth -> .onnx -> sim.onnx -> ncnn.param, ncnn.bin -> ncnn-sim.param, ncnn-sim.bin), my output layers are different from the author's model. (screenshot of the modified heads_info) I got a similar fault as @cqchenqianqc; after I changed the heads_info (as in the image above), it works. You can check your .param, it may be the same problem @cqchenqianqc

cqchenqianqc commented 3 years ago

> In my case, I use the confg/nanodet-m.yml train the COCO dataset, no modification in model. But after the deploy pipeline, .pth->.onnx->sim.onnx->ncnn.param, ncnn.bin->ncnn-sim.param, ncnn-sim.bin, my output layers are differnt from author's model. I got similar fault as @cqchenqianqc , after I change the heads_info(like upon image), it works. You can check your .param, maybe the same problem @cqchenqianqc

Thanks, I'll try it.