panda-lab / ncnn-vs2013

Face landmark detection based on ncnn
https://github.com/Tencent/ncnn

After porting the model and code to Android, the results are wrong and the landmark points are all scrambled. Why? #2

Open 82157402 opened 7 years ago

panda-lab commented 7 years ago

Could you post your inference code?

82157402 commented 7 years ago

Ugh, it is still wrong. The code is the same as yours, and so is the image. I also modified the normalization function.

using namespace std;
using namespace cv;

ncnn::Net squeezenet;
float mean_vals[3] = { 158.f, 158.f, 158.f };

int main()
{
// Initialize the model and the class labels
squeezenet.load_param("/sdcard/landmark.param");
squeezenet.load_model("/sdcard/landmark.bin");

// Load the test image
const char* imagepath = "/sdcard/3.jpg";
cv::Mat img = cv::imread(imagepath, CV_LOAD_IMAGE_COLOR);

cv::Mat img3;
cvtColor(img, img3, CV_RGB2GRAY);

cv::Mat img2;
img.convertTo(img2, CV_32FC1);

cv::Mat tmp_m, tmp_sd;
float m = 0, sd = 0;
cv::meanStdDev(img2, tmp_m, tmp_sd);
m = tmp_m.at<double>(0, 0);
sd = tmp_sd.at<double>(0, 0);

ncnn::Mat in = ncnn::Mat::from_pixels_resize(img3.data, ncnn::Mat::PIXEL_GRAY, img3.cols, img3.rows, 60, 60);
mean_vals[0] = m;
in.substract_mean_normalize(mean_vals, 0,sd,1);

ncnn::Extractor ex = squeezenet.create_extractor();
ex.set_light_mode(true);

ex.input("data", in);
ncnn::Mat out;

clock_t start, finish;
start = clock();
ex.extract("Dense3", out);
finish = clock();
double totaltime;
totaltime = (double)(finish - start) / CLOCKS_PER_SEC;
printf("run time: %f\n", totaltime);

std::vector<float> feat;

for (int i = 0; i < out.c; i++)
{
    const float* prob = out.data + out.cstep * i;
    feat.push_back(prob[0]);

}
for (int i = 0; i < out.c / 2; i++)
{
    Point x = Point(int(feat[2 * i] * img.rows), int(feat[2 * i + 1] * img.cols));
    cv::circle(img, x, 0.1, Scalar(0, 0, 255), 4, 8, 0);
}

//imshow("m", img);
imwrite("/sdcard/result.jpg", img);

return 0;

}
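One detail in the drawing loop above is worth checking independently of the porting problem: the first value of each pair is scaled by `img.rows` and the second by `img.cols`. The following is a minimal sketch of the more conventional mapping, assuming the network emits interleaved (x, y) pairs normalized to [0, 1]. The thread does not confirm that layout, and `PixelPoint` and `to_pixel_coords` are hypothetical names that belong to neither ncnn nor OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type; not part of ncnn or OpenCV.
struct PixelPoint {
    int x;
    int y;
};

// Assumption: `feat` holds interleaved (x, y) pairs normalized to [0, 1].
// x is scaled by the image width (cols), y by the image height (rows).
std::vector<PixelPoint> to_pixel_coords(const std::vector<float>& feat,
                                        int width, int height)
{
    assert(feat.size() % 2 == 0);  // one x and one y per landmark
    std::vector<PixelPoint> pts;
    pts.reserve(feat.size() / 2);
    for (std::size_t i = 0; i + 1 < feat.size(); i += 2) {
        PixelPoint p;
        p.x = static_cast<int>(feat[i] * width);
        p.y = static_cast<int>(feat[i + 1] * height);
        pts.push_back(p);
    }
    return pts;
}
```

For a square test image the two scalings give the same result, so this would not by itself explain a difference between PC and Android.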

panda-lab commented 7 years ago

The code looks the same. Someone ported it successfully yesterday; I'll ask them.

boundles commented 7 years ago

How did this turn out? I also get wrong results as soon as I port it to ARM, while everything works fine on PC.

boundles commented 7 years ago

I tracked down the problem: there is a bug in the data normalization in mat.
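One way to localize this kind of PC-versus-ARM discrepancy is to dump the same buffer (for example, the input right after normalization) on both platforms and compare the two dumps element by element. The sketch below assumes both dumps have already been loaded into `std::vector<float>`. `DiffReport` and `compare_buffers` are hypothetical names, not part of ncnn, and the thread does not say this is how the bug was actually found.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: result of comparing two float buffers.
struct DiffReport {
    float max_abs_diff;  // largest |a[i] - b[i]| found
    std::size_t index;   // position of that largest difference
};

// Compares two equally sized buffers, e.g. one dumped on PC and one on ARM.
DiffReport compare_buffers(const std::vector<float>& a,
                           const std::vector<float>& b)
{
    assert(a.size() == b.size());
    DiffReport r = {0.f, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = std::fabs(a[i] - b[i]);
        if (d > r.max_abs_diff) {
            r.max_abs_diff = d;
            r.index = i;
        }
    }
    return r;
}
```

If the difference is already large right after preprocessing, the network layers themselves can be ruled out and attention can go to the normalization code path.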

panda-lab commented 7 years ago

@boundles Thanks. If it's convenient, could you tell me what you changed?

boundles commented 7 years ago

float32x4_t _mean = vdupq_n_f32(mean);
for (; nn>0; nn--)
{
    float32x4_t _ptr = vld1q_f32(ptr);
    _ptr = vsubq_f32(_ptr, _mean);
    vst1q_f32(ptr, _ptr);
    ptr += 4;
}

Here, _ptr also needs to be divided by std.
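The fix described in this comment can be sketched as a stand-alone function that computes (x - mean) / std on both code paths. This is not the actual patch to the repository: `normalize_in_place` is a hypothetical name, and multiplying by the reciprocal of the standard deviation is an assumption made here because the `vdivq_f32` intrinsic is only available on AArch64, whereas `vmulq_f32` works on 32-bit ARM as well.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical stand-alone helper, not ncnn's own function:
// normalizes `size` floats in place as (x - mean) / stddev.
void normalize_in_place(float* ptr, std::size_t size, float mean, float stddev)
{
    assert(stddev != 0.f);
    const float inv_std = 1.f / stddev;
    std::size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t _mean = vdupq_n_f32(mean);
    const float32x4_t _inv_std = vdupq_n_f32(inv_std);
    for (; i + 4 <= size; i += 4) {
        float32x4_t _p = vld1q_f32(ptr + i);
        _p = vsubq_f32(_p, _mean);
        _p = vmulq_f32(_p, _inv_std);  // the scaling step missing from the quoted loop
        vst1q_f32(ptr + i, _p);
    }
#endif
    // Scalar path: handles the tail on ARM and the whole buffer elsewhere.
    for (; i < size; ++i) {
        ptr[i] = (ptr[i] - mean) * inv_std;
    }
}
```

Because the scalar and NEON paths apply the same arithmetic, a PC build and an ARM build should agree up to floating-point rounding. Multiplying by a reciprocal can differ from a true division in the last bit, which is normally negligible for landmark regression.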

panda-lab commented 7 years ago

@boundles I'll fix it. I never updated the corresponding ARM code; that one is my fault.