luoyetx / face-alignment-at-3000fps

C++ implementation of Face Alignment at 3000 FPS via Regressing Local Binary Features
BSD 3-Clause "New" or "Revised" License

Code speed issue #19

Open Hardold opened 7 years ago

Hardold commented 7 years ago

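I ran the 3000fps program on an i5 PC and it takes about 4 ms. After porting the code to an ARMv7 development board, it takes about 90 ms. Normally the board is only about 6 times slower than the PC (a speed ratio of roughly 1:6). What could be causing this?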

Hardold commented 7 years ago

Tracing through the code, I found that the following function is where most of the time goes:

    Mat LbfCascador::GlobalRegressionPredict(const Mat &lbf, int stage) {
        const Mat_<double> &weight = (Mat_<double>)gl_regression_weights[stage];
        Mat_<double> delta_shape(weight.rows / 2, 2);
        const double *w_ptr = NULL;
        const int *lbf_ptr = lbf.ptr<int>(0);  // indices of the active binary features

        //#pragma omp parallel for num_threads(2) private(w_ptr)
        for (int i = 0; i < delta_shape.rows; i++) {
            // x offset of landmark i: sum the weights selected by the active features
            w_ptr = weight.ptr<double>(2 * i);
            double y = 0;
            for (int j = 0; j < lbf.cols; j++) y += w_ptr[lbf_ptr[j]];
            delta_shape(i, 0) = y;

            // y offset of landmark i
            w_ptr = weight.ptr<double>(2 * i + 1);
            y = 0;
            for (int j = 0; j < lbf.cols; j++) y += w_ptr[lbf_ptr[j]];
            delta_shape(i, 1) = y;
        }
        return delta_shape;
    }

Any suggestions for improving this?
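One thing that might be worth trying: the inner loop reads `w_ptr[lbf_ptr[j]]`, so each output row gathers a few values scattered across a long weight row. On a board with smaller caches that access pattern could be costly, although this has not been measured here. The sketch below is an assumption-laden illustration, not the repository's code: it uses plain `std::vector` instead of `cv::Mat_<double>`, and the names `predict_row_major`, `transpose`, and `predict_transposed` are hypothetical. It shows the same sum computed from a transposed weight layout, where all outputs for one active feature sit next to each other in memory.

```cpp
#include <cstddef>
#include <vector>

// Same access pattern as the posted loop: `weight` holds `rows` rows of `dim`
// values (row-major). For every output row, gather the weights at the active
// feature indices and sum them.
std::vector<double> predict_row_major(const std::vector<double> &weight,
                                      std::size_t rows, std::size_t dim,
                                      const std::vector<int> &active) {
    std::vector<double> out(rows, 0.0);
    for (std::size_t r = 0; r < rows; ++r) {
        const double *w = weight.data() + r * dim;
        double y = 0.0;
        for (int idx : active) y += w[idx];  // scattered reads within one long row
        out[r] = y;
    }
    return out;
}

// One-time preprocessing (hypothetical): build the dim-by-rows transpose.
std::vector<double> transpose(const std::vector<double> &weight,
                              std::size_t rows, std::size_t dim) {
    std::vector<double> t(weight.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < dim; ++c)
            t[c * rows + r] = weight[r * dim + c];
    return t;
}

// Transposed layout: each active feature contributes one contiguous run of
// `rows` values, which is added to the whole output vector at once.
std::vector<double> predict_transposed(const std::vector<double> &weight_t,
                                       std::size_t rows,
                                       const std::vector<int> &active) {
    std::vector<double> out(rows, 0.0);
    for (int idx : active) {
        const double *w = weight_t.data() + static_cast<std::size_t>(idx) * rows;
        for (std::size_t r = 0; r < rows; ++r) out[r] += w[r];  // contiguous reads
    }
    return out;
}
```

Both functions add the same terms in the same order for each output, so they should return identical results. Whether the transposed version is actually faster on the ARMv7 board would need to be timed there.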