patrikhuber / superviseddescent

C++11 implementation of the supervised descent optimisation method
http://patrikhuber.github.io/superviseddescent/
Apache License 2.0

Parallelise CalculateHogDescriptor #40

Closed cpfe532 closed 7 years ago

cpfe532 commented 7 years ago

I am referring to this issue about speeding up the process: https://github.com/patrikhuber/superviseddescent/issues/31

One point raised there is that the HOG feature extraction can be parallelised. I made a simple time measurement of that function, which took about 60 ms for 4 regressions. I am thinking of improving it using GCD on iOS:

__block std::vector<int> hogDescriptorsIdx;
dispatch_apply(LandmarkIndexs.size(), c_queue, ^(size_t k) {
    int i = (int)k;
    ....
    hogDescriptor = hogDescriptor.t(); // now a row-vector
    dispatch_sync(s_queue, ^{
        hogDescriptors.push_back(hogDescriptor);
        hogDescriptorsIdx.push_back(i);
    });
});

cv::Mat sortedHog;
for( int i = 0; i < hogDescriptorsIdx.size(); i++) {
    sortedHog.push_back(hogDescriptors.row(hogDescriptorsIdx.at(i)));
}
hogDescriptors = sortedHog.reshape(0, sortedHog.cols * sortedHog.rows).t();

I only parallelised the outer loop, as I think the loop body is thread-safe. However, the result is wrong and I don't know where the problem is. May I have any advice?

patrikhuber commented 7 years ago

Maybe there's an issue with sharing the cv::Mat amongst threads - you could try using a separate one for each thread and then concatenate them at the end? Otherwise I would analyse what exactly is wrong in the result.
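
A minimal sketch of the "separate one per thread, then concatenate" idea, written in portable C++11 with std::thread rather than GCD. Everything named here is an assumption for illustration: compute_descriptor is a hypothetical stand-in for the per-landmark HOG extraction, and plain std::vector<float> rows stand in for cv::Mat rows. Each worker writes only to its own pre-allocated slot, so no lock is needed and the output order is fixed by the landmark index rather than by which thread finishes first. As far as the snippet in the question shows, row j of hogDescriptors holds landmark hogDescriptorsIdx[j], so reading hogDescriptors.row(hogDescriptorsIdx.at(i)) in the reorder loop appears to apply that mapping in the wrong direction; writing into indexed slots avoids the reorder step altogether.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-landmark HOG extraction: it returns a
// short vector whose values simply equal the landmark index.
std::vector<float> compute_descriptor(int landmark_index) {
    return std::vector<float>(3, static_cast<float>(landmark_index));
}

// Computes one descriptor per landmark in parallel and returns them
// concatenated into a single row, in landmark order.
std::vector<float> extract_all(const std::vector<int>& landmark_indices) {
    // One pre-allocated slot per landmark: worker k writes only slots[k],
    // so the workers never touch the same element.
    std::vector<std::vector<float>> slots(landmark_indices.size());

    std::vector<std::thread> workers;
    workers.reserve(landmark_indices.size());
    for (std::size_t k = 0; k < landmark_indices.size(); ++k) {
        workers.emplace_back([&slots, &landmark_indices, k] {
            slots[k] = compute_descriptor(landmark_indices[k]);
        });
    }
    for (auto& w : workers) {
        w.join();
    }

    // Concatenate sequentially, after all workers have finished.
    std::vector<float> row;
    for (const auto& s : slots) {
        row.insert(row.end(), s.begin(), s.end());
    }
    return row;
}
```

With OpenCV, the slots could presumably be a std::vector<cv::Mat> joined with cv::hconcat at the end, and with GCD the dispatch_apply block would write slots[k] directly, which would make the inner dispatch_sync unnecessary.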

patrikhuber commented 7 years ago

Closing this as it's an issue with your code and not this library.