gadomski / cpd

C++ implementation of the Coherent Point Drift point set registration algorithm.
http://www.gadom.ski/cpd
GNU General Public License v2.0

Difficulty Parallelizing with OpenMP #130

Closed by sudomakeinstall 6 years ago

sudomakeinstall commented 6 years ago

All--

Thanks so much to everyone who has worked on this library! I'm having some difficulty parallelizing the code, however, and was hoping someone might be able to point me in the right direction. Here's what I've done so far:

  1. I've compiled fgt with the following options:

    BUILD_SHARED_LIBS: ON
    CMAKE_CXX_FLAGS: -std=c++14 -fopenmp
    WITH_OPENMP: ON

  2. I've compiled cpd with the following options:

    BUILD_SHARED_LIBS: ON
    CMAKE_CXX_FLAGS: -std=c++14 -fopenmp
    WITH_FGT: ON

  3. I've compiled my own project with:

    CMAKE_CXX_FLAGS: -std=c++14 -fopenmp

  4. I've set OMP_NUM_THREADS=$(nproc) in my ~/.bashrc.

  5. I've checked (at the top of my test program) that 8 threads are visible to OpenMP:

    #pragma omp parallel
    {
        std::cout << omp_get_num_threads() << std::endl; // prints 8, 8 times
    }
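
Note that the check in step 5 only shows that OpenMP is active in the translation unit that contains the pragma; it says nothing about how fgt itself was built. As a minimal sketch of a slightly more explicit self-check, the helpers below report whether the current file was compiled with OpenMP and how many threads actually enter a parallel region. The names `openmp_enabled` and `count_parallel_threads` are hypothetical and are not part of cpd or fgt; `_OPENMP` is the macro that OpenMP-conforming compilers define when OpenMP is enabled (for example with `-fopenmp` in g++).

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: true only if THIS translation unit was compiled
// with OpenMP enabled (the compiler defines _OPENMP in that case).
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: counts the threads that actually execute a
// parallel region. Without OpenMP the pragma is ignored and the
// function returns 1.
int count_parallel_threads() {
    int count = 0;
#pragma omp parallel reduction(+ : count)
    count += 1;
    return count;
}
```

Calling both from `main` and printing the results gives a quick sanity check for your own project's flags. A stale or non-OpenMP build of a dependency would still go undetected, so this complements rather than replaces checking how fgt was configured.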

Despite these steps, my test program still appears to use only one core at a time:

[Screenshot: cpu-use]

Any ideas what I might be doing wrong?

Best, and thanks,

--Davis

P.S. I'm using g++ version 7.2.0 on Ubuntu 16.04.

gadomski commented 6 years ago

cpd doesn't have any OpenMP support itself; all of the OpenMP code is in fgt. Are you sure your test program is using GaussTransformFgt and not GaussTransform?

If you confirm that you are using GaussTransformFgt and are still seeing incorrect behavior, the problem is most likely in the fgt library and not cpd. Let me know what you find.
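
One rough way to tell whether a call is really running on several cores, independent of a CPU monitor, is to compare process CPU time with wall-clock time. The sketch below is generic and not taken from cpd or fgt: `busy_work` is a hypothetical stand-in for the registration call, and `cpu_to_wall_ratio` is a hypothetical helper. It assumes a POSIX-like platform such as Linux, where `std::clock` reports CPU time summed over all threads of the process, so a ratio near 1 suggests a single busy core and a clearly larger ratio suggests several.

```cpp
#include <chrono>
#include <cmath>
#include <ctime>

// Hypothetical stand-in workload; replace with the code under test.
// The loop is parallelized only when compiled with OpenMP enabled.
double busy_work(long long n) {
    double sum = 0.0;
#pragma omp parallel for reduction(+ : sum)
    for (long long i = 0; i < n; ++i) {
        sum += std::sqrt(static_cast<double>(i));
    }
    return sum;
}

// Hypothetical helper: runs `work` once and returns
// (process CPU seconds) / (wall-clock seconds).
template <typename F>
double cpu_to_wall_ratio(F&& work) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();
    work();
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    const double cpu_seconds =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    const double wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    return wall_seconds > 0.0 ? cpu_seconds / wall_seconds : 0.0;
}
```

For example, `cpu_to_wall_ratio([&] { result = busy_work(50000000); })` times one call. The ratio is only a coarse indicator, since very short runs and other load on the machine both distort it.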

(and sorry about the delay, thanks for your patience!)

sudomakeinstall commented 6 years ago

@gadomski Thanks very much for the response. It looks like I must have missed one of my own steps, because after I deleted, rebuilt, and reinstalled all the packages (FGT, CPD, and my own), it parallelizes as expected. Sorry for the trouble, and thanks again for your work putting this together!

gadomski commented 6 years ago

No worries, thank you!