Degraded performance of Polygon_set_2 intersections in multithreaded environment

YanBos commented 5 months ago

Issue Details

Intersecting instances of Polygon_set_2 in an omp parallel for loop leads to degraded performance. Further, if the polygon sets are marked as shared to the omp parallel for loop the application throws. A minimal reproducible example is added below.

Source Code

#include <CGAL/Exact_predicates_exact_constructions_kernel.h>
#include <CGAL/Polygon_2.h>
#include <CGAL/Polygon_with_holes_2.h>
#include <CGAL/Polygon_set_2.h>
#include <CGAL/Real_timer.h>
#include <omp.h>

typedef CGAL::Exact_predicates_exact_constructions_kernel Kernel;
typedef Kernel::Point_2 Point_2;
typedef CGAL::Polygon_2<Kernel> Polygon_2;
typedef CGAL::Polygon_with_holes_2<Kernel> Polygon_with_holes_2;
typedef CGAL::Polygon_set_2<Kernel> Polygon_set_2;

Polygon_with_holes_2 constructPolygonWithSingleHole(double xOffset)
{
    std::vector<Point_2> boundaryPoints = { {0 + xOffset, 0}, {4 + xOffset, 0}, {4 + xOffset, 4}, {0 + xOffset, 4}};
    Polygon_2 outer_rect(boundaryPoints.begin(), boundaryPoints.end());

    std::vector<Point_2> holePoints = { {1 + xOffset, 1}, {1 + xOffset, 3}, {3 + xOffset, 3}, {3 + xOffset, 1}};
    Polygon_2 hole(holePoints.begin(), holePoints.end());

    Polygon_with_holes_2 polygonWithHole(outer_rect);
    polygonWithHole.add_hole(hole);

    return polygonWithHole;
}

Polygon_set_2 constructPolygonSet(double xOffset)
{
    auto first = constructPolygonWithSingleHole(0 + xOffset);
    auto second = constructPolygonWithSingleHole(6 + xOffset);
    Polygon_set_2 polygonSet(first);
    polygonSet.insert(second);

    return polygonSet;
}

double timeIntersection(const Polygon_set_2& first, const Polygon_set_2& second)
{
    CGAL::Real_timer time;
    time.start();
    Polygon_set_2 result;
    result.intersection(first, second);
    time.stop();

    return time.time();
}

int main()
{
    std::size_t rounds = 1e4;
    std::vector<double> times(rounds);

    Polygon_set_2 first = constructPolygonSet(0);
    Polygon_set_2 second = constructPolygonSet(1.5);

    //#pragma omp parallel for default(shared) // throws.
    #pragma omp parallel for default(none) shared(times) firstprivate(rounds, first, second)
    for (int i = 0; i < rounds; i++)
        times.at(i) = timeIntersection(first, second);

    double time = timeIntersection(first, second);
    double mean = std::accumulate(times.begin(), times.end(), 0.0)/(double)times.size();

    std::cout << "Parallel: " << mean << std::endl;
    std::cout << "Sequential: " << time << std::endl;
    std::cout << "Slowdown factor: " << mean/time << std::endl;

    return 0;
}

Results with 1e6 rounds:

Mac (10 cores)

Parallel: 0.000261888
Sequential: 4.41074e-05
Slowdown factor: 5.9375

Linux (64 cores)

Parallel: 0.000671427
Sequential: 0.000104904
Slowdown factor: 6.40039

Environment

Operating system (Windows/Mac/Linux, 32/64 bits): macOs Sonoma
Compiler: clang
Release or debug mode: Release
Specific flags used (if any):
CGAL version: 5.6.1
Boost version: 1.84.0
Other libraries versions if used (Eigen, TBB, etc.): openMp 5.0

lrineau commented 5 months ago

@YanBos Thanks for closing the issue. Sorry we could not help you in time.

Can you please let you know what was the solution?

YanBos commented 5 months ago

@lrineau no worries :) Changing the kernel to CGAL::Simple_cartesian<mpq_class> helped so that I could run my program in a reasonable amount of time. Maybe you have a pointer on why changing the Kernel to the above could have helped?

CGAL / cgal