anvaka / ngraph.native

C++ implementation of force-based layout from ngraph
MIT License

Use OpenMP parallel-for during layout step #2

gyscos closed 9 years ago

gyscos commented 9 years ago

On an Intel i7-4790K (4 physical cores) with a test set of ~1M objects, I get a ~3.5x speedup. On a dual E5-2643 (8 physical cores), I get a ~5x speedup.
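For readers unfamiliar with the technique named in the title, here is a minimal sketch of an OpenMP parallel-for over a per-body force loop. It is not the code from this pull request: `Vec2` and `updateBodyForces` are hypothetical names, and a brute-force O(n^2) repulsion stands in for the quad-tree traversal the library uses. The point is only that each iteration writes to its own output slot, so the loop can be split across threads without locks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D vector type; not the actual ngraph.native data structure.
struct Vec2 {
    double x;
    double y;
};

// Brute-force pairwise repulsion (an assumption for illustration; the real
// layout walks a quad tree instead). Iteration i reads all positions but
// writes only force[i], so iterations are independent of each other.
void updateBodyForces(const std::vector<Vec2>& pos, std::vector<Vec2>& force) {
    assert(pos.size() == force.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(pos.size());

    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        double fx = 0.0;
        double fy = 0.0;
        for (std::ptrdiff_t j = 0; j < n; ++j) {
            if (i == j) continue;
            const double dx = pos[i].x - pos[j].x;
            const double dy = pos[i].y - pos[j].y;
            const double d2 = dx * dx + dy * dy + 1e-9;  // avoid division by zero
            fx += dx / d2;
            fy += dy / d2;
        }
        force[i].x = fx;
        force[i].y = fy;
    }
}
```

With GCC or Clang the pragma takes effect when compiling with `-fopenmp`; without that flag it is ignored and the loop runs serially with the same result.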

There is still a significant single-threaded portion; I suspect it is the insertion into the quad tree. I haven't checked, but I doubt the quad tree is safe for concurrent insertion...

EDIT: Running gprof on a sample run confirms it: 10% of the time is spent in QuadTree::insert (82% is spent in QuadTree::updateBodyForce, and 4% in Layout::updateSpringForce).
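To get a rough feel for what a 10% serial portion implies, Amdahl's law gives an upper bound on the achievable speedup. The sketch below assumes the gprof percentages come from a single-threaded run and that QuadTree::insert is the only part left serial; the thread states neither. `amdahlSpeedup` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>

// Amdahl's law: upper bound on speedup when a fraction `serial` of the
// single-threaded run time cannot be parallelized across `threads` threads.
double amdahlSpeedup(double serial, int threads) {
    assert(serial >= 0.0 && serial <= 1.0 && threads >= 1);
    return 1.0 / (serial + (1.0 - serial) / threads);
}
```

Under those assumptions, `amdahlSpeedup(0.10, 4)` is about 3.08 and `amdahlSpeedup(0.10, 8)` is about 4.71, and the bound can never exceed 10x regardless of thread count. The measured figures above need not match exactly, since the model ignores hyper-threading and the profile may come from a different run than the timings.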

anvaka commented 9 years ago

Thank you so much for this, Alexandre! It just blew my mind how easy it was to parallelize in C++.

Impressive.