slow runtimes for interpolate/voronoi

lanl / LaGriT

Los Alamos Grid Toolbox (LaGriT) is a library of user callable tools that provide mesh generation, mesh optimization and dynamic mesh maintenance in two and three dimensions.

https://lanl.github.io/LaGriT/

Other

122 stars 49 forks source link

slow runtimes for interpolate/voronoi #136

Open daniellivingston opened 5 years ago

daniellivingston commented 5 years ago

Subroutine intrp_gtg for the Voronoi case runs at O(n^2).

@cwgable has found that the issue lies in the nearestpoint0 subroutine, which intrp_gtg calls for each element. nearestpoint0 allocates and deallocates, for each call, a real*8 array of length mtri:

 83:     length=mtri
 84:     call mmgetblk('distpossleaf',isubname,ipdistpossleaf,
 85:    $     length,2,icscode)

This explains the callgrind profiling, where we've seen mmgetblk dominate runtime for large nelem meshes:

voronoi_parameter_051 mlgi

daniellivingston commented 5 years ago

@cwgable modified the intrp_gtg subroutine to call a new subroutine, nearestpoint1, which does not allocate an array on call, but requires an array of length mtri to be allocated prior to calling.

These changes are on branch new-interpolate. Tertiary testing shows a dramatic speed improvement:

NP	Runtime (new interpolate) (s)	Runtime (old interpolate) (s)
11	0.05824398994445801	0.07340097427368164
21	0.08977389335632324	0.12838387489318848
31	0.1591808795928955	0.5874381065368652
41	0.3121931552886963	2.2899179458618164
51	0.545767068862915	7.575886964797974
61	0.8833439350128174	19.65384817123413
71	1.4143941402435303	46.490236043930054
81	2.0786960124969482	101.2328200340271
91	3.044241189956665	220.3300280570984

where number_of_elements = NP**3.

daniellivingston commented 5 years ago

I am seeing two other subroutines that call nearestpoint0, which may or may not need to be changed:

~/playground/LaGriT-new-interpolate/src new-interpolate*
❯ ag "call nearestpoint0"
dopmat.f
1308:           call nearestpoint0(xq,yq,zq,xs,ys,zs,linkt,sbox,eps,

upscale.f
904:             call nearestpoint0(xq,yq,zq,xs,ys,zs,linkt,sbox,eps,

daniellivingston commented 5 years ago

As of 7ecf4e9978afbcc7778db376171c204b3b25d111, nearestpoint0 has been fully deprecated in favor of nearestpoint1.

~/playground/LaGriT-new-interpolate new-interpolate*
❯ ag "call nearestpoint0" src/*.f*

~/playground/LaGriT-new-interpolate new-interpolate*
❯ ag "call nearestpoint1" src/*.f*
src/dopmat.f
1312:           call nearestpoint1(xq,yq,zq,xs,ys,zs,linkt,sbox,eps,

src/intrp_gtg.f
1292:           call nearestpoint1(xp,yp,zp,xs,ys,zs,linkt,sbox,eps,

src/upscale.f
911:             call nearestpoint1(xq,yq,zq,xs,ys,zs,linkt,sbox,eps,

daniellivingston commented 5 years ago

As of 6df48edc5ef319c8b9fbc6ca6a9ec75c5aaf9610:

nearestpoint0 was deleted and nearestpoint1 was renamed nearestpoint0
work arrays distpossleaf are released at the end of the do-loops enclosing call nearestpoint0.
level01 test suite passes