QMCPACK / miniqmc

QMCPACK miniapp: a simplified real space QMC code for algorithm development, performance portability testing, and computer science experiments

Remove loop over blocks of splines? #147

Open markdewing opened 6 years ago

markdewing commented 6 years ago

The evaluation of splines is nested: there is an outer loop over blocks of splines and an inner loop over the splines in each block. I think this was done to experiment with distributing the spline evaluation across (shared-memory) processors. This feels like an implementation detail that should not be in the base miniapp, which should have a single loop over splines. A better approach is to create a spline interface and put the splitting of splines into blocks behind that interface as one implementation.
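To make the proposal concrete, here is a minimal sketch of what such an interface could look like. Every name in it (`SplineSet`, `FlatSplineSet`, `BlockedSplineSet`, `Cubic`) is hypothetical and is not taken from the miniQMC source. A 1D cubic polynomial stands in for a real 3D B-spline so that the example stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for one spline (assumption: a real code would hold 3D B-spline
// coefficients; a cubic polynomial keeps the sketch short).
struct Cubic {
  double c0, c1, c2, c3;
  double operator()(double x) const { return c0 + x * (c1 + x * (c2 + x * c3)); }
};

// Hypothetical interface: callers see only "evaluate all splines at x".
struct SplineSet {
  virtual ~SplineSet() = default;
  virtual std::size_t size() const = 0;
  virtual void evaluate(double x, std::vector<double>& out) const = 0;
};

// Base implementation: a single loop over all splines.
class FlatSplineSet : public SplineSet {
  std::vector<Cubic> splines_;

public:
  explicit FlatSplineSet(std::vector<Cubic> s) : splines_(std::move(s)) {}
  std::size_t size() const override { return splines_.size(); }
  void evaluate(double x, std::vector<double>& out) const override {
    out.resize(splines_.size());
    for (std::size_t i = 0; i < splines_.size(); ++i)
      out[i] = splines_[i](x);
  }
};

// Alternative implementation: the same splines stored in blocks, evaluated
// with an outer loop over blocks and an inner loop within each block.
class BlockedSplineSet : public SplineSet {
  std::vector<std::vector<Cubic>> blocks_;
  std::size_t n_;

public:
  BlockedSplineSet(const std::vector<Cubic>& s, std::size_t block_size) : n_(s.size()) {
    assert(block_size > 0);
    for (std::size_t first = 0; first < s.size(); first += block_size) {
      const std::size_t last = std::min(first + block_size, s.size());
      blocks_.emplace_back(s.begin() + static_cast<std::ptrdiff_t>(first),
                           s.begin() + static_cast<std::ptrdiff_t>(last));
    }
  }
  std::size_t size() const override { return n_; }
  std::size_t num_blocks() const { return blocks_.size(); }
  void evaluate(double x, std::vector<double>& out) const override {
    out.resize(n_);
    std::size_t offset = 0;
    for (const auto& block : blocks_) {  // outer loop: blocks
      for (std::size_t j = 0; j < block.size(); ++j)  // inner loop: splines in block
        out[offset + j] = block[j](x);
      offset += block.size();
    }
  }
};
```

With this shape, the driver code would hold a `SplineSet` and never see the block structure. Distributing blocks across threads would then be a change inside `BlockedSplineSet::evaluate` only.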

ye-luo commented 6 years ago

This is the advantage of miniQMC over QMCPACK. There are many uses of this feature for exploration.

markdewing commented 6 years ago

So the nested evaluation of splines is general enough to keep in the base miniapp?

ye-luo commented 6 years ago

I think this is general. In fact, when thinking about splitting the walkers, this feature provides another layer of parallelism. The QMCPACK CUDA code has a similar feature, although there only the computation, not the memory, is chunked.

lshulen commented 6 years ago

Regarding the loop over blocks, one nice thing about it is that it helps with cache pressure as described in the IPDPS paper (or maybe SC). It turns out to be a good feature even if you are not letting different processing elements handle different blocks.

The argument for the simple loop is that it will be easier for others to understand. The benefit of the loop over blocks is that it is well studied and generally turns out to be the best implementation on many platforms. I might suggest that the _ref version could have the simple loop that Mark suggests, while the standard implementation retains the loop over blocks. This does imply some coupling of the data types as well, but that could also be handled easily.