Open e-kayrakli opened 6 years ago
@ben-albrecht and @mppf: here is a list of things that I have been thinking about in terms of improving the NPBs in Chapel. I am sure it is not complete and overlooks a bunch of issues that were not immediately clear to me. Also, there may be some additional implementations I was unable to find with some naive grepping.
@bradcray: Ben told me that you had some thoughts/concerns regarding NPBs; I am wondering what you think.
All: Feel free to modify the original post as you see appropriate.
Here's a rundown of my reluctances, most of which relate to the notion of having this be a GSoC project rather than the goal of having a "blessed" version of NPB benchmarks (which I'm supportive of):
So that leaves just FT as being an obvious place to spend some time at present for me. In the conversation where we were talking about this as a possible GSoC project, my proposal was that I thought getting a full suite of Intel PRK benchmarks seemed like it might be more satisfying as a project and more compelling/important to study.
All sounds right to me... My urge was to have a more or less complete suite of well-accepted benchmarks in the HPC community, especially given that the PRKs are still not that common and are arguably much smaller and more abstract in terms of the problems they address (e.g., DGEMM is also in every library) compared to the NPBs. But I understand your concerns regarding its fit for GSoC.
I think I can chip away at some of the tasks in the list every now and then.
Here is a meta-issue to track progress on the implementations of NAS Parallel Benchmarks in Chapel.
General Tasks
[ ] Create a central location in the test suite with all relatively acceptable implementations.
There is already `test/npb`, but it contains uncompilable/unoptimized codes. This folder can be reorganized. Or, once acceptable implementations reach a critical mass, they can go in `test/release/examples/benchmarks`.
[ ] Create a uniform coding/running style across the benchmarks (at least within `test/npb`, or wherever)
Chapel does not yet have a style guide (#7417), but I believe there are some unwritten rules. At least the coding style in the modules should be followed.
[ ] Validate implementations with respect to the specifications and the reference MPI implementations.
[ ] Check that all accepted benchmarks run with all problem sizes.
[ ] Create separate GitHub issues for specific problems.
Tasks for Specific Benchmarks
EP - Embarrassingly Parallel
`test/npb/ep/`
Seemingly one of the solid ones. Nothing specific as of yet, besides anything that needs to be done for any of the general tasks above.
FT - Fast Fourier Transform
`test/npb/ft/`
MG - Multigrid
`test/users/npadmana/npb-mg/`
[ ] An elegant version without `localAccess` calls. (I highly doubt `localAccess` is a long-term solution to whatever bug/unimplemented feature it circumvents.)
`test/npb/mg/`
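For context, here is a minimal sketch of the `localAccess` idiom on a Block-distributed array. The loop and names are purely illustrative (not the actual MG kernel); the point is that `localAccess` asserts the element is on the current locale to skip wide-pointer/locality overhead, which is the workaround the npb-mg version leans on:

```chapel
use BlockDist;

// Illustrative stencil-free loop, not the MG benchmark itself.
config const n = 32;
const Space = {1..n};
const D = Space dmapped Block(boundingBox=Space);
var A, B: [D] real;

forall i in D {
  // Plain indexing: each access may pay for a locality check and,
  // in general, potential communication.
  A[i] = B[i] + 1.0;
}

forall i in D {
  // localAccess asserts the element lives on the current locale,
  // skipping that overhead -- ideally the compiler/optimizer would
  // make this annotation unnecessary.
  A.localAccess[i] = B.localAccess[i] + 1.0;
}
```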
This directory is `NOTEST`ed and not compilable (despite the filenames). These sketches can be leveraged in deriving an elegant implementation.
CG - Conjugate Gradient
`test/npb/cg/bradc`
Contains a number of different implementations by Brad. README gives a nice rundown.
[ ] Sort out the versions to see if the language is mature enough to run one of the more elegant implementations efficiently.
[ ] Buffered sparse domain creation.
I had noticed in the past that initialization slows down significantly when indices are added in an unoptimized, one-at-a-time fashion. Ideally this can be solved in the sparse domain implementation (this can be a separate issue).
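As a sketch of the buffering idea (the domain and array names here are illustrative, not the benchmark's actual data structures), indices can be collected into an array and added in one shot with the sparse domain's `bulkAdd` method rather than individually:

```chapel
config const n = 1000;
const D = {1..n, 1..n};

// Slow path: adding indices one at a time forces repeated
// internal bookkeeping per insertion.
var SpsDom: sparse subdomain(D);
for i in 1..n do
  SpsDom += (i, i);

// Buffered path: gather the indices first, then add them in bulk.
var inds: [1..n] 2*int;
forall i in 1..n do
  inds[i] = (i, i);
var SpsDomBuffered: sparse subdomain(D);
SpsDomBuffered.bulkAdd(inds);
```

`bulkAdd` can also be told the buffer is pre-sorted and duplicate-free, which lets it skip further work; whether that helps here depends on how the CG index set is generated.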
IS - Integer Sort
`test/npb/is`
[ ] Investigate the two versions implemented here to develop performant version(s)
[ ] Distributed implementation
BT - Block Tridiagonal Solver
SP - Scalar Pentadiagonal Solver
LU - Lower-Upper Gauss-Seidel Solver
References
The ultimate hub for NPB resources:
https://www.nas.nasa.gov/publications/npb.html