ParRes / Kernels

This is a set of simple programs that can be used to explore the features of a parallel platform.
https://groups.google.com/forum/#!forum/parallel-research-kernels

stencil blocking may have foobarred performance... #261

Open jeffhammond opened 7 years ago

jeffhammond commented 7 years ago

Need to investigate but the recent commits have shown a massive regression in some cases.

rfvander commented 7 years ago

That's strange. I used to have tiling for all three stencil implementations, in the good old days of SERIAL, MPI, and OpenMP. But I hardly ever saw a benefit, and it complicated the code, so I eliminated it for all but the serial implementation. In principle it should allow better reuse, but it takes a LARGE grid to see that happen. If performance drops precipitously because of it, there's a pathology (bug).

jeffhammond commented 7 years ago

The regression was because omitting the blocking argument meant that the measurements used star radius 2 instead of star radius 4. But we still have to deal with the fact that huge tile sizes lead to inadequate parallelism. We should branch on (grid_size/tile_size)^2 < num_threads and not bother tiling in that case.
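A minimal sketch of the proposed check, in Python for brevity. The function name `should_tile` is hypothetical and is not taken from the kernels. Rounding the per-dimension tile count up, and treating a non-positive tile size as "no tiling", are assumptions of this sketch, not something the thread specifies.

```python
def should_tile(grid_size: int, tile_size: int, num_threads: int) -> bool:
    """Return True only if tiling leaves at least one tile per thread.

    Hypothetical helper: mirrors the (grid_size/tile_size)^2 < num_threads
    test from the comment above, for a square 2D grid with square tiles.
    """
    if tile_size <= 0:
        # Assumption: a non-positive tile size means "do not tile".
        return False
    # Ceiling division, so a partial tile at the edge still counts as a tile
    # (an assumption; the thread does not say how to round).
    tiles_per_dim = -(-grid_size // tile_size)
    return tiles_per_dim * tiles_per_dim >= num_threads


# Tiles that are far too large for the thread count: skip tiling.
print(should_tile(1000, 500, 64))
# Plenty of tiles to go around: tiling is allowed.
print(should_tile(16000, 100, 64))
```

With a 1000-point grid and 500-point tiles there are only 4 tiles, so 64 threads could not all be kept busy and the sketch reports that tiling should be skipped.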

rfvander commented 7 years ago

Yes, we saw the same with transpose, as you may recall. But I wouldn't do anything automatic. Users should always be allowed to shoot themselves in the foot.

rfvander commented 7 years ago

But maybe we can warn them of the bullet holes.

rfvander commented 7 years ago

I meant to ask you if you ever get requests for box-shaped stencils (instead of star stencils). For the AMR code I effectively had to support that in MPI (too complicated to explain why, and not worth it), and it was actually very easy. I'd like to add that to our MPI variants.
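For readers unfamiliar with the distinction, here is a rough sketch of the two stencil shapes as lists of neighbor offsets in 2D. The function names are hypothetical, and leaving out the center point is an assumption of this sketch. It only illustrates the shapes and says nothing about the weights or about how the MPI variants would exchange the corner data a box stencil needs.

```python
def star_offsets(radius: int) -> list[tuple[int, int]]:
    """Neighbors along the two coordinate axes only (center excluded)."""
    offsets = []
    for k in range(1, radius + 1):
        offsets += [(k, 0), (-k, 0), (0, k), (0, -k)]
    return offsets


def box_offsets(radius: int) -> list[tuple[int, int]]:
    """Every neighbor in the (2*radius+1)^2 square (center excluded)."""
    return [
        (i, j)
        for i in range(-radius, radius + 1)
        for j in range(-radius, radius + 1)
        if (i, j) != (0, 0)
    ]


# Number of neighbors touched by each shape at radius 2.
print(len(star_offsets(2)), len(box_offsets(2)))
```

The box shape includes the diagonal neighbors, which the star shape never touches.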

jeffhammond commented 7 years ago

TBB wins big time on KNL because of tiling. Tiling helps for dimension 2000-16000 with star radius 4.

jeffhammond commented 7 years ago

I am the user and I am protecting my feet by making the code disable tiling when it is going to serialize.

jeffhammond commented 7 years ago

My code generator is supposed to support the square pattern, but there's a bug in it.