vollmerm / shallow-fission


Next steps for ICFP 2016 #8

vollmerm closed this issue 8 years ago

vollmerm commented 8 years ago

We have a little under two months before the ICFP deadline.

Back in November and December I felt like it wasn't clear what exactly we were trying to do, so maybe this time we can try to come up with more concrete goals.

So how about this:

rrnewton commented 8 years ago

Sounds good to me. I just sent a meeting invite.

vollmerm commented 8 years ago

Here's what we have for an abstract in the paper now:

Existing DSLs for GPU programming have successfully focused on generating efficient kernels and minimizing the number of traversals over data, but they have relatively unsophisticated runtime systems. In this paper we provide an in-depth analysis of how well the Accelerate DSL can overlap communication and computation and manage concurrent kernels across multiple devices. We show how to extend Accelerate's runtime to support multiple GPUs and CPUs simultaneously, and we evaluate the tradeoffs between (1) slicing programs into device-specific fragments early, versus (2) just-in-time division of kernel iteration spaces.

We explore global device-specific optimizations as well as individual-kernel optimizations, and we show that this broader class of optimizations can have an impact: with a small amount of auto-tuning, early slicing of programs is more effective for XYZ.