LLNL / serac

Serac is a high order nonlinear thermomechanical simulation code
BSD 3-Clause "New" or "Revised" License

Use shared memory to store intermediate values in evaluation_kernel_impl #1104

Open johnbowen42 opened 4 months ago

johnbowen42 commented 4 months ago

https://github.com/LLNL/serac/pull/1026 allocates qf_inputs on the heap for CPU implementations and in global device memory for GPU implementations. If shared memory usage is reduced (potentially by dynamically allocating memory and using a tensor view inside interpolate and integrate), it may be possible to store these values in shared memory instead, which should be substantially faster than global device memory on the GPU.
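To make the "dynamically allocated memory plus tensor view" idea concrete, here is a minimal host-side sketch in C++. It is not Serac's code: `TensorView2D`, `ScratchArena`, and `interpolate_into_scratch` are hypothetical names, and the `std::vector` only stands in for a dynamically sized scratch buffer. On the GPU, the same pointer could instead come from an `extern __shared__` array whose byte size is passed at kernel launch, assuming the combined views fit within the per-block shared memory limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning 2D view over memory it does not allocate
// (illustrative only; not Serac's tensor type).
struct TensorView2D {
  double* data;
  std::size_t rows, cols;
  double& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Hypothetical bump allocator that hands out views from one scratch buffer.
// On a GPU the buffer could be dynamically sized shared memory.
struct ScratchArena {
  double* base;
  std::size_t capacity;  // number of doubles available
  std::size_t offset;    // number of doubles already handed out

  TensorView2D carve(std::size_t rows, std::size_t cols) {
    assert(offset + rows * cols <= capacity);  // stay inside the buffer
    TensorView2D view{base + offset, rows, cols};
    offset += rows * cols;
    return view;
  }
};

// Stand-in for an interpolate/integrate pair: writes intermediate values
// into views carved from the scratch buffer and returns their sum.
double interpolate_into_scratch(std::size_t num_qpts, std::size_t num_comps) {
  // Size is chosen at run time, mirroring a dynamic shared memory allocation.
  std::vector<double> scratch(2 * num_qpts * num_comps);
  ScratchArena arena{scratch.data(), scratch.size(), 0};

  TensorView2D qf_inputs = arena.carve(num_qpts, num_comps);
  TensorView2D qf_outputs = arena.carve(num_qpts, num_comps);

  double sum = 0.0;
  for (std::size_t q = 0; q < num_qpts; ++q) {
    for (std::size_t c = 0; c < num_comps; ++c) {
      qf_inputs(q, c) = static_cast<double>(q) + 0.5 * static_cast<double>(c);
      qf_outputs(q, c) = 2.0 * qf_inputs(q, c);  // placeholder "q-function"
      sum += qf_outputs(q, c);
    }
  }
  return sum;
}
```

The point of the view is that interpolate and integrate would only see a pointer and extents, so the same code path could work whether the backing storage is heap memory, global device memory, or shared memory.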