pbartholomew08 closed 2 months ago
I think this branch is ready to go
Just a handful of very minor comments, otherwise lgtm
I think these are all resolved. I've left the larger conversation about using reordering unresolved in case there's anything there I've missed.
I've added tests (OMP only, though they should be relatively easy to adapt to CUDA) for the scalar product and sum-into-x implementations (vecadd appears to be tested indirectly through the timestepping tests).