This PR updates the OpenMP kernels to fix an issue with the gather kernel and to align them more closely with their v1.1 implementations. As mentioned in #189, a performance gap remains for the current scatter, multiscatter, and sg OpenMP kernels on certain platforms.
Change Description/Rationale
Remove duplicate gather operation in the gather OpenMP kernel
Align the OpenMP kernels more closely with their v1.1 counterparts
Use the dense_perthread buffers in the scatter and multiscatter OpenMP kernels
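For reviewers, the intended kernel shape can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `gather_sketch` and `scatter_sketch`, the helpers `thread_id` and `max_threads`, and the parameter layout are all hypothetical, and only the `dense_perthread` name comes from the change list above. The sketch assumes each thread owns one dense buffer of at least `pattern.size()` elements and that the gather body reads each sparse element exactly once per iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helpers so the sketch also builds without OpenMP enabled.
static int thread_id() {
#ifdef _OPENMP
  return omp_get_thread_num();
#else
  return 0;
#endif
}

static int max_threads() {
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}

// Gather sketch: for iteration i, dense[j] = sparse[pattern[j] + delta * i].
// Each element is gathered once; there is no second (duplicate) gather pass.
void gather_sketch(const std::vector<std::size_t>& pattern,
                   const std::vector<double>& sparse,
                   std::vector<std::vector<double>>& dense_perthread,
                   std::size_t delta, std::size_t count) {
  assert(dense_perthread.size() >= static_cast<std::size_t>(max_threads()));
  const std::size_t pattern_length = pattern.size();
#pragma omp parallel
  {
    // Each thread writes only to its own dense buffer.
    double* dense = dense_perthread[thread_id()].data();
#pragma omp for
    for (std::size_t i = 0; i < count; ++i) {
      const double* window = sparse.data() + delta * i;
#pragma omp simd
      for (std::size_t j = 0; j < pattern_length; ++j) {
        dense[j] = window[pattern[j]];
      }
    }
  }
}

// Scatter sketch: for iteration i, sparse[pattern[j] + delta * i] = dense[j],
// reading from the calling thread's dense buffer.
// Assumption: the sparse windows written by different iterations do not
// overlap; overlapping windows would make the writes race.
void scatter_sketch(const std::vector<std::size_t>& pattern,
                    const std::vector<std::vector<double>>& dense_perthread,
                    std::vector<double>& sparse,
                    std::size_t delta, std::size_t count) {
  assert(dense_perthread.size() >= static_cast<std::size_t>(max_threads()));
  const std::size_t pattern_length = pattern.size();
#pragma omp parallel
  {
    const double* dense = dense_perthread[thread_id()].data();
#pragma omp for
    for (std::size_t i = 0; i < count; ++i) {
      double* window = sparse.data() + delta * i;
#pragma omp simd
      for (std::size_t j = 0; j < pattern_length; ++j) {
        window[pattern[j]] = dense[j];
      }
    }
  }
}
```

Giving each thread its own dense buffer keeps the dense side of the transfer free of sharing between threads, which is the presumed motivation for using per-thread buffers in the scatter and multiscatter kernels.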
Reviewer Checklist
[ ] All GitHub actions and runners have passed if applicable
[x] Commits are clean and relevant
PR Checklist
[x] Remove or update the template boilerplate text
[x] Commits are relevant and combined where appropriate
[x] Rebase off spatter-devel
[x] Reviewers Requested
[ ] Projects associated
[x] Commits mention issue and/or PR numbers at the bottom of the message
[x] Relevant issues are linked into the PR
[x] TODOs are completed
[ ] Reviewer checklist is updated
TODOs
[ ] No additional TODOs for this PR
Future Work
Performance alignment of the scatter, multiscatter, and sg kernels on certain platforms (Cascade Lake, Ice Lake, Sandy Bridge...)