Closed sergisiso closed 3 years ago
@arporter @rupertford This is ready for review.
@arporter Ready for the next review. I updated the README with the libgfortran note but added little about the general implementation because the existing description is still valid. Note that some implementation details will change once again in the follow-up PR #57.
Thanks for tweaking. I'm happy now :-)
The Views are stored in the infrastructure (dl_esm_inf) device_pointer fields and can be re-used later (potentially also from different invocations). This solves the main performance issue the Views implementation had and closes issue #31.
It still requires copying some data back to the host at the end of each invoke. I am exploring a solution with an infrastructure callback function in another PR.
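To illustrate the caching idea described above, here is a minimal sketch in plain C++. It is not the code in this PR: `Field`, `DeviceView`, `get_or_create_view`, `invoke_scale` and `release_view` are hypothetical names, and a `std::vector` stands in for a Kokkos View so the example runs without Kokkos or a GPU. The only thing taken from the description is that an opaque `device_pointer` on the field keeps the view alive between invokes, and that data is still copied back to the host at the end of each invoke.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a device-side view (assumption: the real code uses a Kokkos
// View; a std::vector keeps this sketch runnable on any host).
struct DeviceView {
    std::vector<double> data;
};

// Hypothetical, simplified field holding only the members needed here.
struct Field {
    std::vector<double> host_data;
    void* device_pointer = nullptr;  // opaque handle to the cached view
};

static int allocations = 0;  // counts how many views were created

// Return the view cached in the field, creating it only on first use.
DeviceView& get_or_create_view(Field& f) {
    if (f.device_pointer == nullptr) {
        f.device_pointer = new DeviceView{f.host_data};  // "host to device" copy
        ++allocations;
    }
    return *static_cast<DeviceView*>(f.device_pointer);
}

// Stand-in for one invoke: run a kernel on the cached view, then copy back.
void invoke_scale(Field& f, double factor) {
    DeviceView& view = get_or_create_view(f);
    for (double& x : view.data) x *= factor;
    f.host_data = view.data;  // copy back to the host at the end of the invoke
}

// Free the cached view and reset the handle.
void release_view(Field& f) {
    delete static_cast<DeviceView*>(f.device_pointer);
    f.device_pointer = nullptr;
}

// Two invokes on the same field; returns the number of views allocated.
int demo_two_invokes() {
    allocations = 0;
    Field f;
    f.host_data = {1.0, 2.0, 3.0};
    invoke_scale(f, 2.0);
    invoke_scale(f, 2.0);  // re-uses the view created by the first invoke
    assert(f.host_data[2] == 12.0);
    int count = allocations;
    release_view(f);
    return count;
}
```

In this sketch the second invoke finds the handle already set, so only one view is ever allocated. The copy-back line in `invoke_scale` is the remaining cost mentioned above.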
The current results are below (all implementations now have the momentum calculation as the hotspot, but the GPUs are still far from saturating the memory bandwidth):