skyreflectedinmirrors closed this issue 6 years ago
Currently the kernel driver looks like:
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) species_rates_driver(int const problem_size, __global double const *__restrict__ P_arr, __global double const *__restrict__ phi, __global double *__restrict__ dphi, __global double const *__restrict__ rwk)
While the kernel itself is:
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) species_rates(__global double *__restrict__ rwk)
where the local copies of P, phi and dphi have been packed into the rwk vector.
We need to modify the kernel to look like:
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) species_rates(__global double const *__restrict__ P_arr, __global double const *__restrict__ phi, __global double *__restrict__ dphi, __global double const *__restrict__ rwk)
for simplicity and clarity, even if those arrays are still part of rwk in the driver.
Additionally, the pointer unpacks should be cleaned up so that each kernel unpacks only the pointers it actually needs, and the P, phi and dphi unpacks are dropped from the kernel.
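As a rough illustration of the proposed split, here is a minimal host-side sketch in plain C (not OpenCL, and not the generated code). It assumes a hypothetical rwk layout (`P_OFF`, `PHI_OFF`, `DPHI_OFF`, `WORK_OFF`), a hypothetical size `NS`, and a toy rate expression. The address-space and `__restrict__` qualifiers are omitted, and rwk is left writable in the driver so it can hold the local copies. The point is only that the driver owns the packing and unpacking, while the kernel receives P_arr, phi and dphi as explicit arguments:

```c
#include <stddef.h>

/* Hypothetical sizes and rwk layout, for illustration only. */
enum { NS = 3 };                      /* assumed length of phi / dphi per problem */
enum {
    P_OFF    = 0,                     /* local copy of P            */
    PHI_OFF  = 1,                     /* local copy of phi[0..NS)   */
    DPHI_OFF = 1 + NS,                /* local copy of dphi[0..NS)  */
    WORK_OFF = 1 + 2 * NS,            /* remaining kernel workspace */
    RWK_LEN  = 2 + 2 * NS
};

/* Kernel with the proposed explicit signature: no P/phi/dphi unpacks here.
 * The body is a toy expression, not a real species-rate evaluation. */
static void species_rates(const double *P_arr, const double *phi,
                          double *dphi, const double *rwk)
{
    const double k = rwk[WORK_OFF];   /* the only value this kernel unpacks */
    for (size_t i = 0; i < NS; ++i) {
        dphi[i] = -k * P_arr[0] * phi[i];
    }
}

/* Driver: copies each problem's state into rwk, then hands the kernel
 * pointers into rwk, so the kernel never needs to know the offsets. */
static void species_rates_driver(int problem_size, const double *P_arr,
                                 const double *phi, double *dphi, double *rwk)
{
    for (int j = 0; j < problem_size; ++j) {
        rwk[P_OFF] = P_arr[j];
        for (size_t i = 0; i < NS; ++i) {
            rwk[PHI_OFF + i] = phi[(size_t)j * NS + i];
        }
        species_rates(&rwk[P_OFF], &rwk[PHI_OFF], &rwk[DPHI_OFF], rwk);
        for (size_t i = 0; i < NS; ++i) {
            dphi[(size_t)j * NS + i] = rwk[DPHI_OFF + i];
        }
    }
}
```

With this arrangement the offsets into rwk appear only in the driver, so changing the packing would not touch the kernel body.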
Fixed by work on the old kernel_gen_rewrite branch (and ongoing from aaff8e587856d89c69422bacbd03c5b2d98fac51).