The ordering of fields in zgmv is calculated incorrectly. This was first spotted by @marsdeno when he ran the benchmark with --uvders, --scders and --vordiv all enabled (I had never thought of testing all three at once!). The benchmark crashed with a "partially present on device" error referring to PGP3A.
UV-like fields and other 3D fields are both stored in the same array, zgmv. zgpuv and zgp3a are pointers to different slices of this array, and these slices are subsequently passed to inv_trans and dir_trans.
Without this fix (assuming nfld == 1):
jend_vder_EW = 6, jbegin_sc = 6, jend_scder_EW = 8: UV-like fields are arranged in entries 1:6 and other 3D fields in entries 6:8. This is wrong because the first entry of zgp3a points to the same address as the last entry of zgpuv.
The cause is a typo: jbegin_vder_EW should be jend_vder_EW. With this change the two pointer slices no longer overlap.
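The overlap can be reproduced with a small, language-neutral sketch. It is not the benchmark code. It models the two pointer slices as inclusive 1-based index ranges, using the values quoted above for nfld == 1. The helper `ranges_overlap`, the "+ 1" offset used for the corrected start index, and the assumption that the zgp3a slice keeps its length after the fix are illustrative assumptions, not taken from the source.

```python
def ranges_overlap(a, b):
    """Return True if two inclusive (start, end) index ranges share an entry."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values reported for nfld == 1 without the fix.
jend_vder_EW = 6
jbegin_sc = 6
jend_scder_EW = 8

zgpuv_range = (1, jend_vder_EW)           # UV-like fields: entries 1:6
zgp3a_range = (jbegin_sc, jend_scder_EW)  # other 3D fields: entries 6:8

# Entry 6 belongs to both slices, so the pointers alias each other.
print(ranges_overlap(zgpuv_range, zgp3a_range))

# Assumption: after the fix the second slice starts one past jend_vder_EW
# and keeps the same number of entries as before.
n_entries = jend_scder_EW - jbegin_sc + 1
fixed_begin_sc = jend_vder_EW + 1
fixed_zgp3a_range = (fixed_begin_sc, fixed_begin_sc + n_entries - 1)

# The slices are now disjoint.
print(ranges_overlap(zgpuv_range, fixed_zgp3a_range))
```

A check of this kind only exercises the index arithmetic, so it could run on a machine without a GPU, whereas the "partially present on device" crash itself only appears when the slices are actually mapped to the device.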
Lessons learned:
We should add a test case with ALL benchmark arguments specified, but this still wouldn't have caught this bug because...
We should also find a way to execute the tests on a GPU-equipped machine. Is this possible with AC?