Closed sjsprecious closed 5 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 94.66%. Comparing base (570f6ec) to head (50f50ae).
This PR moves the data movement of the array of indices for the diagonal elements of the Jacobian matrix to the constructor/destructor of the `CudaRosenbrockSolver` class.

In addition:
- The negation of the Jacobian was previously done in the `AlphaMinusJacobian` function. This PR moves this operation to the `AddJacobianTermsKernel` function, so the CUDA kernel now returns the `-J` matrix rather than `J`. The CPU version is updated as well. The current implementation works only when all elements of the initial Jacobian matrix are set to zero.
- The `AlphaMinusJacobianKernel` function is revised to achieve more parallelism on the GPU.

All 41 tests passed on Derecho's A100 GPU with `nvhpc/23.7`.

Fix #391
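
For reviewers, here is a minimal CPU-side sketch of why the Jacobian buffer has to start at zero once the term accumulation produces `-J` directly. It is not the code in this PR: the function names `AddNegatedJacobianTerms`, `AddAlphaToDiagonal`, and `AlphaMinusJacobianExample` are hypothetical, and a dense row-major 2x2 matrix is assumed for simplicity in place of the solver's actual matrix type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Jacobian-term accumulation: each contribution
// is subtracted, so a zero-initialized buffer ends up holding -J.
void AddNegatedJacobianTerms(std::vector<double>& jacobian, const std::vector<double>& terms)
{
  for (std::size_t i = 0; i < jacobian.size(); ++i)
  {
    jacobian[i] -= terms[i];
  }
}

// Hypothetical stand-in for the alpha-minus-Jacobian step: the buffer already
// holds -J, so only the diagonal entries need alpha added to them.
void AddAlphaToDiagonal(std::vector<double>& jacobian, const std::vector<std::size_t>& diagonal_ids, double alpha)
{
  for (std::size_t id : diagonal_ids)
  {
    jacobian[id] += alpha;
  }
}

// Runs both steps on a caller-supplied initial buffer (assumed 2x2, row-major).
std::vector<double> AlphaMinusJacobianExample(std::vector<double> jacobian)
{
  const std::vector<double> terms = { 1.0, 2.0, 3.0, 4.0 };  // entries of J
  const std::vector<std::size_t> diagonal_ids = { 0, 3 };    // flat indices of the diagonal
  const double alpha = 10.0;

  AddNegatedJacobianTerms(jacobian, terms);
  AddAlphaToDiagonal(jacobian, diagonal_ids, alpha);
  return jacobian;
}
```

Starting from an all-zero buffer, the result is `alpha * I - J`, here `{ 9, -2, -3, 6 }`. Starting from a buffer of ones, the off-diagonal entry at index 1 comes out as `-1` instead of `-2`, because the stale value is never negated.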